So I built a Slack bot to fix my own workflow. Figured I'd share it.
What it does:
"/incident start sev2 API latency spike" creates a dedicated channel, invites whoever's on-call, pins the details, and starts recording a timeline. When you run "/incident resolve", it uses GPT-4 to analyze the entire channel conversation and generate a postmortem draft: summary, root cause, event timeline, action items.
The key insight: the actual diagnosis usually happens in casual messages ("wait, I think the connection pool is exhausted"), not in formal status updates. So the AI reads everything, not just what was labeled important.
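For anyone curious how the command handling might look, here is a minimal sketch of parsing the text that follows "/incident" (Slack delivers it in the slash command payload's `text` field). This is not the bot's actual code: the `IncidentCommand` type, `parseIncidentCommand`, and the error messages are hypothetical names chosen for illustration.

```typescript
// Hypothetical types and parser, for illustration only.
type Severity = "sev1" | "sev2" | "sev3" | "sev4";

type IncidentCommand =
  | { kind: "start"; severity: Severity; title: string }
  | { kind: "resolve" }
  | { kind: "error"; message: string };

const SEVERITIES: readonly string[] = ["sev1", "sev2", "sev3", "sev4"];

function isSeverity(value: string): value is Severity {
  return SEVERITIES.includes(value);
}

// `text` is everything after "/incident", e.g. "start sev2 API latency spike".
function parseIncidentCommand(text: string): IncidentCommand {
  const [sub, ...rest] = text.trim().split(/\s+/);
  switch (sub) {
    case "start": {
      const sev = (rest[0] ?? "").toLowerCase();
      const title = rest.slice(1).join(" ");
      if (!isSeverity(sev)) {
        return { kind: "error", message: "Usage: /incident start <sev1|sev2|sev3|sev4> <title>" };
      }
      if (title === "") {
        return { kind: "error", message: "Please give the incident a title." };
      }
      return { kind: "start", severity: sev, title };
    }
    case "resolve":
      return { kind: "resolve" };
    default:
      return { kind: "error", message: `Unknown subcommand: ${sub ?? ""}` };
  }
}
```

Keeping the parser a pure function means it can be unit-tested without a Slack workspace; the Bolt command handler would only need to call it and dispatch on `kind`.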
Stack:
- TypeScript + Slack Bolt
- Prisma + Postgres
- OpenAI API for postmortem generation
- PagerDuty integration for escalations
Other stuff it handles:
- Updating an incident's severity with "/incident severity <sev1|sev2|sev3|sev4>"
- On-call scheduling with automatic weekly/daily rotation
- Paging with escalation chains (Slack DMs → PagerDuty if configured)
- Jira ticket creation for incidents from within Slack with "/incident ticket <title>"
- Basic analytics (incidents per on-call, MTTR)
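To give a feel for the weekly/daily rotation, here is a rough sketch of picking the current on-call from a fixed-length rotation. It assumes shift boundaries are defined in UTC, so a shift is always exactly 24 hours or 7 days; a rotation anchored to a local time zone would need DST-aware date math instead. The `Rotation` shape and `onCallAt` are hypothetical names, not the bot's real schema.

```typescript
// Hypothetical rotation config; field names are assumptions for illustration.
interface Rotation {
  members: string[]; // user IDs in rotation order
  startsAt: Date; // start of the first shift (treated as a UTC instant)
  shiftLength: "daily" | "weekly";
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Returns whoever is on call at instant `at`, cycling through members in order.
function onCallAt(rotation: Rotation, at: Date): string {
  if (rotation.members.length === 0) {
    throw new Error("rotation has no members");
  }
  const shiftMs = rotation.shiftLength === "daily" ? DAY_MS : 7 * DAY_MS;
  const elapsed = at.getTime() - rotation.startsAt.getTime();
  if (elapsed < 0) {
    throw new Error("rotation has not started yet");
  }
  // A shift boundary belongs to the incoming person (floor, not round).
  const shiftIndex = Math.floor(elapsed / shiftMs);
  return rotation.members[shiftIndex % rotation.members.length]!;
}
```

Overrides, swaps, and handoff notifications would sit on top of this; the sketch only covers the base schedule.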
What I learned building this:
1. Slack's API is actually pretty good now. The Bolt framework handles most of the OAuth/event subscription pain.
2. Getting AI to write useful postmortems required being very explicit about event types. Without context about what's a "status update" vs. a "debug message," it would hallucinate causes.
3. On-call scheduling is surprisingly complex. Timezone handling, rotation boundaries, handoff notifications: each is a rabbit hole.
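On point 2, the idea of making event types explicit can be sketched as tagging every line of the transcript before it goes into the prompt. This is an illustration under assumptions, not the actual prompt: the `EventKind` values, the `TimelineEvent` shape, and the instruction wording are all made up for the example.

```typescript
// Hypothetical event kinds; the real categories may differ.
type EventKind = "status_update" | "debug_message" | "severity_change" | "page";

interface TimelineEvent {
  kind: EventKind;
  at: Date;
  author: string;
  text: string;
}

// One line per event, oldest first, with the event type spelled out.
function formatTranscript(events: TimelineEvent[]): string {
  return [...events]
    .sort((a, b) => a.at.getTime() - b.at.getTime())
    .map((e) => `[${e.at.toISOString()}] [${e.kind}] ${e.author}: ${e.text}`)
    .join("\n");
}

// Builds the prompt text; sending it to the model is left out of this sketch.
function buildPostmortemPrompt(events: TimelineEvent[]): string {
  return [
    "Draft a postmortem with: summary, root cause, event timeline, action items.",
    "Each line below is tagged with its event type.",
    "Only state a root cause that the messages support; otherwise say it is unknown.",
    "",
    formatTranscript(events),
  ].join("\n");
}
```

Including casual `debug_message` lines alongside formal updates keeps the "connection pool is exhausted" kind of message in view, while the tags let the model tell the two apart.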
Honest limitations:
- Only works for teams already living in Slack
- AI postmortems need human review, since they can miss context from calls/video chats
- Only a couple of integrations so far (the ones I use), though I can add more, like Linear, GitHub Issues, etc.
Code isn't open source (yet?), but I'm happy to answer architecture questions. I've been running this with my own team for about two months.
Landing page: https://incidentops.io
Would appreciate feedback, especially from SREs who've built similar internal tools. What am I missing?