exporters:
  otlp/incidentary:
    endpoint: api.incidentary.com:4317
    headers:
      authorization: "Bearer ${INCIDENTARY_API_KEY}"
service:
  pipelines:
    traces:
      exporters: [otlp/incidentary]
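For context, a runnable collector config also needs a receiver wired into the same pipeline; here is a minimal sketch, assuming a standard OTLP gRPC receiver on the default port:

```yaml
# Minimal end-to-end sketch: receive OTLP from your apps,
# forward traces to the incidentary endpoint above.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  otlp/incidentary:
    endpoint: api.incidentary.com:4317
    headers:
      authorization: "Bearer ${INCIDENTARY_API_KEY}"

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/incidentary]
```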
What it does: when an alert fires (PagerDuty, OpsGenie, a Slack command, or a direct webhook), it takes the trace data your services already emitted in the window around the alert and assembles a causal chain like "service-A's HTTP call to service-B returned 503 at 14:22:03; service-B was timing out on Redis; Redis primary was failing over." That artifact lands in the incident channel before the war room fills up.

I built this because every incident I was on opened the same way: a team of engineers opening different tools and coming up with different theories. Datadog had the trace data, Sentry had the errors, Slack had the channel, PagerDuty fired the alert. Nothing stitched them into "what failed first, and what called what." Incidentary does that one thing and nothing else.
Why it isn't a Datadog Watchdog clone:
- Deterministic, not probabilistic. Every edge is proven by an actual parent_ce_id or W3C traceparent in the message envelope. If a service in the path wasn't instrumented, that link appears as a labeled gap, not filled in by a model.
- No LLM in the assembly path. The artifact is identical on a re-run; you can paste it into an RCA without retracting a sentence later.
- Pre-alert capture. The SDKs and the collector processor we ship hold an unsampled rolling window. When error rate, p99, or queue depth increases, the window elevates to full detail before the page fires, so you see the lead-up, not just the aftermath.
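To make the pre-alert capture concrete, here is a sketch of what configuring it might look like; the processor name and every key below are hypothetical illustrations, not the shipped schema:

```yaml
# Hypothetical config, for illustration only -- names are not the real ones.
processors:
  incidentary_window:
    rolling_window: 5m        # unsampled buffer held before any page fires
    elevate_on:               # conditions that flip the window to full detail
      error_rate_increase: 2x
      p99_latency: 500ms
      queue_depth: 1000
```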
- Cluster ground-truth via the K8s operator. OOMs, evictions, and HPA scale events never show up in your application telemetry. The operator joins them onto the same trace by service+time rather than by W3C trace context (which most cluster events don't propagate).
If you are on dd-trace and don't run a separate OTel collector: dd-trace v1+ has built-in OTLP export. One env-var flip and you're dual-shipping to Datadog and to us. Or run our Docker sidecar in front of the dd-agent.
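The sidecar route can also be built from a vanilla OTel collector that dual-ships: receive OTLP from the app, then fan out to both the Datadog exporter (from collector-contrib) and the Incidentary endpoint. A sketch, assuming your app is already pointed at the collector:

```yaml
# Dual-ship sketch: one OTLP receiver, two exporters in the same pipeline.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  datadog:                    # from opentelemetry-collector-contrib
    api:
      key: ${DD_API_KEY}
  otlp/incidentary:
    endpoint: api.incidentary.com:4317
    headers:
      authorization: "Bearer ${INCIDENTARY_API_KEY}"

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [datadog, otlp/incidentary]
```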
Quickstart: https://incidentary.com/docs/quickstart
Free plan: 1,000,000 traces/month, 14-day retention, no credit card. Pricing is per trace, not per span. A 3-service request with two downstream calls is one trace. Most sub-10-service teams stay on free.
Live artifact, no signup: a synthetic incident assembled by the same engine that runs in production, not a video: https://incidentary.com/demo
Pushback I would value:
1. The dd-trace dual-export path. A lot of you run Datadog APM and nothing else. If the env-var flip doesn't survive a real production dd-trace setup, that is the installation path I most need to fix this week. Tell me where it breaks; I would rather hear it from HN than from a user who churns silently.
2. The deterministic-only stance against the AI-Ops wave. I am betting "no hallucinations and you can paste this into an RCA" is worth more than what an LLM can guess from spans. The market is voting differently this year, and I want the strongest case for why I am wrong.
If your collector refuses the exporter, drop the YAML in a reply and I will debug it in the thread. Easier than a support ticket and you get the answer in public.