How do you handle production webhook delivery reliability in your apps?

7•Tanjim•7mo ago

Hey everyone,

I’ve been thinking a lot about webhook delivery reliability lately. In many projects I’ve worked on, building robust webhook infra turned out to be deceptively complex:

- Retry logic (exponential backoff, timeouts)
- Handling non-2xx responses
- Delivery monitoring and alerting
- Back-pressure or queueing to avoid overwhelming receivers
- Secure signing and validation flows
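To make the first two items concrete, here is a minimal sketch of exponential backoff with full jitter plus a retry decision based on the response status. The names `backoff_delay`, `should_retry`, and `RETRYABLE_STATUS`, and the choice of which statuses count as transient, are assumptions for illustration rather than anything from a specific library.

```python
import random

# Assumption: these statuses are treated as transient; other non-2xx are permanent.
RETRYABLE_STATUS = {408, 429, 500, 502, 503, 504}


def backoff_delay(attempt, base=1.0, cap=300.0, rng=random.random):
    """Seconds to wait before retry number `attempt` (1-based), with full jitter.

    The ceiling doubles each attempt (base, 2*base, 4*base, ...) up to `cap`,
    and the actual delay is drawn uniformly from [0, ceiling).
    """
    ceiling = min(cap, base * (2 ** (attempt - 1)))
    return rng() * ceiling


def should_retry(status, attempt, max_attempts=8):
    """Decide whether to schedule another delivery attempt.

    `status` is the HTTP status code, or None if the request timed out
    or the connection failed before a response arrived.
    """
    if attempt >= max_attempts:
        return False
    return status is None or status in RETRYABLE_STATUS
```

The jitter matters when many deliveries fail at once: without it, every retry lands on the receiver at the same moment.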

In one project, a failed webhook delayed payment processing for hours because the retry logic was buggy. Another time, burst traffic took down the receiver endpoint, and there was no dead-letter queue (DLQ) strategy in place.

I’ve been researching different approaches teams here use:

- Do you build your own custom webhook delivery queue and monitoring system?
- Use cloud solutions like AWS EventBridge or Step Functions to orchestrate?
- Or integrate third-party tools that handle delivery, retries, and observability for you?

I’m curious about how you ensure production-grade reliability at scale without burning dev hours on plumbing. Recently, I’ve been working on a tool in this space to handle these issues automatically, but would love to hear:

- What architecture have you found most reliable?
- What are the edge cases you’ve encountered (e.g. signature mismatches, downstream outages)?
- Any horror stories or lessons learned from webhook failures in production?
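On the signature mismatch edge case, a rough sketch of one common scheme: an HMAC-SHA256 over a timestamp and the raw request body, checked with a constant-time comparison and a replay window. The message layout (`timestamp.body`), the 300-second tolerance, and the function names are assumptions for illustration, not any particular provider's format.

```python
import hashlib
import hmac
import time


def sign(secret: bytes, timestamp: int, body: bytes) -> str:
    """Hex HMAC-SHA256 over b"<timestamp>.<raw body>"."""
    message = str(timestamp).encode() + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, timestamp: int, body: bytes, signature: str,
           now=None, tolerance: int = 300) -> bool:
    """Check the signature and reject timestamps outside the replay window."""
    now = time.time() if now is None else now
    if abs(now - timestamp) > tolerance:
        return False
    expected = sign(secret, timestamp, body)
    # Constant-time comparison to avoid leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), signature.encode())
```

A frequent source of mismatches is verifying against a re-serialized payload instead of the exact bytes received: `{"a":1}` and `{"a": 1}` produce different signatures, so the receiver has to hash the raw body.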

Looking forward to learning from your experiences and best practices around webhook infra!

Comments

tasn•7mo ago

Very biased, but I think you should just use Svix[1].

Though if you're interested, I recorded a video about webhook architecture a while back that you may find useful: https://m.youtube.com/watch?v=4jvV75OD620

1: https://www.svix.com

kasey_junk•7mo ago

I’m not affiliated with svix but a happy customer.

It has just worked for us in production for years. We’ve never had an issue.

Now, our use case is pretty simple, but for us it’s a piece of infrastructure we never worry about.

leakycap•7mo ago

I think your questions raise another: where can we just take out this layer of complexity, and how?

Sometimes rather than chasing edge cases, I find another way to do the same thing using a routine or library that already has all the edge cases ironed out.

If you're a small team or one person, you can't expect to stay on top of something that starts broken.

ezekg•7mo ago

Totally agree. For me, with a vanilla Rails app, I leaned on Sidekiq to handle webhook queueing, processing, and retries: https://keygen.sh/blog/how-to-build-a-webhook-system-in-rail...

It's scaled quite well. Billions of webhooks. I barely ever think about it.
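For readers who haven't built one, the queue-plus-retries pattern can be sketched in a few lines. This is not Sidekiq's API or the implementation from the linked post; it is a simplified in-memory Python illustration, and `drain`, the job dict shape, and `max_attempts` are hypothetical names. A real job queue would persist jobs and delay each retry with backoff instead of re-enqueueing immediately.

```python
from collections import deque


def drain(queue, deliver, max_attempts=5):
    """Process jobs until the queue is empty.

    `deliver(url, payload)` returns True on success; a False return or an
    exception counts as a failed attempt. Jobs that fail `max_attempts`
    times are moved to a dead-letter list, which is returned.
    """
    dead_letters = []
    while queue:
        job = queue.popleft()
        job["attempts"] += 1
        try:
            ok = deliver(job["url"], job["payload"])
        except Exception:
            ok = False
        if ok:
            continue
        if job["attempts"] >= max_attempts:
            dead_letters.append(job)  # keep for inspection or manual replay
        else:
            queue.append(job)  # a real worker would schedule this with a delay
    return dead_letters
```

The dead-letter list is the part that tends to get skipped: without it, a permanently failing endpoint either retries forever or loses events silently.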
