Launch HN: Sentrial (YC W26) – Catch AI Agent Failures Before Your Users Do

https://www.sentrial.com/
9•anayrshukla•1h ago
Hey HN! We're Neel and Anay, and we’re building Sentrial (https://sentrial.com). It’s production monitoring for AI products. We automatically detect failure patterns: loops, hallucinations, tool misuse, and user frustrations the moment they happen. When issues surface, Sentrial diagnoses the root cause by analyzing conversation patterns, model outputs, and tool interactions, then recommends specific fixes.

Here's a demo if you're interested: https://www.youtube.com/watch?v=cc4DWrJF7hk. When agents fail, choose wrong tools, or blow cost budgets, there's no way to know why - usually just logs and guesswork. As agents move from demos to production with real SLAs and real users, this is not sustainable.

Neel and I lived this, building agents at SenseHQ and Accenture where we found that debugging agents was often harder than actually building them. Agents are untrustworthy in prod because there’s no good infrastructure to verify what they’re actually doing.

In practice this looks like:

- A support agent that began misclassifying refund requests as product questions, which meant customers never reached the refund flow.
- A document drafting agent that would occasionally hallucinate missing sections when parsing long specs, producing confident but incorrect outputs.

There’s no stack trace or 500 error, and you only figure this out when a customer is angry.

We both realized teams were flying blind in production, and that agent native monitoring was going to be foundational infrastructure for every serious AI product. We started Sentrial as a verification layer designed to take care of this.

How it works: You wrap your client with our SDK in only a couple of lines. From there, we detect drift for you:

- Wrong tool invocations
- Misunderstood intents
- Hallucinations
- Quality regressions over time

You see it on our platform before a customer files a ticket.
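To make the "wrap your client" idea concrete, here is a minimal generic sketch of what wrapping a client for monitoring can look like. It is not the Sentrial SDK, whose API is not shown in this post: `MonitoredClient`, `CallRecord`, `longest_repeat`, and `FakeClient` are hypothetical names invented for illustration. The wrapper records every method call, and a small helper finds runs of identical consecutive calls, one crude signal of an agent stuck in a loop.

```python
import time
from dataclasses import dataclass


@dataclass
class CallRecord:
    name: str          # method that was called
    args: tuple
    kwargs: dict
    ok: bool           # False if the call raised
    duration_s: float


class MonitoredClient:
    """Hypothetical wrapper: forwards method calls to `inner` and records each one."""

    def __init__(self, inner):
        self._inner = inner
        self.records = []

    def __getattr__(self, name):
        # Only invoked for attributes not found on the wrapper itself.
        if name == "_inner":
            raise AttributeError(name)
        target = getattr(self._inner, name)
        if not callable(target):
            return target

        def wrapped(*args, **kwargs):
            start = time.perf_counter()
            ok = True
            try:
                return target(*args, **kwargs)
            except Exception:
                ok = False
                raise
            finally:
                elapsed = time.perf_counter() - start
                self.records.append(CallRecord(name, args, kwargs, ok, elapsed))

        return wrapped


def longest_repeat(records):
    """Length of the longest run of consecutive calls with the same name and arguments."""
    best = run = 0
    prev = None
    for r in records:
        key = (r.name, r.args, r.kwargs)
        run = run + 1 if key == prev else 1
        best = max(best, run)
        prev = key
    return best


class FakeClient:
    """Stand-in for a real model or tool client (illustration only)."""

    def search(self, query):
        return f"results for {query}"


client = MonitoredClient(FakeClient())
for _ in range(3):
    client.search("refund policy")
print(longest_repeat(client.records))  # 3
```

A real monitoring layer would ship these records to a backend and apply far richer detectors; the point here is only that interception can be transparent to the calling code.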

There’s also a quick MCP setup. Just give Claude Code this command:

    claude mcp add --transport http Sentrial https://www.sentrial.com/docs/mcp

We have a free tier (14 days, no credit card required). We’d love feedback from anyone running agents, whether for personal use or in a professional setting.

We’ll be around in the comments!

Comments

rajit•1h ago
How do you identify "wrong tool" invocations (how is the "wrong tool" defined)?
anayrshukla•1h ago
Good question. We don’t define “wrong tool” in some universal way, because that really depends on the workflow.

What we do in practice is let the team mark a few tool calls as right or wrong in context, then use that to learn the pattern for that agent. From there, we can flag similar cases automatically by looking at the convo state, the tool chosen, the arguments, and what happened next.

So we’re learning what “correct” looks like for your workflow and then catching repeats of the same kind of mistake.
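As a rough illustration of the idea in this reply (label a few tool calls, then flag similar ones), here is a toy nearest-neighbour sketch. It is an assumption about how such a check could work in principle, not Sentrial's actual method; `ToolCall`, `features`, `jaccard`, `flag`, the tool names, and the 0.3 threshold are all made up for the example. It compares a new call with labeled examples using the user's words, the tool chosen, and the argument names, and flags the call when its closest match was marked wrong.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    user_text: str     # last user message before the call
    tool: str          # tool the agent chose
    arg_names: tuple   # names of the arguments it passed


def features(call):
    """Bag of features: words from the user, argument names, and the tool."""
    words = {f"w:{w}" for w in call.user_text.lower().split()}
    args = {f"a:{a}" for a in call.arg_names}
    return words | args | {f"t:{call.tool}"}


def jaccard(a, b):
    """Set overlap in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def flag(call, labeled, min_similarity=0.3):
    """True if the most similar labeled example was marked wrong and is close enough."""
    best_sim, best_wrong = 0.0, False
    f = features(call)
    for example, is_wrong in labeled:
        sim = jaccard(f, features(example))
        if sim > best_sim:
            best_sim, best_wrong = sim, is_wrong
    return best_wrong and best_sim >= min_similarity


# A few calls a team might have marked by hand (True means "wrong tool").
labeled = [
    (ToolCall("i want a refund for my order", "answer_product_question", ("query",)), True),
    (ToolCall("i want a refund for my order", "start_refund", ("order_id",)), False),
    (ToolCall("does this lamp come in blue", "answer_product_question", ("query",)), False),
]

suspect = ToolCall("please refund my last order", "answer_product_question", ("query",))
fine = ToolCall("please refund my last order", "start_refund", ("order_id",))
print(flag(suspect, labeled), flag(fine, labeled))  # True False
```

A production system would presumably use learned embeddings and what happened after the call rather than word overlap, but the shape is the same: per-workflow labels define "correct", and similarity to known mistakes drives the flag.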

BoorishBears•1h ago
I know your homepage isn't your business, but I bet Claude could fix the janky horizontal overflow on mobile in one prompt. It makes for a very distracting read.
anayrshukla•1h ago
Will fix ASAP.
claudeomusic•1h ago
Agreed, fix it fast. It's hard to take a production-monitoring tool seriously when its own site has such a blatant production issue.
