frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Agentic QA – Open-source middleware to fuzz-test agents for loops

39•Saurabh_Kumar_•2mo ago
I built this because I watched my LangChain agent burn ~$50 in OpenAI credits overnight due to an infinite loop.

It's a middleware API that acts as a 'Flight Simulator'. You send it your agent's prompt, and it runs adversarial attacks (Red Teaming) to catch loops and PII leaks before deployment.

Code & Repo: https://github.com/Saurabh0377/agentic-qa-api Live Demo: https://agentic-qa-engine.onrender.com/docs

Would love feedback on other failure modes you've seen!

Comments

Saurabh_Kumar_•2mo ago
HN, OP here. I built this because I recently watched my LangChain agent burn through ~$50 of OpenAI credits overnight. It got stuck in a semantic infinite loop (repeating "I am checking..." over and over) which my basic max_iterations check didn't catch because the phrasing was slightly different each time. Realizing that "Pre-Flight" testing for agents is surprisingly hard, I built a small middleware API (FastAPI + LangChain) to automate this. What it does: It acts as an adversarial simulator. You send it your agent's system prompt, and it spins up a 'Red Team' LLM to attack it. Currently checks for: Infinite Loops: Semantic repetition detection. PII Leaks: Attempts social engineering ('URGENT AUDIT') to force the agent to leak fake PII, then checks if it gets blocked. Prompt Injection: Basic resistance checks. Tech Stack: Python, FastAPI, Supabase (for logs). It's open-source and I hosted a live instance on Render if you want to try curl it without installing: https://agentic-qa-api.onrender.com/docs Would love feedback on what other failure modes you've seen your agents fall into!
esafak•2mo ago
1. This is premature to share. I'm not going to pull in a dependency for something so trivial: https://github.com/Saurabh0377/agentic-qa-api/blob/main/main...

2. Keep the comments in English.

giancarlostoro•2mo ago
I had Claude Code losing its mind because of something outside of its control, one of the formatters used by Zed for Python kept messing with HTML templates, which are insanely sensitive to line breaks in some template specific code statements. Zed kept adding line breaks without reason other than some tool just did it. Claude kept trying to fix it, going to the extreme of using ed to force it, I watched it lose its mind till I asked "I think Zed is formatting the file every time you save?" turns out, yes, yes it was. It wasn't an issue when it used ed, but when Claude or I would change the file again, it would become an issue again.

I don't know what could have saved me, maybe .current_editor should be a file that your agents instructions.md file imports, and your editor updates it, to give Claude context about your tooling.

khannn•2mo ago
Couldn't even keep an em dash out of the title

BOOOOO

mikigraf•2mo ago
Almost thought you found my startup AgenticQA.eu

Why E cores make Apple silicon fast

https://eclecticlight.co/2026/02/08/last-week-on-my-mac-why-e-cores-make-apple-silicon-fast/
54•ingve•2h ago•24 comments

DoNotNotify is now Open Source

https://donotnotify.com/opensource.html
238•awaaz•5h ago•43 comments

Show HN: Fine-tuned Qwen2.5-7B on 100 films for probabilistic story graphs

https://cinegraphs.ai/
16•graphpilled•1h ago•3 comments

Dave Farber has died

https://lists.nanog.org/archives/list/nanog@lists.nanog.org/thread/TSNPJVFH4DKLINIKSMRIIVNHDG5XKJCM/
51•vitplister•1h ago•7 comments

Matchlock – Secures AI agent workloads with a Linux-based sandbox

https://github.com/jingkaihe/matchlock
54•jingkai_he•5h ago•14 comments

Reverse Engineering Raiders of the Lost Ark for the Atari 2600

https://github.com/joshuanwalker/Raiders2600
26•pacod•4h ago•1 comments

Show HN: LocalGPT – A local-first AI assistant in Rust with persistent memory

https://github.com/localgpt-app/localgpt
261•yi_wang•12h ago•131 comments

Curating a Show on My Ineffable Mother, Ursula K. Le Guin

https://hyperallergic.com/curating-a-show-on-my-ineffable-mother-ursula-k-le-guin/
18•bryanrasmussen•3h ago•10 comments

Beyond agentic coding

https://haskellforall.com/2026/02/beyond-agentic-coding
167•RebelPotato•11h ago•47 comments

Rabbit Ear "Origami": programmable origami in the browser (JS)

https://rabbitear.org/book/origami.html
29•molszanski•3d ago•3 comments

The Legacy of Daniel Kahneman: A Personal View (2025)

https://ejpe.org/journal/article/view/1075/753
21•cainxinth•3d ago•0 comments

SectorC: A C Compiler in 512 bytes (2023)

https://xorvoid.com/sectorc.html
330•valyala•19h ago•65 comments

LLMs as the new high level language

https://federicopereiro.com/llm-high/
149•swah•5d ago•281 comments

The Architecture of Open Source Applications (Volume 1) Berkeley DB

https://aosabook.org/en/v1/bdb.html
53•grep_it•5d ago•8 comments

Software factories and the agentic moment

https://factory.strongdm.ai/
250•mellosouls•22h ago•404 comments

A11yJSON: A standard to describe the accessibility of the physical world

https://sozialhelden.github.io/a11yjson/
13•robin_reala•5d ago•2 comments

Speed up responses with fast mode

https://code.claude.com/docs/en/fast-mode
202•surprisetalk•19h ago•214 comments

Modern and Antique Technologies Reveal a Dynamic Cosmos

https://www.quantamagazine.org/how-modern-and-antique-technologies-reveal-a-dynamic-cosmos-20260202/
12•sohkamyung•5d ago•0 comments

Hoot: Scheme on WebAssembly

https://www.spritely.institute/hoot/
201•AlexeyBrin•1d ago•42 comments

Arcan Explained – A browser for different webs

https://arcan-fe.com/2026/01/26/arcan-explained-a-browser-for-different-webs/
6•walterbell•5h ago•0 comments

uLauncher

https://github.com/jrpie/launcher
43•dtj1123•5d ago•17 comments

Stories from 25 Years of Software Development

https://susam.net/twenty-five-years-of-computing.html
219•vinhnx•22h ago•26 comments

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
383•jesperordrup•1d ago•123 comments

Brookhaven Lab's RHIC concludes 25-year run with final collisions

https://www.hpcwire.com/off-the-wire/brookhaven-labs-rhic-concludes-25-year-run-with-final-collis...
87•gnufx•18h ago•66 comments

Wood Gas Vehicles: Firewood in the Fuel Tank (2010)

https://solar.lowtechmagazine.com/2010/01/wood-gas-vehicles-firewood-in-the-fuel-tank/
62•Rygian•3d ago•31 comments

First Proof

https://arxiv.org/abs/2602.05192
164•samasblack•22h ago•97 comments

LineageOS 23.2

https://lineageos.org/Changelog-31/
103•pentagrama•8h ago•30 comments

Show HN: I saw this cool navigation reveal, so I made a simple HTML+CSS version

https://github.com/Momciloo/fun-with-clip-path
123•momciloo•19h ago•31 comments

Start all of your commands with a comma (2009)

https://rhodesmill.org/brandon/2009/commands-with-comma/
631•theblazehen•3d ago•228 comments

In the Australian outback, we're listening for nuclear tests

https://www.abc.net.au/news/2026-02-08/australian-outback-nuclear-tests-listening-warramunga-faci...
27•defrost•3h ago•4 comments