frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Show HN: Flakestorm – Chaos engineering for AI agents (local-first, open source)

6•frankhumarang•1d ago
Hi everyone,

I’ve been working on an open-source tool called Flakestorm to test the reliability of AI agents before they hit production.

Most agent testing today focuses on eval scores or happy-path prompts. In practice, agents tend to fail in more mundane ways: typos, tone shifts, long context, malformed input, or simple prompt injections — especially when running on smaller or local models. Flakestorm applies chaos-engineering ideas to agents. Instead of testing one prompt, it takes a “golden prompt”, generates adversarial mutations (semantic variations, noise, injections, encoding edge cases), runs them against your agent, and produces a robustness score plus a detailed HTML report showing what broke.

Key points: Local-first (uses Ollama for mutation generation)

Tested with Qwen / Gemma / other small models Works against HTTP agents, LangChain chains, or Python callables No cloud or API keys required This started as a way to debug my own agents after seeing them behave unpredictably under real user input. I’m still early and trying to understand how useful this is outside my own workflow.

I’d really appreciate feedback on: Whether this overlaps with how you test agents today Failure modes you’ve seen that aren’t covered Whether “chaos testing for agents” is a useful framing, or if this should be thought of differently Repo: https://github.com/flakestorm/flakestorm Docs are admittedly long.

Thanks for taking a look.

Show HN: Mailchimp alternative, self-hosted and yours

https://maillayer.com
1•mddanishyusuf•20s ago•0 comments

Ask HN: What parts of software testing can realistically be autonomous today?

1•nishilpatel•1m ago•0 comments

ClickFix attack uses fake Windows BSOD screens to push malware

https://www.bleepingcomputer.com/news/security/clickfix-attack-uses-fake-windows-bsod-screens-to-...
1•speckx•1m ago•0 comments

Start a Blog

https://maurycyz.com/misc/starting_a_blog/
1•thomasjb•1m ago•0 comments

Show HN: Utopia Planning Session

https://stateofutopia.com/p2p-ring-stream/
1•logicallee•7m ago•0 comments

Is the iPhone 17 the First Un-Breakable Phone? [video]

https://www.youtube.com/watch?v=UVD0fbiNbnM
1•mgh2•9m ago•0 comments

Ask HN: Is it even possible to crack a Wi-Fi password from a phone?

1•DenisDolya•11m ago•0 comments

Noi: A workspace browser for parallel AI workflows and session isolation

https://github.com/lencx/Noi
1•handystudio•11m ago•1 comments

We Deleted Our Vector Database

https://www.turingmind.ai/blogs/local-deleted-vector-database
1•vinkupa•13m ago•0 comments

Gitix – The Collaboration Layer for AI Intelligence

https://gitix.ai/
1•azolf•14m ago•0 comments

Taxpayers miss out on millions after 'phoenixism' at UK recruitment firms

https://www.theguardian.com/business/2026/jan/06/recruitment-firms-phoenixism-liquidation-avoid-t...
2•piqufoh•22m ago•0 comments

Debunking the AI food delivery hoax that fooled Reddit

https://www.platformer.news/fake-uber-eats-whisleblower-hoax-debunked/
3•hoopla_ching•25m ago•0 comments

Are There Fourth Amendment Rights in Google Search Terms?

https://reason.com/volokh/2025/12/16/are-there-fourth-amendment-rights-in-google-search-terms/
1•Jimmc414•27m ago•0 comments

WebAssmbly in Containers in 2026?

https://thenewstack.io/wasi-1-0-you-wont-know-when-webassembly-is-everywhere-in-2026/
1•ddmng•28m ago•0 comments

Show HN: Is this the best epoch converter?

https://epochconverter.dev/
2•subhash_k•31m ago•0 comments

How Much Are GitHub Stars Worth to You? [2023]

https://the-guild.dev/blog/judging-open-source-by-github-stars
1•alexpadula•31m ago•4 comments

Tools for finding Busy Beaver Turing Machines and Proving others as non-halting

https://github.com/sligocki/busy-beaver
1•frozenseven•32m ago•0 comments

Buffer overflow in /bin/su from Unix v4

https://www.openwall.com/lists/oss-security/2026/01/05/10
3•Deeg9rie9usi•34m ago•0 comments

Show HN: Turn Meetings into Presentations with AI

https://notefy.pro/
2•jimmydin7•36m ago•0 comments

The art of text rendering [video]

https://app.media.ccc.de/v/39c3-the-art-of-text-rendering
2•internet_points•36m ago•0 comments

SCiZE's Classic Warez Collection

https://scenelist.org/
12•achairapart•36m ago•2 comments

Show HN: Imgity – AI Image Creation and Editing Using Nano Banana Pro

https://imgity.com/
1•SherlockShi•38m ago•0 comments

"Free" Docker Hardened Images poses a security risk

https://github.com/orgs/docker-hardened-images/discussions/101
1•cyanboy•38m ago•0 comments

MinIO: Update README.md with latest free license and enterprise option

https://github.com/minio/minio/commit/be7800c8136eadff2ba012412dd6c2e5fdcb548a
1•tamnd•39m ago•0 comments

Show HN: Milkyboard – Synth Keyboard with Milkdrop Visualizer

https://milkyboard.com/
1•amadeuspagel•40m ago•0 comments

Dealing with Faulty RAM in 2026

https://blog.kumio.org/posts/2026/01/memtest86plus.html
2•kumiokun•41m ago•0 comments

Keys to Understanding Trump's Retro Coup in Venezuela

https://www.wired.com/story/3-keys-understanding-trumps-retro-coup-in-venezuela/
1•quapster•43m ago•0 comments

AI-generated sensors open new paths for early cancer detection

https://news.mit.edu/2026/ai-generated-sensors-open-new-paths-early-cancer-detection-0106
2•fleahunter•48m ago•0 comments

TypeScript, from Structural to Nominal Typing

https://nanamanu.com/posts/branded-types-typescript/
1•claeusdev•49m ago•0 comments

Enemies not allowed to control large oil reserves, US ambassador to UN says

https://www.middleeasteye.net/news/enemies-not-allowed-control-large-oil-reserves-us-ambassador-u...
2•haritha-j•49m ago•3 comments