Chaos Testing for LLM Agents

https://github.com/arielshad/balagan-agent
1•ArielSh•1h ago

Comments

ArielSh•1h ago
An open-source experiment applying chaos-engineering ideas to LLM agents.

Agent development feels fast, but reliability is mostly assumed. Agents depend on prompts, tools, APIs, and implicit coordination. When something breaks, behavior degrades in subtle ways and we usually find out too late.

Balagan Agent intentionally injects controlled “chaos” into agent workflows to surface failure modes early:

- Tool failures, latency, partial responses
- Prompt drift and unexpected decisions
- Hidden assumptions in sequencing and coordination
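
To make the first kind of injection (tool failures, latency, partial responses) concrete, here is a minimal sketch of a wrapper around a tool callable. It is not Balagan Agent's actual API: `chaos_wrap`, `ToolFailure`, the `search` stand-in tool, and all parameter names are hypothetical and chosen only for illustration.

```python
import random
import time


class ToolFailure(Exception):
    """Hypothetical error raised when chaos injects a tool failure."""


def chaos_wrap(tool, *, fail_rate=0.0, max_delay=0.0, truncate_rate=0.0, rng=None):
    """Wrap a tool callable so it sometimes fails, stalls, or truncates output.

    Every name and parameter here is an illustrative assumption.
    """
    if rng is None:
        rng = random.Random()

    def wrapped(*args, **kwargs):
        # Injected latency: sleep for a random duration up to max_delay seconds.
        if max_delay > 0:
            time.sleep(rng.uniform(0, max_delay))
        # Injected failure: raise instead of calling the real tool.
        if rng.random() < fail_rate:
            raise ToolFailure(f"injected failure in {tool.__name__}")
        result = tool(*args, **kwargs)
        # Injected partial response: keep only the first half of a string result.
        if isinstance(result, str) and rng.random() < truncate_rate:
            return result[: len(result) // 2]
        return result

    return wrapped


def search(query):
    """Stand-in tool used only for this example."""
    return f"results for {query}"


if __name__ == "__main__":
    # A seeded RNG keeps a chaos run reproducible between test sessions.
    flaky_search = chaos_wrap(
        search, fail_rate=0.3, truncate_rate=0.3, rng=random.Random(0)
    )
    for _ in range(5):
        try:
            print(flaky_search("chaos"))
        except ToolFailure as exc:
            print("tool failed:", exc)
```

Running an agent against wrapped tools like this, and checking whether it retries, falls back, or silently carries on with a truncated result, is one way to see where guardrails are missing.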

The goal is not load testing, but understanding how fragile an agent really is and where guardrails are needed.

This started as a side project to explore whether chaos-style testing makes sense for agents, similar in spirit to what Chaos Monkey did for distributed systems.

chrisjj•1h ago
> Problem

> * Agents fail silently in production

> * Tool calls time out, return garbage, or hallucinate

> * Context gets corrupted, budgets get exhausted

> * Nobody knows until users complain

Nobody? You can't blame a sick parrot for its keeper's failure to monitor it.

What I found reading Claude's leaked 57K-word system prompts

1•jbetala7•38s ago•1 comments

Show HN: KnowledgeForAI – remote MCP for various data sources

https://knowledgeforai.com/
1•winchester6788•1m ago•0 comments

Tell HN: Beeper deletes inactive accounts without notice

1•kldx•1m ago•0 comments

Patients Are Often More Honest with AI Than Clinicians [video]

https://www.youtube.com/watch?v=97HLETD7CGY
1•vitlyoshin•2m ago•1 comments

Show HN: Visual bug reports with screenshots, console logs, and network requests

https://feedbackotter.com
1•mohitgangrade•5m ago•1 comments

Younger Americans see U.S. dominance slipping to China

https://www.axios.com/2026/01/28/american-gen-z-china-competition-economics
2•giuliomagnifico•5m ago•0 comments

Project Genie: An experimental research prototype

https://www.threads.com/@google/post/DUGhcK8kvX-
1•simonpure•5m ago•0 comments

Claude and I have a proper first date

https://h4x0r.org/a-date-with-claude/
1•eatonphil•5m ago•0 comments

EU/CoE country badge-generator

https://country-badges.eu/
2•AxelWickman•5m ago•0 comments

Verge: Formal Refinement and Guidance Engine for Verifiable LLM Reasoning

https://arxiv.org/abs/2601.20055
1•vikashjohn2505•5m ago•1 comments

What escapes containment is least valuable

https://hollisrobbinsanecdotal.substack.com/p/what-escapes-containment-is-less
1•HR01•9m ago•0 comments

Common Plastic Chemical BPA Found to Feminize Males and Masculinize Females

https://scitechdaily.com/common-plastic-chemical-found-to-feminize-males-and-masculinize-females/
1•OutOfHere•9m ago•0 comments

Krawl: A honeypot and deception server one month later

https://demo.krawlme.com/das_dashboard
2•blessedrebus•10m ago•1 comments

Royal Navy forces Russian ship out of British waters

https://www.telegraph.co.uk/news/2026/01/28/russian-ship-anchors-trans-atlantic-cables-bristol-ch...
1•speckx•10m ago•0 comments

Milky Way is embedded in a 'large-scale sheet' of dark matter

https://phys.org/news/2026-01-milky-embedded-large-scale-sheet.html
1•rbanffy•11m ago•0 comments

Programming as Theory Building [pdf]

https://pablo.rauzy.name/dev/naur1985programming.pdf
3•SchwKatze•14m ago•0 comments

Building Cryptographic Agility into Sigstore

https://blog.trailofbits.com/2026/01/29/building-cryptographic-agility-into-sigstore/
1•CiPHPerCoder•16m ago•0 comments

Ask HN: How do you evaluate whether a CV research idea is worth pursuing?

1•mostlyk•17m ago•0 comments

Adding dynamic features to an aggressively cached website

https://simonwillison.net/2026/Jan/28/dynamic-features-static-site/
1•ulrischa•17m ago•0 comments

South Korea's 'world-first' AI laws face pushback

https://www.theguardian.com/world/2026/jan/29/south-korea-world-first-ai-regulation-laws
1•lnguyen•18m ago•0 comments

Show HN: Guide to Writing Better AI Prompts

https://howtomakethebestprompt.com/
1•detroitwebsites•20m ago•0 comments

The Largest Zip Tie Is Nearly 4 Feet Long and $75

https://www.thedrive.com/news/youll-have-that-on-those-big-jobs-the-worlds-largest-zip-tie-is-nea...
1•PaulHoule•20m ago•0 comments

Shift more left with coding agents

https://gricha.dev/blog/shift-more-left-with-coding-agents
2•surprisetalk•21m ago•0 comments

FAQ: Memorization

https://pgadey.ca/notes/faq-memorization/
1•surprisetalk•21m ago•0 comments

Plantable Brings Plants and Tables Together in the Workplace

https://design-milk.com/plantable-brings-plants-and-tables-together-in-the-workplace/
2•surprisetalk•21m ago•0 comments

Attilio Berni plays the sub-contrabass saxophone [video]

https://www.youtube.com/watch?v=9BiW2mVKk0w
1•surprisetalk•21m ago•0 comments

Project Genie: Interactive worlds generated in real-time

https://labs.google/projectgenie
7•jedixit•22m ago•0 comments

TikTok Competitor UpScrolled Hits No. 1 on App Store

https://www.forbes.com/sites/conormurray/2026/01/29/tiktok-competitor-upscrolled-hits-no-1-on-app...
2•nullchan•22m ago•0 comments

Agentic Vision in Gemini 3 Flash

https://blog.google/innovation-and-ai/technology/developers-tools/agentic-vision-gemini-3-flash/
3•pretext•23m ago•0 comments

Joybubbles, early phone phreak, Documentary

https://www.joybubblesthemovie.com
2•ChrisArchitect•24m ago•2 comments