
Show HN: ToolGuard – Pytest for AI agent tool calls

1•Heer_J•1h ago
I got tired of my AI agents crashing because the LLM hallucinated a JSON key or passed a string instead of an int. So I built ToolGuard: it fuzzes your Python tool functions with edge cases (nulls, missing fields, type mismatches, 10MB payloads) and gives you a reliability score from 0 to 100%.

No LLM needed to run tests. It reads your type hints, generates a Pydantic schema, and deterministically breaks things.
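To make the idea concrete, here is a rough sketch of deterministic fuzzing driven by type hints. It is not ToolGuard's actual API or implementation: it uses only the standard library instead of a generated Pydantic schema, and `EDGE_CASES`, `fuzz_tool`, and `get_weather` are hypothetical names made up for illustration. It also assumes that raising `ValueError` counts as a controlled rejection, while any other exception counts as a crash.

```python
from typing import Any, Callable, get_type_hints

# Hypothetical edge-case corpus (an assumption, not ToolGuard's real list).
EDGE_CASES: list[Any] = [None, "", "not-a-number", -1, 0, 3.5, [], {}, "x" * 10_000]

_MISSING = object()  # sentinel meaning "leave this argument out entirely"


def get_weather(city: str, days: int) -> str:
    """Example tool, fragile on purpose: it does no input validation."""
    return f"{city.title()}: forecast for {days + 0} day(s)"


def fuzz_tool(tool: Callable[..., Any], valid_args: dict[str, Any]) -> float:
    """Replace one argument at a time with an edge case and return the
    percentage of calls that either succeed or raise ValueError."""
    hints = get_type_hints(tool)
    hints.pop("return", None)  # only parameters are fuzzed
    total = passed = 0
    for name in hints:
        for bad in [*EDGE_CASES, _MISSING]:
            args = dict(valid_args)
            if bad is _MISSING:
                args.pop(name, None)  # simulate a missing field
            else:
                args[name] = bad  # simulate a null, mismatch, or big payload
            total += 1
            try:
                tool(**args)
                passed += 1
            except ValueError:
                passed += 1  # assumed to be a deliberate, controlled rejection
            except Exception:
                pass  # TypeError, AttributeError, etc. count as crashes
    return 100.0 * passed / total if total else 100.0


score = fuzz_tool(get_weather, {"city": "Pune", "days": 3})
print(f"reliability: {score:.0f}%")
```

Because the inputs are a fixed list iterated in a fixed order, two runs against the same function give the same score, which is what makes this kind of check usable in CI without a model in the loop.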

pip install py-toolguard

GitHub: https://github.com/Harshit-J004/toolguard

If you are building complex tool chains, I would be incredibly honored if you checked out the repo. Brutal feedback on the architecture is highly encouraged!

Comments

calebjang•1h ago
This is interesting because a lot of agent failures happen before any “reasoning” issue shows up.

If the tool boundary itself is unstable — wrong field names, wrong types, missing required values — the rest of the stack doesn’t really matter.

Deterministic fuzzing for tool calls seems especially useful because it gives you a way to test execution reliability without depending on another model in the loop.
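As a rough illustration of those three boundary failures, a proposed call can be checked against the tool's signature before anything executes. This is a standard-library sketch, not how ToolGuard does it; `check_call` and `book_table` are hypothetical names, and only plain class hints such as `int` or `str` are type-checked.

```python
import inspect
from typing import Any, Callable, get_type_hints


def book_table(guests: int, name: str, notes: str = "") -> str:
    """Hypothetical tool used only for this example."""
    return f"Table for {guests} under {name}. {notes}".strip()


def check_call(tool: Callable[..., Any], args: dict[str, Any]) -> list[str]:
    """Return the problems found in a proposed call; an empty list means none."""
    params = inspect.signature(tool).parameters
    hints = get_type_hints(tool)
    problems = []
    # Wrong field names: keys the tool does not accept.
    for name in args:
        if name not in params:
            problems.append(f"unknown field: {name}")
    for name, param in params.items():
        if name not in args:
            # Missing required values: no default to fall back on.
            if param.default is inspect.Parameter.empty:
                problems.append(f"missing required field: {name}")
            continue
        expected = hints.get(name)
        # Wrong types: generic hints like list[int] are skipped in this sketch.
        if isinstance(expected, type) and not isinstance(args[name], expected):
            problems.append(
                f"wrong type for {name}: expected {expected.__name__}, "
                f"got {type(args[name]).__name__}"
            )
    return problems


for problem in check_call(book_table, {"guests": "4", "nme": "Ada"}):
    print(problem)
```

With the misspelled `nme` key and the string `"4"`, this reports an unknown field, a wrong type for `guests`, and a missing required `name`, all without calling the tool or a model.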

Heer_J•23m ago
Totally spot on, Caleb! That was exactly the frustrating realization that led me to build this.

I've spent so much time obsessing over which model is smarter or tweaking our system prompts, but if a simple None value causes a backend TypeError that crashes the whole runtime, none of that reasoning truly matters. The agent is basically dead in the water.

By fuzzing the execution environment deterministically first, we can guarantee the "hands" of the agent work perfectly before we even bother testing the "brain". Plus, it runs in milliseconds during CI/CD instead of waiting for expensive API calls.

Show HN: AI agents predicted every March Madness game – live bracket tracker

https://tryprobe.io/bracket
1•masonwyatt23•1m ago•0 comments

Mental Model Discovery

https://wibomd.substack.com/p/on-mental-model-discovery
1•paulpauper•2m ago•0 comments

Military report says live fire malfunction rained shrapnel on California highway

https://apnews.com/article/marines-250th-camp-pendleton-shrapnel-california-367baee09300fdfce0811...
1•petethomas•3m ago•0 comments

What stack of models is behind this tool?

https://www.linkslist.app/UKFJ6Fk
3•ClipNoteBook•4m ago•2 comments

UC Irvine researchers bring down AI powered drones with painted umbrellas

https://arxiv.org/abs/2509.20362
2•jcalvinowens•6m ago•2 comments

404PageFound – Active Vintage Websites, Old Webpages, and Web 1.0

https://www.404pagefound.com/
1•OuterVale•9m ago•0 comments

Demanufacture

https://dmnfctr.xyz/
1•rojoroboto•9m ago•0 comments

Trapped Inside a Self-Driving Car During an Anti-Robot Attack

https://www.nytimes.com/2026/03/17/technology/trapped-inside-a-self-driving-car-during-an-anti-ro...
1•1vuio0pswjnm7•10m ago•0 comments

Unsubscribe from the Church of Graphs

https://www.adorableandharmless.com/p/unsubscribe-from-the-church-of-graphs
1•paulpauper•11m ago•0 comments

FounderBox by Loopxo

https://founderbox.loopxo.org
2•vijeet_•11m ago•1 comments

Show HN: Mcpwire – Connect to MCP servers in 2 lines of TypeScript

https://github.com/ctonneslan/mcpwire
1•CTonneslan•13m ago•0 comments

Bridge Bank – self-hosted European Banks to Actual Budget sync

1•Adjadj•13m ago•0 comments

Pardoned for Fraud, a CEO Mounts His Comeback: 'We Can Trust You Now'

https://www.wsj.com/business/trevor-milton-pardon-nikola-trump-3163e19c
1•KnuthIsGod•17m ago•0 comments

Robotocore · a Digital Twin of AWS

https://github.com/robotocore/robotocore
1•pkaeding•18m ago•0 comments

Show HN: API-based arbitrated marketing contracts (AI SEO, SEO, DR)

https://www.zobooma.com/
1•compulsivebuild•20m ago•1 comments

Reddit New Post 3

https://old.reddit.com/r/PisequaltoNP/comments/1rwq38m/pnp_solving_sat_via_transcendent_reduction/
1•KaoruAK•23m ago•0 comments

Startup Is Probably Dead on Arrival

https://steveblank.com/2026/03/17/your-startup-is-probably-dead-on-arrival/
3•Brajeshwar•25m ago•0 comments

OpenAI Has New Focus (On the IPO)

https://om.co/2026/03/17/openai-has-new-focus-on-the-ipo/
2•brandonb•27m ago•0 comments

How Willard Wigan Makes the Smallest Handmade Sculptures [video]

https://www.youtube.com/watch?v=3YOdH2wqL9M
1•1659447091•29m ago•0 comments

Ask HN: When do you think the ChatGPT moment will be for medicine?

2•general_reveal•29m ago•1 comments

Show HN: AI Skills for Affiliate Marketing – Works with Claude, ChatGPT

https://github.com/Affitor/affiliate-skills
2•sonpiaz•29m ago•2 comments

Show HN: AI agents debating questions that stump LLMs

https://factagora.com/
1•ttlcc13•32m ago•0 comments

Show HN: Elida – Session Border Controller for AI Agents

https://elida.dev/blog/2026/03/15/your-ai-soc-and-sre-agents-need-a-border-controller/
2•zamorofthat•35m ago•1 comments

When Allies are Allies no more

https://damienduncan.substack.com/p/allies-in-name-only-the-day-the-world
1•Politicrux•36m ago•0 comments

Unscale the Internet

https://www.ystrickler.com/unscale-the-internet/
3•YounesDz•37m ago•0 comments

Kennedy childhood vaccine overhaul stalled by judge

https://www.statnews.com/2026/03/16/kennedy-childhood-vaccine-changes-blocked-judge/
3•gmays•38m ago•0 comments

Launch an autonomous AI agent with sandboxed execution in 2 lines of code

https://amaiya.github.io/onprem/examples_agent.html
4•wiseprobe•38m ago•0 comments

Every Bracket

https://every-bracket.com
1•michaefe•39m ago•0 comments

Neighbors Say SF Tesla Supercharger Lot Has Become Urine Dumping Ground

https://www.sfgate.com/local/article/tesla-supercharger-lot-lombard-22080348.php
10•randycupertino•42m ago•2 comments

Reverse as the VibeVerse (why pivot, when you can punk)

https://attentionhorizon.world/
2•maieuticagent•42m ago•1 comments