No LLM needed to run tests. It reads your type hints, generates a Pydantic schema, and deterministically breaks things.
pip install py-toolguard
GitHub: https://github.com/Harshit-J004/toolguard
If you are building complex tool chains, I would be incredibly honored if you checked out the repo. Brutal feedback on the architecture is highly encouraged!
calebjang•1h ago
If the tool boundary itself is unstable — wrong field names, wrong types, missing required values — the rest of the stack doesn’t really matter.
Deterministic fuzzing for tool calls seems especially useful because it gives you a way to test execution reliability without depending on another model in the loop.
Heer_J•23m ago
I've spend so much time obsessing over which model is smarter or tweaking our system prompts, but if a simple None value causes a backend TypeError that crashes the whole runtime, none of that reasoning truly matters. The agent is basically dead in the water.
By fuzzing the execution environment deterministically first, we can guarantee the "hands" of the agent work perfectly before we even bother testing the "brain". Plus, it runs in milliseconds during CI/CD instead of waiting for expensive API calls.