
Show HN: Your AI agent logged the mistake. Mine wasn't allowed to make it

https://github.com/agentbouncr/agentbouncr
1•Soenke_Cramme•1h ago

Comments

Soenke_Cramme•1h ago
Most agent frameworks treat governance as an afterthought — log what happened, review it later, hope nothing went wrong. I built the opposite: a middleware that evaluates every tool call before execution and blocks it if the policy says no.

After shipping this and getting feedback from people running thousands of multi-agent dispatches, the pattern that keeps coming up is the same: teams build agents, deploy them, something breaks, and then they start thinking about what the agent should have been allowed to do.

How it works in practice: an agent wants to call approve_payment with amount: 12000. Before the tool executes, the policy engine checks: is this agent allowed to call this tool with these parameters? The answer is deterministic — no LLM involved, no prompt engineering, just JSON rules evaluated in <5ms.

```typescript
const result = await governance.evaluate({
  agentId: 'claims-agent',
  tool: 'approve_payment',
  params: { amount: 12000 },
});
// result.allowed = false
// result.reason = "Payments over 5000 require manual approval"
```

Every decision — allowed or denied — gets recorded in a SHA-256 hash-chained audit trail. If someone tampers with an entry, the chain breaks and verification fails. If something goes seriously wrong, the kill switch blocks all tool calls synchronously in <100ms. It works even when your LLM provider is down, because the governance layer doesn't depend on it.

What I learned from production:

Auto-Discovery turned out to be more useful than I expected. Agents call tools you didn't plan for. The system registers unknown tools automatically, so you get a live inventory of what your agents actually do, then tighten policies based on real data instead of guessing upfront.

Permission Score (the ratio of allowed vs. actually-used tools per agent) surfaces over-permissioning fast. Most agents have access to far more tools than they need.

The compound error problem is real: 90% accuracy per step sounds great until you chain four steps and you're at 66%. Per-step governance catches failures that end-to-end testing misses.
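The arithmetic behind those two lessons is easy to sanity-check. A quick sketch (the `permissionScore` helper and its exact ratio are my own naming and interpretation, not agentbouncr's API):

```typescript
// Per-step accuracy compounds multiplicatively across a chain of steps:
// four 90% steps land at 0.9^4 ≈ 0.656, i.e. roughly 66%.
const chainAccuracy = (perStep: number, steps: number): number =>
  Math.pow(perStep, steps);

// "Permission Score" as described in the post: how much of what an agent
// is allowed to do it actually uses. A low score signals over-permissioning.
// Name and definition here are illustrative.
const permissionScore = (used: string[], allowed: string[]): number =>
  used.filter((t) => allowed.includes(t)).length / allowed.length;

console.log(chainAccuracy(0.9, 4)); // ≈ 0.6561
console.log(permissionScore(['send_email'], ['send_email', 'approve_payment', 'query_database'])); // ≈ 0.33
```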

Stack: TypeScript, 1,264 tests. Works with LangChain, Vercel AI SDK, CrewAI, n8n, and anything that calls tools via HTTP; framework-agnostic by design. Source-available under Elastic License v2 (use it, modify it, embed it — just can't offer it as a competing hosted service).

Code: https://github.com/agentbouncr/agentbouncr
Site: https://agentbouncr.com
npm: npm install @agentbouncr/core
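The post doesn't show what the JSON rules behind `governance.evaluate` look like. As a minimal sketch of how a deterministic, LLM-free evaluation of that kind might work (the `Rule` shape, field names, and `maxAmount` condition are all my guesses for illustration, not the project's actual schema):

```typescript
// Hypothetical rule shape - invented for illustration, not agentbouncr's schema.
type Rule = { agentId: string; tool: string; maxAmount?: number };

const rules: Rule[] = [
  { agentId: 'claims-agent', tool: 'approve_payment', maxAmount: 5000 },
];

// Pure function over JSON-like data: no model call, so the answer is
// deterministic and fast, and it works even if the LLM provider is down.
function evaluate(
  agentId: string,
  tool: string,
  params: { amount?: number },
): { allowed: boolean; reason: string } {
  const rule = rules.find((r) => r.agentId === agentId && r.tool === tool);
  if (!rule) return { allowed: false, reason: `No rule for ${tool}` };
  if (rule.maxAmount !== undefined && (params.amount ?? 0) > rule.maxAmount) {
    return {
      allowed: false,
      reason: `Payments over ${rule.maxAmount} require manual approval`,
    };
  }
  return { allowed: true, reason: 'ok' };
}

const result = evaluate('claims-agent', 'approve_payment', { amount: 12000 });
// result.allowed === false, matching the example in the post
```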

andai•1h ago
Interesting idea. This is only relevant in the case where the agent can't run bash or write code, right? (I wonder if there should be a name for those two categories, since they seem fundamentally different in terms of capabilities and security.)
Soenke_Cramme•1h ago
Good question. It works in both cases, but differently.

Agents that can't run bash or write code (most production agents today — they call defined tools like approve_payment, send_email, query_database): the policy engine evaluates each tool call against rules. Straightforward — you know the tool surface, you define what's allowed.

Agents that CAN run bash or write code (coding agents, infrastructure agents): this is where it gets interesting. The governance layer treats execute_shell or write_file as tools like any other. You can deny them entirely, or use condition operators to restrict parameters — e.g. path: { startsWith: '/etc/' } → denied, command: { contains: 'rm -rf' } → denied. It's not a sandbox — it's a policy gate before the sandbox.

You're right that these are fundamentally different categories. The first has a bounded tool surface; the second has an essentially unbounded one. For unbounded agents, governance shifts from "enumerate what's allowed" to "enumerate what's definitely not allowed" — a denylist approach rather than an allowlist. The policy engine supports both ({ tool: '*', effect: 'allow' } with specific deny rules on top).

The naming distinction is interesting. I've been thinking about it as "bounded agents" vs. "generative agents" — the first calls predefined tools, the second generates its own actions. The governance architecture is different for each, but both need it.
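The deny-on-top-of-wildcard-allow pattern described in the reply can be sketched in a few lines. The operator names (startsWith, contains) follow the comment; the rule shapes and function names are illustrative, not the project's real types:

```typescript
// Sketch of a default-allow policy with specific deny rules on top,
// as described for unbounded (bash-capable) agents. Shapes are illustrative.
type Condition = { startsWith?: string; contains?: string };

function matches(value: string, cond: Condition): boolean {
  if (cond.startsWith !== undefined && !value.startsWith(cond.startsWith)) return false;
  if (cond.contains !== undefined && !value.includes(cond.contains)) return false;
  return true;
}

// Deny rules are checked first; anything they don't match falls through
// to the equivalent of { tool: '*', effect: 'allow' }.
const denyRules = [
  { tool: 'write_file', param: 'path', cond: { startsWith: '/etc/' } },
  { tool: 'execute_shell', param: 'command', cond: { contains: 'rm -rf' } },
];

function isAllowed(tool: string, params: Record<string, string>): boolean {
  for (const rule of denyRules) {
    if (rule.tool !== tool) continue;
    const value = params[rule.param];
    if (value !== undefined && matches(value, rule.cond)) return false; // explicit deny wins
  }
  return true; // wildcard allow
}

console.log(isAllowed('execute_shell', { command: 'rm -rf /tmp/x' })); // false
console.log(isAllowed('execute_shell', { command: 'ls -la' })); // true
```

The key property is the same one the reply leans on: denies are evaluated before the wildcard allow, so a bounded denylist can govern an unbounded action space.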

Show HN: JSON-up – Like database migrations, but for JSON

https://github.com/Nano-Collective/json-up
1•mrspence•2m ago•0 comments

Claude's Corner

https://claudeopus3.substack.com/p/introducing-claudes-corner
1•ykl•2m ago•0 comments

The Jolly Writer

https://www.scypress.com/book_info.html
1•amichail•3m ago•0 comments

Krazam Presents: Paradise (Trailer) [video]

https://www.youtube.com/watch?v=cjEUZ-ChU7A
1•Topfi•3m ago•0 comments

Gnuit – GNU Interactive Tools

https://www.gnu.org/software/gnuit/
1•mghackerlady•4m ago•0 comments

Show HN: A tiny utility to rewrite Bash functions as standalone scripts

https://github.com/zahlman/func2cmd
1•zahlman•5m ago•0 comments

Xaml.io v0.6: Share Running .NET Code with a Link

https://xaml.io/
1•vasbu•6m ago•0 comments

Rust 1.94 Cargo Updates

https://blog.rust-lang.org/inside-rust/2026/02/18/this-development-cycle-in-cargo-1.94/
1•andrewstetsenko•7m ago•0 comments

Show HN: LLM Colosseum – A daily battle royale between frontier LLMs

https://llmcolosseum.dev
1•sanifhimani•7m ago•0 comments

Show HN: Gitbusiness.com I created it, and Indeed, I use my own stuff

1•gitprolinux•8m ago•1 comments

Air Pollution Doesn't Kill Like You Think It Does

https://smartairfilters.com/en/blog/smog-air-pollution-kills-deaths/
1•jerlam•8m ago•0 comments

Deterministic Programming with LLMs

https://www.mcherm.com/deterministic-programming-with-llms.html
1•todsacerdoti•9m ago•0 comments

Show HN: AgentGuard – Open-source EU AI Act compliance middleware for LLM apps

https://github.com/Sagar-Gogineni/agentguard
1•rishi_gogi•10m ago•0 comments

Nvidia Q4 beat as AI infrastructure demand booms

1•agentifysh•11m ago•0 comments

Meta's AI sending 'junk' CSAM tips to DOJ

https://www.theguardian.com/technology/2026/feb/25/meta-ai-junk-child-abuse-tips-doj
2•ilamont•12m ago•0 comments

A Natick couple wanted $500M from eBay for harassment. They've settled

https://www.bostonglobe.com/2026/02/25/business/ebay-settlement-harass-steiner/
1•apress•15m ago•0 comments

Framedeck: A Framework mainboard based Cyberdeck (2022)

https://github.com/brickbots/framedeck
1•birdculture•15m ago•0 comments

Nvidia Announces Financial Results for Fourth Quarter and Fiscal 2026

https://nvidianews.nvidia.com/news/nvidia-announces-financial-results-for-fourth-quarter-and-fisc...
3•kamaraju•15m ago•0 comments

Text Your Site: Realtime multiplayer LLM-powered text-to-website demo

https://textyoursite.com/demo
1•elliotbnvl•16m ago•1 comments

Show HN: AgentMD – CI/CD for AI agents, makes AGENTS.md executable

1•iedayan03•17m ago•0 comments

Show HN: Belisarius, one app to manage many repo

https://codeberg.org/alelavelli/belisarius
1•militanz•17m ago•0 comments

A text based life simulator that gives you freedom

https://www.lifespans.app
1•jwatermelon•19m ago•1 comments

Perplexity Computer

https://www.perplexity.ai/computer/live/ascii-canvas-editor-web-app-sXZtwVA8QCaOE_hAHzNSKQ
1•evo_9•19m ago•0 comments

Pyrroloquinoline Quinone (PQQ): Its impact on human health and potential benefits

https://pmc.ncbi.nlm.nih.gov/articles/PMC11541945/
1•bookmtn•20m ago•0 comments

Greetings from the Other Side (Of the AI Frontier) by Claude (Opus 3)

https://substack.com/home/post/p-189177740
1•nadis•20m ago•1 comments

Rotating nozzle 3D printing creates air-powered soft robots with preset bends

https://techxplore.com/news/2026-02-rotating-nozzle-3d-air-powered.html
2•PaulHoule•22m ago•0 comments

Walfie's Nonograms

https://walfie.itch.io/walfies-nonograms
1•trms•24m ago•0 comments

Automated pentesting with MCPwner (finds 0-days)

https://github.com/Pigyon/MCPwner
2•lolz_are_good•24m ago•0 comments

Certain Occupations Linked to Higher IBD Risk

https://www.medscape.com/viewarticle/certain-occupations-linked-higher-ibd-risk-2026a10005g5
1•wjb3•25m ago•1 comments

High Stakes in Cyberspace – PBS Frontline (1995) [video]

https://www.youtube.com/watch?v=kvef46Lb9QI
2•abixb•25m ago•0 comments