CrabTrap: An LLM-as-a-judge HTTP proxy to secure agents in production

https://www.brex.com/crabtrap
59•pedrofranceschi•9h ago
https://www.brex.com/journal/building-crabtrap-open-source

Comments

yakkomajuri•1h ago
Really cool! I'm also building something in this space but taking a slightly different approach. I'm glad to see more focus on security for production agentic workflows though, as I think we don't talk about it enough when it comes to claws and other autonomous agents.

I think you're spot on that so far it's been either all or nothing. You either give an agent a lot of access, making it really powerful but proportionally dangerous, or you lock it down so much that it's no longer useful.

I like a lot of the ideas you show here, but I also worry that LLM-as-a-judge is fundamentally a probabilistic guardrail that is inherently limited. How do you see this? It feels dangerous to rely on a security system that's not based on hard limitations but rather probabilities?

DANmode•1h ago
We’re supposed to be fixing LLM security by adding a non-LLM layer to it,

not adding LLM layers to stuff to make them inherently less secure.

This will be a neat concept for the types of tools that come after the present iteration of LLMs.

Unless I’m sorely mistaken.

SkyPuncher•1h ago
Defense in depth. Layers don't inherently make something less secure. Often, they make it more secure.
yakkomajuri•1h ago
I do think this is likely to make things more secure but it's also dangerous by potentially giving users a false sense of complete security when the security layer is probabilistic rather than deterministic.

EDIT: it does seem to have a deterministic layer too and I think that's great

reassess_blind•1h ago
It looks as if this tool has traditional static rules to allow/deny requests, as well as a secondary LLM-as-a-judge layer for, I imagine, the kinds of rules that would be messy or too convoluted to implement using standard rules.
snug•1h ago
I think this can be great as an additional layer of security: you can have a non-LLM layer do some analysis with static rules and then, if something seems fishy, run it through the LLM judge, so that you don’t have to run every request through it, which would be very expensive.

Edit: actually looks like it has two policy engines embedded
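
For illustration, the tiered setup described above (static rules first, LLM judge only for what the rules cannot settle) could be sketched roughly as follows. This is not CrabTrap's code: the `Request` type, the `ALLOWED_HOSTS` and `DENIED_METHODS` rule sets, and the `judge` callable are all hypothetical names chosen for the example, and a real judge would call a model rather than a stub.

```python
from dataclasses import dataclass
from typing import Callable, Optional
from urllib.parse import urlsplit

# Hypothetical static rules for the sketch; not CrabTrap's actual configuration.
ALLOWED_HOSTS = {"api.example.com"}
DENIED_METHODS = {"DELETE"}


@dataclass(frozen=True)
class Request:
    method: str
    url: str
    body: str = ""


def static_verdict(req: Request) -> Optional[bool]:
    """Deterministic layer: True = allow, False = deny, None = undecided."""
    host = urlsplit(req.url).hostname or ""
    method = req.method.upper()
    if host not in ALLOWED_HOSTS:
        return False  # unknown destination: hard deny
    if method in DENIED_METHODS:
        return False  # destructive method: hard deny
    if method == "GET":
        return True   # read-only call to an allowed host: hard allow
    return None       # anything else is ambiguous and gets escalated


def decide(req: Request, judge: Callable[[Request], bool]) -> bool:
    """Consult the (expensive, probabilistic) judge only when the rules are undecided."""
    verdict = static_verdict(req)
    if verdict is not None:
        return verdict
    return judge(req)
```

With this shape, the judge never sees requests that the static rules already settle, which keeps the added latency and token cost limited to the ambiguous cases.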

windexh8er•54m ago
And we don't think the judge can/will be gamed? Also... It's an LLM, it's going to add delay and additional token burn. One subjective black box protecting another subjective black box. I mean, what couldn't go wrong?
ImPostingOnHN•54m ago
What happens when a prompt injection attack exploits the judge LLM and results in a higher level of attacker control than if it never existed?
vova_hn2•10m ago
How can it result in a higher level of control? I don't see why the "judge" should have access to anything except one tool that allows it to send an "accept" or "deny" command.
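
A rough sketch of that idea, assuming the judge's raw text reply is the only thing that leaves it: the proxy maps the reply onto a two-value verdict and treats anything unexpected as a denial. The `Verdict` enum and `parse_verdict` function are hypothetical names for this example, not part of any real tool.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"
    DENY = "deny"


def parse_verdict(raw: str) -> Verdict:
    """Map the judge's raw reply to a verdict, failing closed.

    Only the exact token "accept" (ignoring case and surrounding
    whitespace) allows the request; any other output is a denial.
    """
    token = raw.strip().lower()
    if token == Verdict.ACCEPT.value:
        return Verdict.ACCEPT
    return Verdict.DENY
```

This limits what a manipulated judge can do to flipping a single allow/deny decision; it does not stop the flip itself.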
nl•15m ago
> We’re supposed to be fixing LLM security by adding a non-LLM layer to it,

If people said "we built an ML-based classifier into our proxy to block dangerous requests" would it be better? Why does the fact that the classifier is an LLM make it somehow worse?

Retr0id•3m ago
The fact that LLMs are "smarter" is also their weakness. An old-school classifier is far from foolproof, but you won't get past it by telling it about your grandma's bedtime story routine.
roywiggins•16m ago
It's all fine until OpenClaw decides to start prompt injecting the judge

ChatGPT Images 2.0

https://openai.com/index/introducing-chatgpt-images-2-0/
344•wahnfrieden•5h ago•357 comments

SpaceX says it has agreement to acquire Cursor for $60B

https://twitter.com/spacex/status/2046713419978453374
136•dmarcos•2h ago•192 comments

The Vercel breach: OAuth attack exposes risk in platform environment variables

https://www.trendmicro.com/en_us/research/26/d/vercel-breach-oauth-supply-chain.html
247•queenelvis•7h ago•94 comments

CrabTrap: An LLM-as-a-judge HTTP proxy to secure agents in production

https://www.brex.com/crabtrap
59•pedrofranceschi•9h ago•12 comments

Stephen's Sausage Roll remains one of the most influential puzzle games

https://thinkygames.com/features/10-years-of-grilling-stephens-sausage-roll-remains-one-of-the-mo...
122•tobr•3d ago•56 comments

Britannica11.org – a structured edition of the 1911 Encyclopædia Britannica

https://britannica11.org/
203•ahaspel•7h ago•85 comments

Framework Laptop 13 Pro

https://frame.work/laptop13pro
863•Trollmann•6h ago•477 comments

Laws of Software Engineering

https://lawsofsoftwareengineering.com
802•milanm081•13h ago•407 comments

Cal.diy: open-source community edition of cal.com

https://github.com/calcom/cal.diy
141•petecooper•6h ago•36 comments

Meta to start capturing employee mouse movements, keystrokes for AI training

https://www.reuters.com/sustainability/boards-policy-regulation/meta-start-capturing-employee-mou...
290•dlx•6h ago•258 comments

Fields Medal Video: Maryna Viazovska (2022)

https://www.simonsfoundation.org/2022/07/05/fields-medal-video-maryna-viazovska/
16•ganitam•1d ago•4 comments

Edit store price tags using Flipper Zero

https://github.com/i12bp8/TagTinker
275•trueduke•2d ago•267 comments

Changes to GitHub Copilot individual plans

https://github.blog/news-insights/company-news/changes-to-github-copilot-individual-plans/
292•zorrn•1d ago•78 comments

Zindex – Diagram Infrastructure for Agents

https://zindex.ai/
28•_ben_•4h ago•11 comments

Windows Server 2025 Runs Better on ARM

https://jasoneckert.github.io/myblog/server-2025-arm64/
5•jasoneckert•2d ago•1 comment

Theseus, a Static Windows Emulator

https://neugierig.org/software/blog/2026/04/theseus.html
70•zdw•1d ago•9 comments

My practitioner view of program analysis

https://sawyer.dev/posts/practitioner-program-analysis/
26•evakhoury•1d ago•4 comments

Running a Minecraft Server and More on a 1960s Univac Computer

https://farlow.dev/2026/04/17/running-a-minecraft-server-and-more-on-a-1960s-univac-computer
189•brilee•3d ago•30 comments

Show HN: GoModel – an open-source AI gateway in Go

https://github.com/ENTERPILOT/GOModel/
155•santiago-pl•10h ago•61 comments

In the UK, EVs are cheaper than petrol cars, thanks to Chinese competition

https://electrek.co/2026/04/18/in-the-uk-evs-are-cheaper-than-petrol-cars-thanks-to-chinese-compe...
114•breve•2d ago•90 comments

Trellis AI (YC W24) Is hiring engineers to build self-improving agents

https://www.ycombinator.com/companies/trellis-ai/jobs/SvzJaTH-member-of-technical-staff-product-e...
1•macklinkachorn•7h ago

Show HN: VidStudio, a browser based video editor that doesn't upload your files

https://vidstudio.app/video-editor
236•kolx•12h ago•80 comments

Claude Code to be removed from Pro Tier?

https://bsky.app/profile/edzitron.com/post/3mjzxwfx3qs2a
216•johnduhart•2h ago•139 comments

Show HN: Backlit Keyboard API for Python

https://github.com/itsmeadarsh2008/backlit-kbd
16•itsmeadarsh•2d ago•2 comments

MNT Reform is an open hardware laptop, designed and assembled in Germany

http://mnt.stanleylieber.com/reform/
274•speckx•1d ago•104 comments

The Mystery of Rennes-Le-Château, Part 4: Non-Fiction Meets Fiction

https://www.filfre.net/2026/04/the-mystery-of-rennes-le-chateau-part-4-non-fiction-meets-fiction/
7•ibobev•3d ago•0 comments

A type-safe, realtime collaborative Graph Database in a CRDT

https://codemix.com/graph
145•phpnode•14h ago•43 comments

I built a tiny Unix‑like 'OS' with shell and filesystem for Arduino UNO (2KB RAM)

https://github.com/Arc1011/KernelUNO
62•Arc1011•7h ago•13 comments

Kasane: New drop-in Kakoune front end with GPU rendering and WASM Plugins

https://github.com/Yus314/kasane
47•nsagent•8h ago•5 comments

I don't want your PRs anymore

https://dpc.pw/posts/i-dont-want-your-prs-anymore/
191•speckx•4h ago•115 comments