not adding LLM layers to stuff to make them inherently less secure.
This will be a neat concept for the types of tools that come after the present iteration of LLMs.
Unless I’m sorely mistaken.
EDIT: it does seem to have a deterministic layer too and I think that's great
EDIT: actually it looks like it has two policy engines embedded
If people said "we built an ML-based classifier into our proxy to block dangerous requests", would it be better? Why does the fact that the classifier is an LLM make it somehow worse?
yakkomajuri•1h ago
I think you're spot on that so far it's been all or nothing. You either give an agent a lot of access, making it really powerful but proportionally dangerous, or you lock it down so much that it's no longer useful.
I like a lot of the ideas you show here, but I also worry that LLM-as-a-judge is a fundamentally probabilistic guardrail and therefore inherently limited. How do you see this? It feels dangerous to rely on a security system that's based not on hard limitations but on probabilities.