frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Why I'm moving away from Regex for LLM Agent security

2•aunicall•4h ago
I’ve been auditing how open-source execution engines handle prompt injection. Most of them (like OpenClaw) rely on a 3-layer static defense: regex blacklists, XML tagging, and character sanitization.

The problem is that regex is a cat-and-mouse game. It misses "disregard prior directives" while looking for "ignore instructions." It fails entirely on multi-language exploits. Once an Agent has tool access (shell, DB), a single missed semantic variation becomes an RCE.

So I built Prompt Inspector. It is a semantic detection engine designed to move beyond blacklists.

The core deal:

Vector-based detection: Instead of keywords, we use embeddings to map prompts. It catches the intent of an injection, even if the phrasing is unique or translated.

Self-evolving loop: Borderline cases trigger an async LLM review. If it is a new attack pattern, the system automatically extracts the embedding and updates the vector database. It learns from new exploits.

Decoupled by design: It returns a confidence score rather than a hard block. The developer keeps full control over the execution routing.

Pluggable: Started with Google’s latest embeddings, but the architecture allows for custom-deployed models to avoid vendor lock-in.

Tech-stack: FastAPI, Vector Database, Google Embedding models, and an LLM-in-the-loop reviewer.

I’m currently offering free credits for early testers and open-source projects. I’d love to hear how you guys are handling tool-calling security beyond basic prompt engineering.

Live at: https://promptinspector.io

Tell HN: iPhone 6s still getting security updates

4•uticus•3h ago•2 comments

Why I'm moving away from Regex for LLM Agent security

2•aunicall•4h ago•0 comments

Ask HN: Have you successfully treated forward head posture ("nerd neck")?

41•trashymctrash•16h ago•30 comments

Ask HN: What was it like for programmers when spreadsheets became ubiquitous?

7•yodaiken•8h ago•7 comments

I built a platform to help developers find collaborators for new projects

3•deiv2002•11h ago•0 comments

How not to fork an open source project

5•jsattler•12h ago•0 comments

Toolpack SDK, an Open Source TypeScript SDK for Building AI-Powered Applications

2•sajeerzeji•9h ago•1 comments

Prompt to make Claude more autonomous in web dev

4•louison11•10h ago•1 comments

Claude broke a ZIP password in a smart way

7•jgrahamc•10h ago•2 comments

Ask HN: How do you use Coding Agents/CLIs out of coding?

4•arbayi•15h ago•5 comments

I traced $2B in nonprofit grants for Meta and Age Verification lobbying

89•theseusares•1d ago•20 comments

Ask HN: Why can't we just make more RAM?

23•chatmasta•1d ago•21 comments

Tell HN: Apple development certificate server seems down?

109•strongpigeon•4d ago•39 comments

MiniMax M2.5 is trained by Claude Opus 4.6?

10•Orellius•1d ago•10 comments

Ask HN: Got cancer, a new job,new boss in less than a year What do I do now?

19•Goleniewski•1d ago•17 comments

Ask HN: Would this eliminate bots for good?

2•piratesAndSons•14h ago•11 comments

Ask HN: 100k/year individual token usage?

7•alecsmart1•23h ago•3 comments

Ask HN: What's your biggest pain point when joining a new developer team?

8•KevStatic•1d ago•15 comments

Ask HN: Why have co-ops never played a major role in tech?

13•AbstractH24•1d ago•7 comments

Generate tests from GitHub pull requests

7•Aamir21•1d ago•3 comments

X is selling existing users' handles

197•hac•3d ago•91 comments

Ask HN: Is there prior art for this rich text data model?

5•chrisecker•1d ago•2 comments

Ask HN: Is Claude down again?

86•coderbants•3d ago•73 comments

Ask HN: Has anyone built an AI agent that spends real money?

3•xodn348•1d ago•4 comments

AI, Human Cognition and Knowledge Collapse – Daren Acemoglu

3•aanet•1d ago•3 comments

Ask HN: Looking for a job after layoff and burnout. What should I focus on

6•jacAtSea•16h ago•10 comments

Looking for Partner to Build Agent Memory (Zig/Erlang)

6•kendallgclark•2d ago•8 comments

Enabling Media Router by default undermines Brave's privacy claims

5•noguff•2d ago•2 comments

Instagram Ending Encrypted DMs

6•01-_-•1d ago•1 comments

Claude 4.6 Opus can recite Linux's list.h

25•itzmetanjim•2d ago•4 comments