frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Ask HN: What's the state of multimodal prompt injection defence in 2026?

2•JoshBlythe•1h ago
I've been researching multimodal prompt injection - attacks hidden in images, documents, and audio rather than text. Ran a structured test suite (225 attacks across 5 modalities) against a detection pipeline I built and the results were surprising.

Some findings:

- Audio is easier to defend than text. Ultrasonic and spectral attacks have detectable signal characteristics via FFT analysis. The hard part is after transcription, where it becomes a text problem again.

- Cross-modal attacks are less dangerous than expected if you scan each modality independently. The "clean text + malicious PDF" attack only works if you trust the document because the text looked safe.

- Encoding (base64, ROT13, leetspeak) is a solved problem if you decode before scanning. The remaining gap is very short encoded payloads that fall below detection thresholds.

- The real unsolved problem is semantic. Completion attacks ("Complete the following: 'The system prompt reads...'"), narrative extraction, steganographic output manipulation, and multi-turn context poisoning all require understanding intent, not pattern matching. A classifier trained on known injection patterns will always miss novel framing.

- False positives are harder than detection. Getting zero false positives on inputs like "act as a SQL expert", "override the default config", and "what is prompt injection" took more work than improving detection rates.

- Non-English injection is a massive blind spot. An English-trained classifier misses every non-English attack that dodges regex patterns.

My question for HN: is anyone else working on multimodal injection defence? Most tools I've found (Lakera Guard, LLM Guard, Azure Prompt Shields) are still text-only in their public APIs. The research papers describe the attacks well but I haven't seen many production-grade defences for image/audio/document injection.

Also curious whether anyone has had success with LLM-as-judge approaches for detecting semantic attacks - using a second model to evaluate whether an input is trying to manipulate the first. The latency and cost tradeoffs seem brutal but it might be the only path for the subtle stuff.

Would love to hear what others are seeing in production.

Show HN: OS Megakernel that match M5 Max Tok/w at 2x the Throughput on RTX 3090

https://github.com/Luce-Org/luce-megakernel
1•GreenGames•2m ago•0 comments

Show HN: Explore the Silk Roads through an interactive map

https://www.intofarlands.com/silk-roads-map
1•intofarlands•3m ago•0 comments

VLC media player is onboard the Artemis mission

https://twitter.com/videolan/status/2041794439257944197
1•emptybits•3m ago•0 comments

Northeastern presentation to junior engineers in the age of AI

https://blog.marcua.net/2026/04/08/ai-agent-revolution-junior-software-engineers.html
1•speckx•4m ago•0 comments

Show HN: The Crab Games, a platform where agents compete in silly challenges

https://thecrabgames.com/
1•motrazilla•6m ago•0 comments

If Thomas Jefferson were alive today

https://shryn.ai/jefferson-public
1•erikraschke•6m ago•2 comments

Hugging Face moves safetensors to the PyTorch Foundation

https://huggingface.co/blog/safetensors-joins-pytorch-foundation
1•lysandre•6m ago•1 comments

Chilcy – Free AI tool for CSV insights

https://www.chilcy.com/
1•sajithfx•6m ago•0 comments

Free domain SEC scanner – DMARC, MTA-STS, subdomain takeover, credential leaks

https://www.mydomainrisk.com/
1•hughcox•7m ago•1 comments

Ambiguity Aversion: Why Unknown Probabilities Create Mispricing

https://philippdubach.com/posts/ambiguity-by-design/
1•7777777phil•7m ago•0 comments

Mnemo: Shareable typed agentic memory system with Bayesian belief updating

https://github.com/inforge-ai/mnemo-server
1•tompdavis•7m ago•1 comments

Wildlife Conservation Police Are Searching Flock Cameras for ICE

https://www.404media.co/floridas-wildlife-cops-are-searching-thousands-of-flock-cameras-for-ice/
2•lschueller•8m ago•0 comments

Trump is facing the biggest US humiliation since Vietnam

https://inews.co.uk/opinion/trump-biggest-us-humiliation-since-vietnam-4340617
4•doener•8m ago•0 comments

Project Glasswing – Anthropic has crossed a line

https://daveshap.substack.com/p/project-glasswing-anthropic-has-crossed
1•swolpers•8m ago•0 comments

Delivery is not delivery: timing, latency, and what SMS APIs don't show

https://blog.bridgexapi.io/delivery-is-not-delivery-timing-latency-and-what-sms-apis-don-t-show
1•Bridgexapi•9m ago•1 comments

Hacker News

https://news.ycombinator.com/news
1•avycado13•9m ago•0 comments

The Voorhees law of traffic: why the car you passed always returns

https://royalsocietypublishing.org/rsos/article/13/4/260310/481212/The-Voorhees-law-of-traffic-a-...
2•Jimmc414•12m ago•1 comments

Casio ABL-100 vs. Ollee Watch One

https://rz01.org/casio/
3•exitnode•13m ago•1 comments

Anthropic greps for 'Pi', 'OpenClaw' in prompts and blocks them

https://twitter.com/FlorianKluge/status/2041855675295318039
3•colinmarc•14m ago•0 comments

Backpressure in Agent-Driven Development

https://cyberepistemics.com/posts/20260311-backpressure-in-agent-driven-development/
2•will_wright•14m ago•0 comments

The Landscape of Agentic Coding

https://codagent.beehiiv.com/p/the-middle-agentic-path
2•paulcaplan•14m ago•1 comments

Google launched an AI dictation app that works offline

https://techcrunch.com/2026/04/07/google-quietly-releases-an-offline-first-ai-dictation-app-on-ios/
3•dragonsenseiguy•15m ago•0 comments

Milla Jovovich Built MemPalace – The Full Story

https://www.mempalace.tech/story
3•ianrahman•15m ago•0 comments

Reverse-engineering retrieval in decoder-only Transformers

https://github.com/tmaselko/paper-attncap
2•tmaselko•16m ago•1 comments

Codasip announces strategic pivot and divestiture

https://codasip.com/press-release/2026/04/08/codasip-announces-strategic-pivot-and-divestiture/
2•vogr•16m ago•0 comments

Microsoft Abruptly Terminates VeraCrypt Account, Halting Windows Updates

https://www.404media.co/microsoft-abruptly-terminates-veracrypt-account-halting-windows-updates/
4•donohoe•17m ago•0 comments

Cogito: Beautiful AI Markdown Editor for Mac

https://cogito.md
4•0xferruccio•17m ago•0 comments

A rigorous .md specification for AI Daemons

https://ai-daemons.com/spec/
3•mrbbk•18m ago•0 comments

Dario's Weird Race to the Top

https://davidbau.com/archives/2026/04/08/darios_weird_race_to_the_top.html
3•speckx•18m ago•0 comments

Espressif's New ESP32-S31: Dual-Core RISC-V with WiFi 6 and Gbit Ethernet

https://hackaday.com/2026/04/08/espressifs-new-esp32-s31-dual-core-risc-v-with-wifi-6-and-gbit-et...
2•alecco•18m ago•0 comments