Shall I implement it? No

https://gist.github.com/bretonium/291f4388e2de89a43b25c135b44e41f0
323•breton•1h ago•120 comments

Malus – Clean Room as a Service

https://malus.sh
915•microflash•8h ago•359 comments

Bubble Sorted Amen Break

https://parametricavocado.itch.io/amen-sorting
219•eieio•5h ago•75 comments

Reversing memory loss via gut-brain communication

https://med.stanford.edu/news/all-news/2026/03/gut-brain-cognitive-decline.html
172•mustaphah•5h ago•48 comments

ATMs didn't kill bank teller jobs, but the iPhone did

https://davidoks.blog/p/why-the-atm-didnt-kill-bank-teller
271•colinprince•7h ago•324 comments

Innocent woman jailed after being misidentified using AI facial recognition

https://www.grandforksherald.com/news/north-dakota/ai-error-jails-innocent-grandmother-for-months...
198•rectang•1h ago•103 comments

The Met releases high-def 3D scans of 140 famous art objects

https://www.openculture.com/2026/03/the-met-releases-high-definition-3d-scans-of-140-famous-art-o...
178•coloneltcb•6h ago•35 comments

Document poisoning in RAG systems: How attackers corrupt AI's sources

https://aminrj.com/posts/rag-document-poisoning/
22•aminerj•8h ago•8 comments

Forcing Flash Attention onto a TPU and Learning the Hard Way

https://archerzhang.me/forcing-flash-attention-onto-a-tpu
21•azhng•4d ago•2 comments

Bringing Chrome to ARM64 Linux Devices

https://blog.chromium.org/2026/03/bringing-chrome-to-arm64-linux-devices.html
28•ingve•2h ago•30 comments

Show HN: OneCLI – Vault for AI Agents in Rust

https://github.com/onecli/onecli
105•guyb3•5h ago•37 comments

Runners who churn butter on their runs

https://www.runnersworld.com/news/a70683169/how-to-make-butter-while-running/
53•randycupertino•3h ago•24 comments

Launch HN: IonRouter (YC W26) – High-throughput, low-cost inference

https://ionrouter.io
29•vshah1016•3h ago•13 comments

WolfIP: Lightweight TCP/IP stack with no dynamic memory allocations

https://github.com/wolfssl/wolfip
73•789c789c789c•6h ago•7 comments

An old photo of a large BBS (2022)

https://rachelbythebay.com/w/2022/01/26/swcbbs/
131•xbryanx•2h ago•93 comments

Dolphin Progress Release 2603

https://dolphin-emu.org/blog/2026/03/12/dolphin-progress-report-release-2603/
281•BitPirate•13h ago•47 comments

Converge (YC S23) Is Hiring a Founding Platform Engineer (NYC, Onsite)

https://www.runconverge.com/careers/founding-platform-engineer
1•thomashlvt•5h ago

Big data on the cheapest MacBook

https://duckdb.org/2026/03/11/big-data-on-the-cheapest-macbook
275•bcye•10h ago•241 comments

US private credit defaults hit record 9.2% in 2025, Fitch says

https://www.marketscreener.com/news/us-private-credit-defaults-hit-record-9-2-in-2025-fitch-says-...
177•JumpCrisscross•9h ago•304 comments

Show HN: Understudy – Teach a desktop agent by demonstrating a task once

https://github.com/understudy-ai/understudy
67•bayes-song•5h ago•19 comments

Show HN: Axe – A 12MB binary that replaces your AI framework

https://github.com/jrswab/axe
122•jrswab•8h ago•85 comments

Show HN: Detect any object in satellite imagery using a text prompt

https://www.useful-ai-tools.com/tools/satellite-analysis-demo/
6•eyasu6464•4d ago•1 comment

Show HN: OpenClaw-class agents on ESP32 (and the IDE that makes it possible)

https://pycoclaw.com/
4•pycoclaw•50m ago•1 comment

Are LLM merge rates not getting better?

https://entropicthoughts.com/no-swe-bench-improvement
88•4diii•10h ago•95 comments

The Cost of Indirection in Rust

https://blog.sebastiansastre.co/posts/cost-of-indirection-in-rust/
74•sebastianconcpt•3d ago•31 comments

The Road Not Taken: A World Where IPv4 Evolved

https://owl.billpg.com/ipv4x/
40•billpg•6h ago•68 comments

NASA's DART spacecraft changed an asteroid's orbit around the sun

https://www.sciencenews.org/article/spacecraft-changed-asteroid-orbit-nasa
91•pseudolus•3d ago•58 comments

Full Spectrum and Infrared Photography

https://timstr.website/blog/fullspectrumphotography.html
40•alter_igel•4d ago•22 comments

Long Overlooked as Crucial to Life, Fungi Start to Get Their Due

https://e360.yale.edu/features/fungi-kingdom
70•speckx•9h ago•18 comments

DDR4 SDRAM – Initialization, Training and Calibration

https://www.systemverilog.io/design/ddr4-initialization-and-calibration/
48•todsacerdoti•2d ago•13 comments

Document poisoning in RAG systems: How attackers corrupt AI's sources

https://aminrj.com/posts/rag-document-poisoning/
22•aminerj•8h ago
I'm the author. Repo is here: https://github.com/aminrj-labs/mcp-attack-labs/tree/main/lab...

The lab runs entirely on LM Studio + Qwen2.5-7B-Instruct (Q4_K_M) + ChromaDB — no cloud APIs, no GPU required, no API keys.

From zero to seeing the poisoning succeed: git clone, make setup, make attack1. About 10 minutes.

Two things worth flagging upfront:

- The 95% success rate is against a 5-document corpus (best case for the attacker). In a mature collection you need proportionally more poisoned docs to dominate retrieval — but the mechanism is the same.

- Embedding anomaly detection at ingestion was the biggest surprise: 95% → 20% as a standalone control, outperforming all three generation-phase defenses combined. It runs on embeddings your pipeline already produces — no additional model.

All five layers combined: 10% residual.
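To make the ingestion-time embedding anomaly check concrete: here is a minimal sketch of one way such a control could work, flagging chunks whose embedding sits unusually far from the corpus centroid. The function names, the centroid heuristic, and the threshold are illustrative assumptions, not the lab's actual code.

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def flag_anomalous_chunks(embeddings, centroid, threshold=0.55):
    """Return indices of chunks whose similarity to the corpus
    centroid falls below the threshold -- candidates for quarantine
    before they ever enter the retrieval index."""
    return [i for i, e in enumerate(embeddings)
            if cosine(e, centroid) < threshold]
```

The appeal of this class of control is exactly what the comment notes: the embeddings already exist in the pipeline, so the check adds no extra model calls.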

Happy to discuss methodology, the PoisonedRAG comparison, or anything that looks off.

Comments

sidrag22•56m ago
> Low barrier to entry. This attack requires write access to the knowledge base,

This is the entire premise that bothers me here. It requires a bad actor with critical access, and it also requires that the final RAG output doesn't provide a reference to the retrieved result. At that point it just seems like a flawed product.

sandermvanvliet•37m ago
If you think about this in the context of systems that ingest content from third-party sources, the attack becomes much more feasible.

But then, if you're inside the network, you've already overcome many of the boundaries.

SlinkyOnStairs•33m ago
> it requires a bad actor with critical access

This isn't particularly hard. Lots and lots of these tools ingest from the public internet. There are already plenty of documented examples of Google's AI summary being exploited in a structurally similar way.

As for internal systems, getting write access to documents isn't hard either. Compromising a few workers is easy, especially as many of them will be using who knows what AI systems to write these documents.

> it also requires that the final rag output doesn't provide a reference to the referenced result.

RAG systems providing a reference is nearly moot. If the references have to be checked, because the "generation" cannot be trusted to be accurate and not hallucinate a bunch of bullshit, then you need to check every single time, and the generation part becomes pointless. Might as well just include a verbatim snippet.

zenoprax•12m ago
"bad actor" can now be "ignorant employee running AI agents on their laptop".

Threats from incompetence or ignorance will be multiplied by 'X' over 'Y' years as AI proliferates. Unsupervised AI agents and context poisoning will spiral things out of control in any environment.

I'm interested in the effect of this with respect to AI-generated/assisted documentation and the recycling of that alongside the source-code back into the models.

malfist•7m ago
Almost like defense in depth is key to good security. GP is ignoring that a truffle defense is only good until the first person is tricked.

robutsume•27m ago
The "requires write access" framing undersells the risk. Most production RAG pipelines don't ingest from a single curated database — they crawl Confluence, shared drives, Slack exports, support tickets. In a typical enterprise, hundreds of people have write access to those sources without anyone thinking of it as "write access to the knowledge base."

The PoisonedRAG paper showing 90% success at millions-of-documents scale is the scary part. The vocabulary engineering approach here is basically the embedding equivalent of SEO — you're just optimizing for cosine similarity instead of PageRank. And unlike SEO, there's no ecosystem of detection tools yet.

I'd love to see someone test whether document-level provenance tracking (signing chunks with source metadata and surfacing that to the user) actually helps in practice, or if people just ignore it like they ignore certificate warnings.
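The document-level provenance idea raised above could be prototyped cheaply. A rough sketch, assuming an HMAC over a canonical serialization of each chunk plus its source metadata (the key handling and field names here are hypothetical, not any existing system's API):

```python
import hashlib
import hmac
import json

# Hypothetical signing key; in practice this would live in a KMS,
# held only by the trusted ingestion pipeline.
SIGNING_KEY = b"ingestion-pipeline-secret"

def _payload(text: str, source: str) -> bytes:
    # sort_keys gives a canonical serialization so signing and
    # verification see byte-identical input.
    return json.dumps({"text": text, "source": source}, sort_keys=True).encode()

def sign_chunk(text: str, source: str) -> dict:
    """Attach tamper-evident provenance to a chunk at ingestion time."""
    sig = hmac.new(SIGNING_KEY, _payload(text, source), hashlib.sha256).hexdigest()
    return {"text": text, "source": source, "sig": sig}

def verify_chunk(chunk: dict) -> bool:
    """Check provenance before a retrieved chunk reaches the prompt."""
    expected = hmac.new(SIGNING_KEY, _payload(chunk["text"], chunk["source"]),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, chunk["sig"])
```

This only proves a chunk came through the trusted pipeline unmodified; whether users actually look at the surfaced source metadata is the open question the comment poses.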

alan_sass•6m ago
I've seen these data poisoning attacks from multiple perspectives lately, mostly from SEC data ingestion and public records across state/federal databases.

I believe it is possible to reduce data poisoning from these sources by applying a layered approach like the OP's, but it needs many more dimensions, with scoring to model true adversaries and autonomous loops: quarantine -> processing -> ingesting -> verification -> research -> back to verification or quarantine -> then start again, for all data added after the initial population.

Also, for: "1. Map every write path into your knowledge base. You can probably name the human editors. Can you name all the automated pipelines — Confluence sync, Slack archiving, SharePoint connectors, documentation build scripts? Each is a potential injection path. If you can’t enumerate them, you can’t audit them."

I recommend scoring each source, with different levels of escalation for all processes depending on whether content comes from official vs. user-facing sources. That addresses issues starting from the core, rather than granting more access to untrusted sources.
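The per-source scoring with escalation described above could be sketched as a simple routing function. The trust values, threshold cutoffs, and stage names here are illustrative assumptions, not a reference design:

```python
# Hypothetical per-source trust scores: official sources get lighter
# review; user-facing or unknown write paths escalate faster.
SOURCE_TRUST = {
    "official_docs": 0.9,
    "confluence": 0.6,
    "slack_export": 0.3,
    "public_web": 0.1,
}

def route_document(source: str, anomaly_score: float) -> str:
    """Pick the pipeline stage for a new document.

    anomaly_score is in [0, 1]; higher means more suspicious.
    Unknown sources get zero trust, so they escalate quickly."""
    trust = SOURCE_TRUST.get(source, 0.0)
    risk = anomaly_score * (1.0 - trust)
    if risk < 0.1:
        return "ingest"
    if risk < 0.4:
        return "review"      # verification pass before ingestion
    return "quarantine"      # held until re-verified
```

The same anomaly score thus triggers different handling per source, which is the "escalation from official vs. user-facing" point in the comment.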

alan_sass•2m ago
I think an interesting thing to pay attention to soon is how there are networks of engagement farming cluster accounts on X that repost/like/manipulate interactions on their networks of accounts, and X at large to generate xyz.

There have been more advanced instances that I've noticed where they have one account generating response frameworks of text from a whitepaper, or other source/post, to re-distribute the content on their account as "original content"...

But then that post gets quoted from another account, with another LLM-generated text response to further amplify the previous text/post + new LLM text/post.

I believe that's where the world gets scary when very specific narrative frameworks can be applied to any post, that then gets amplified across socials.