Show HN: Pseudonymizing sensitive data for LLMs without losing context

https://atticsecurity.com/en/blog/why-llms-hate-fake-data-token-proxy/
4•n00pn00p•5d ago

Comments

_zer0c00l_•5d ago
I have (at least) one fundamental concern about the approach. Let's say I'm building an anti-fraud system that uses AI (through an API), and I ask the AI whether my user totally+fraud@gmail.com is a potential fraudster. By masking this email address I'm sabotaging my own prompt: the AI can no longer reason from the facts that 1) the address is on a free public email provider and 2) it says 'fraud' right in your face.
n00pn00p•5d ago
Valid point. The proxy has an option to always let domain names through. I fear you will always lose some context, though. It should be used sparingly, for cases where you need a frontier model but also have to send sensitive data.
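To make the domain passthrough concrete, here is a minimal sketch, not the proxy's actual code. The `Pseudonymizer` class, the `keep_domain` flag, and the `USER_n` / `EMAIL_n` token format are all hypothetical names chosen for illustration. It masks only the local part of an email address, keeps the domain visible, and remembers the mapping so the model's reply can be unmasked.

```python
import re

# Simple email pattern for illustration only; real-world addresses are messier.
EMAIL_RE = re.compile(r"([A-Za-z0-9._%+-]+)@([A-Za-z0-9.-]+\.[A-Za-z]{2,})")


class Pseudonymizer:
    """Hypothetical token mapper: swaps real values for stable placeholder tokens."""

    def __init__(self, keep_domain=True):
        self.keep_domain = keep_domain
        self.forward = {}  # real value -> token
        self.reverse = {}  # token -> real value

    def _token(self, real, prefix):
        # Reuse the same token for a repeated value so references stay consistent.
        if real not in self.forward:
            token = f"{prefix}_{len(self.forward) + 1}"
            self.forward[real] = token
            self.reverse[token] = real
        return self.forward[real]

    def mask(self, text):
        def repl(match):
            local, domain = match.group(1), match.group(2)
            if self.keep_domain:
                # Only the local part is hidden; the domain stays as signal.
                return f"{self._token(local, 'USER')}@{domain}"
            return self._token(match.group(0), "EMAIL")

        return EMAIL_RE.sub(repl, text)

    def unmask(self, text):
        # Replace longer tokens first so USER_10 is not clobbered by USER_1.
        for token in sorted(self.reverse, key=len, reverse=True):
            text = text.replace(token, self.reverse[token])
        return text


proxy = Pseudonymizer(keep_domain=True)
prompt = "Is totally+fraud@gmail.com a potential fraudster?"
masked = proxy.mask(prompt)
restored = proxy.unmask(masked)
```

With `keep_domain=True` the model still sees that the address is on gmail.com, but the 'fraud' hint in the local part is gone, which is exactly the kind of context loss raised above.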
stuaxo•5d ago
You can do those as a separate prompt.
dwa3592•5d ago
Ooh, nice. I built something very similar last year.

- https://github.com/deepanwadhwa/semi_private_chat

- https://github.com/deepanwadhwa/zink

n00pn00p•4d ago
Oh sweet, that would have saved a lot of time!
bennettdixon•4d ago
Nice write-up. One thing that stood out is the V2 to V3 jump. One of my clients is integrating personal wellness and AI, and we took a slightly different route. The health data and the personal data live in separate databases with an encrypted mapping layer between them. This way the model only sees health context attached to a unique pseudonymous user-level session. Your problem almost seems harder, because the PII is the signal/context. One challenge we are facing is re-identification, e.g. rich health profiles being identifiable in themselves.

Curious if you have thought about that side of things with your V3 implementation?
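A rough sketch of that separation, under loud assumptions: this is not the client's implementation. A keyed HMAC stands in for the encrypted mapping layer, plain dicts stand in for the two databases, and every name (`pseudonym_for`, `identity_store`, `health_store`, `record_health`, `model_context`) is hypothetical.

```python
import hashlib
import hmac


def pseudonym_for(user_id: str, key: bytes) -> str:
    """Derive a stable pseudonym from a user id with a secret key (HMAC-SHA256)."""
    digest = hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]  # truncated for readability in this sketch


# Two separate stores: identity data is never keyed by the pseudonym,
# and health data is never keyed by the real user id.
identity_store = {"user-42": {"name": "Alice Example", "email": "alice@example.com"}}
health_store = {}


def record_health(user_id: str, key: bytes, entry: dict) -> None:
    """Append a health entry under the user's pseudonym."""
    health_store.setdefault(pseudonym_for(user_id, key), []).append(entry)


def model_context(user_id: str, key: bytes) -> dict:
    """Build what the model would see: a pseudonymous session plus health data."""
    pid = pseudonym_for(user_id, key)
    return {"session": pid, "health": list(health_store.get(pid, []))}


secret = b"demo-key-not-for-production"
record_health("user-42", secret, {"sleep_hours": 6})
context = model_context("user-42", secret)
```

Nothing here addresses the re-identification problem: a sufficiently rich `health` list can still single out a person even though no name or email reaches the model.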

n00pn00p•4d ago
That's a great point. Because my tool is designed for security operations and triage, the context (like knowing an IP is from Hetzner, or a domain is a known burner) is actually the signal the LLM needs to do its job. I made a conscious trade-off to allow some contextual metadata to pass through to preserve utility.

Since I'm based in the Netherlands, I look at this strictly through the lens of Dutch privacy law (the AVG, the Dutch name for the GDPR). Under the AVG, there's a hard line between anonymized data and pseudonymized data. Because of exactly the 'mosaic effect' you describe, pseudonymized data is legally still treated as personal data. So the re-identification risk is an accepted reality.

Essentially, I treat the tool as an extra effort to reduce PII leaks. But it's not foolproof against context clues.

glitchnsec•3d ago
This is really cool - I'm still in V2 with NER for redacting PII before sending to model BUT that was just on simple email analysis. I bet most teams building for security with AI haven't addressed this! Thanks for sharing!
