frontpage.

Show HN: Minimalist editor that lives in browser, stores everything in the URL

https://github.com/antonmedv/textarea
164•medv•2h ago•61 comments

Show HN: Vibium – Browser automation for AI and humans, by Selenium's creator

https://github.com/VibiumDev/vibium
160•hugs•4h ago•65 comments

Show HN: Just Fucking Use Cloudflare – A satirical guide to the CF stack

https://justfuckingusecloudflare.com
2•MyNameIsTito•13m ago•0 comments

Show HN: A local-first, reversible PII scrubber for AI workflows

https://medium.com/@tj.ruesch/a-local-first-reversible-pii-scrubber-for-ai-workflows-using-onnx-a...
4•tjruesch•5h ago•0 comments

Show HN: WebPtoPNG – I built a WebP to PNG tool, everything runs in the browser

https://webptopng.cc/
3•akseli_ukkonen•1h ago•2 comments

Show HN: Native iOS version of The Brutalist Report to clean up reading news

https://apps.apple.com/us/app/brutalist-report/id6756546583
6•cylo•1h ago•0 comments

Show HN: Lamp Carousel – DIY Kinetic Sculpture Powered by Lamp Heat

https://evan.widloski.com/posts/spinners/
2•Evidlo•1h ago•0 comments

Show HN: LazyPromise = Observable – Signals

https://github.com/lazy-promise/lazy-promise
27•ivan7237d•5d ago•5 comments

Show HN: Turn raw HTML into production-ready images for free

https://html2png.dev
132•alvinunreal•20h ago•76 comments

Show HN: LoongArch Userspace Emulator

https://github.com/libriscv/libloong
5•fwsgonzo•9h ago•0 comments

Show HN: CineCLI – Browse and torrent movies directly from your terminal

https://github.com/eyeblech/cinecli
325•samsep10l•1d ago•111 comments

Show HN: Elfpeek – A tiny interactive ELF binary inspector in C

https://github.com/Oblivionsage/elfpeek
4•oblivionsage•4h ago•3 comments

Show HN: Kapso – WhatsApp for developers

https://kapso.ai/
31•aamatte•1d ago•18 comments

Show HN: Regex Man - short 3D regex game (desktop web)

https://bcjordan.com/regexman/
2•bcjordan•6h ago•0 comments

Show HN: An open-source anonymizer tool to replace PII in PostgreSQL databases

https://github.com/pgEdge/pgedge-anonymizer
3•pgedge_postgres•6h ago•0 comments

Show HN: Master Economics Through Interactive Simulations

https://julienreszka.github.io/economic-simulator/
2•julienreszka•6h ago•0 comments

Show HN: I built an open-source Linux-capable single-board computer with DDR3

https://github.com/cheyao/icepi-sbc
5•Cyao•7h ago•4 comments

Show HN: CodinIT, local open-source Lovable alternative (Electron desktop app)

https://github.com/codinit-dev/codinit-dev
18•Gerome24•4d ago•2 comments

Show HN: Jmail – Google Suite for Epstein files

https://www.jmail.world
1535•lukeigel•4d ago•352 comments

Show HN: Yapi – FOSS terminal API client for power users

https://yapi.run/blog/what-is-yapi
48•jamiepond•2d ago•17 comments

Show HN: The Language Inside C++ [video]

2•lihaciudaniel2•10h ago•0 comments

Show HN: Books mentioned on Hacker News in 2025

https://hackernews-readings-613604506318.us-west1.run.app
605•seinvak•3d ago•212 comments

Show HN: HN Wrapped 2025 - an LLM reviews your year on HN

https://hn-wrapped.kadoa.com?year=2025
308•hubraumhugo•4d ago•153 comments

Show HN: Semantic Coverage – A tool to visualize RAG blind spots using UMAP

https://github.com/aashirpersonal/semantic-coverage
3•aashirpersonal•11h ago•1 comments

Show HN: I built a tool that creates videos out of React code

https://github.com/outscal/video-generator
2•mayankkgrover•11h ago•0 comments

Show HN: Netrinos – A keep it simple Mesh VPN for small teams

https://netrinos.com
92•pcarroll•5d ago•65 comments

Show HN: Epstein Files and images (4000 .png files)

https://epstein-files-browser.vercel.app
7•Gerome24•5h ago•0 comments

Show HN: Rust/WASM lighting data toolkit – parses legacy formats, generates SVGs

https://eulumdat.icu
51•holg•3d ago•5 comments

Show HN: RenderCV – Open-source CV/resume generator, YAML to PDF

https://github.com/rendercv/rendercv
97•sinaatalay•3d ago•41 comments

Show HN: Cosmofy – bundle your Python code for Linux/Windows/MacOS

https://github.com/metaist/cosmofy
9•metaist•16h ago•1 comments

Show HN: A local-first, reversible PII scrubber for AI workflows

https://medium.com/@tj.ruesch/a-local-first-reversible-pii-scrubber-for-ai-workflows-using-onnx-and-regex-e9850a7531fc
4•tjruesch•5h ago
Hi HN,

I’m one of the maintainers of Bridge Anonymization. We built this because the existing solutions for translating sensitive user content are insufficient for many of our privacy-conscious clients (governments, banks, healthcare providers, etc.).

We couldn't send PII to third-party APIs, but standard redaction destroyed the translation quality. If you scrub "John" to "[PERSON]", the translation engine loses gender context (often defaulting to masculine), which breaks grammatical agreement in languages like French or German.

So we built a reversible, local-first pipeline for Node.js/Bun. Here is how we implemented the tricky parts:

0. The Mapping

We use XML-like tags with IDs that uniquely identify each piece of PII, e.g. `<PII type="PERSON" id="1">`. Translation models and the systems around them have worked with XML data structures since the dawn of computer-aided translation tools, so this improves compatibility with existing workflows and systems. A `PIIMap` is stored locally for rehydration after translation (AES-256-GCM-encrypted by default).
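To make the mapping step concrete, here is a minimal sketch of tagging and rehydration. It assumes detection has already produced character offsets; the `Detection` type, the `anonymize` and `rehydrate` function names, and the shape of `PIIMap` are illustrative guesses rather than the library's actual API, and the encryption of the map is left out.

```typescript
// Hypothetical types: the post names a `PIIMap`, but its real shape is not shown.
type Detection = { start: number; end: number; type: string };
type PIIMap = Map<number, { type: string; original: string }>;

// Replace each detected span with a tag and remember the original value.
function anonymize(
  text: string,
  detections: Detection[]
): { masked: string; map: PIIMap } {
  const map: PIIMap = new Map();
  const sorted = [...detections].sort((a, b) => a.start - b.start);
  let masked = "";
  let cursor = 0;
  let nextId = 1;
  for (const d of sorted) {
    if (d.start < cursor) continue; // skip detections that overlap an earlier one
    const id = nextId++;
    map.set(id, { type: d.type, original: text.slice(d.start, d.end) });
    masked += text.slice(cursor, d.start) + `<PII type="${d.type}" id="${id}"/>`;
    cursor = d.end;
  }
  masked += text.slice(cursor);
  return { masked, map };
}

// Put the originals back after translation (exact tag match only in this sketch).
function rehydrate(text: string, map: PIIMap): string {
  return text.replace(
    /<PII type="[A-Z_]+" id="(\d+)"\/>/g,
    (tag: string, id: string) => {
      const entry = map.get(Number(id));
      return entry ? entry.original : tag;
    }
  );
}

const sourceText = "John lives in Berlin.";
const { masked: maskedText, map: piiMap } = anonymize(sourceText, [
  { start: 0, end: 4, type: "PERSON" },
  { start: 14, end: 20, type: "LOCATION" },
]);
const restoredText = rehydrate(maskedText, piiMap);
```

Only `maskedText` would leave the machine; `piiMap` stays local, which is what makes the scheme reversible.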

1. Hybrid Detection Engine

Obviously neither Regex nor NER was enough on its own.

- Structured PII: We use strict Regex with validation checksums for things like IBANs (Mod-97) and Credit Cards (Luhn).
- Soft PII: For names and locations, we run a quantized `xlm-roberta` model via `onnxruntime-node` directly in the process. This lets us avoid a Python sidecar while keeping the package ‘lightweight’ (still ~280MB for the quantized model, but acceptable for desktop environments).
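The two checksums mentioned for structured PII can be sketched as follows. These are the standard Luhn and ISO 13616 Mod-97 algorithms; the function names and the length bounds in the pre-filter regexes are my own assumptions, not taken from the project.

```typescript
// Luhn check for card numbers: double every second digit counting from the right.
function luhnValid(candidate: string): boolean {
  const digits = candidate.replace(/[\s-]/g, "");
  if (!/^\d{12,19}$/.test(digits)) return false; // assumed length bounds
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    let d = Number(digits.charAt(digits.length - 1 - i));
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return sum % 10 === 0;
}

// Mod-97 check for IBANs: move the first four characters to the end,
// map letters to numbers (A=10 ... Z=35), and require a remainder of 1.
function ibanValid(candidate: string): boolean {
  const iban = candidate.replace(/\s/g, "").toUpperCase();
  if (!/^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$/.test(iban)) return false;
  const rearranged = iban.slice(4) + iban.slice(0, 4);
  let remainder = 0;
  for (let i = 0; i < rearranged.length; i++) {
    const ch = rearranged.charAt(i);
    const value = ch >= "A" ? ch.charCodeAt(0) - 55 : Number(ch);
    // Letters contribute two decimal digits, digits contribute one.
    remainder = (remainder * (value > 9 ? 100 : 10) + value) % 97;
  }
  return remainder === 1;
}

const cardOk = luhnValid("4111 1111 1111 1111");
const ibanOk = ibanValid("GB82 WEST 1234 5698 7654 32");
```

Running the checksum after the regex match is what keeps false positives down: a random 16-digit order number will usually fail Luhn and so is not scrubbed as a card.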

2. The "Hallucination" Guard (Fuzzy Rehydration)

LLMs often "mangle" the XML placeholders during translation (e.g., turning `<PII id="1"/>` into `< PII id = « 1 » >`). We implemented a Fuzzy Tag Matcher that uses flexible regex patterns to detect these artefacts. It identifies the tag even if attributes are reordered or quotes are changed, ensuring we can always map the token back to the original encrypted value.
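As a rough illustration of fuzzy rehydration, the sketch below tolerates extra whitespace, changed quote characters, case changes, and reordered attributes. The two regexes and the `rehydrateFuzzy` name are assumptions for this example; the project's actual patterns are not shown in the post and are likely more thorough.

```typescript
// Match anything that still looks like a PII tag, however it was spaced or cased.
const MANGLED_TAG = /<\s*PII\b([^<>]*)>/gi;
// Pull the numeric id out of the attribute soup, accepting several quote styles.
const ID_ATTRIBUTE = /\bid\s*=\s*["'“”‘’«»„]?\s*(\d+)/i;

function rehydrateFuzzy(text: string, originals: Map<number, string>): string {
  return text.replace(MANGLED_TAG, (tag: string, attrs: string) => {
    const match = ID_ATTRIBUTE.exec(attrs);
    if (!match) return tag; // no recoverable id: leave the text untouched
    const original = originals.get(Number(match[1]));
    return original === undefined ? tag : original;
  });
}

const originals = new Map<number, string>([
  [1, "Marie"],
  [2, "Lyon"],
]);
const translated =
  "Bonjour < PII id = « 1 » >, bienvenue à <pii id='2' type='LOCATION' />.";
const repaired = rehydrateFuzzy(translated, originals);
```

Tags whose id cannot be recovered, or whose id is not in the map, are left in place so a caller can detect and report them instead of silently dropping data.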

3. Semantic Masking

We are currently working on "Semantic Masking": adding context to the PII tag (like `<PII type="PERSON" gender="female" id="1" />`) to preserve gender context for the translation. For now, we are relying on a lightweight lookup-table approach to avoid the overhead of a second ML model or the hassle of fine-tuning. So far this works nicely for most use cases.
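One way a lookup-table approach could work is sketched below. The table contents, the `personTag` helper, and the choice to key on the first name are all hypothetical; the post does not say what the real table contains or how it is keyed.

```typescript
type Gender = "female" | "male";

// Tiny illustrative table; a real one would be far larger and locale-aware.
const GENDER_BY_FIRST_NAME = new Map<string, Gender>([
  ["marie", "female"],
  ["anna", "female"],
  ["john", "male"],
  ["pierre", "male"],
]);

// Build a PERSON tag, adding a gender attribute only when the lookup succeeds.
function personTag(fullName: string, id: number): string {
  const firstName = fullName.trim().toLowerCase().split(/\s+/)[0] || "";
  const gender = GENDER_BY_FIRST_NAME.get(firstName);
  const genderAttr = gender ? ` gender="${gender}"` : "";
  return `<PII type="PERSON"${genderAttr} id="${id}" />`;
}

const knownTag = personTag("Marie Curie", 1);
const unknownTag = personTag("Xyzzy Quux", 2);
```

Omitting the attribute for unknown names, rather than guessing, keeps the tag no worse than the plain `[PERSON]`-style placeholder in the cases the table does not cover.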

The code is MIT licensed. I’d love to hear how others are handling the "context loss" problem in privacy-preserving NLP pipelines! I think this could quite easily be generalized to other LLM applications as well.