Top Secret: Automatically filter sensitive information

https://thoughtbot.com/blog/top-secret
75•thunderbong•1d ago

Comments

fine_tune•5h ago
I'm no Ruby expert, so forgive my ignorance, but it looks like a small "NER model" packaged as a string convenience wrapper named `filter` that tries to filter "sensitive info" out of input strings.

I assume the NER model is small enough to run on CPU in under roughly 1s per pass, at the cost of storage per instance. (1s is fast enough in dev, but in prod with long convos that's a lot of inference time.) Generally a neat idea, though.

A couple of questions:

- NER often doesn't perform well across different domains. How accurate is the model?

- How do you actually allocate compute/storage for inference on the NER model?

- Are you batching these `filter` calls, or are they just sequential, one-by-one calls?

neilv•4h ago
When I had to implement "deidentification" for a kind of sensitive safety reporting, an LLM would've been a good way to augment the approaches I used.

Today, if I had to do it, I'd probably throw multiple computer approaches at it, including an LLM-based one, take the union of those as the computer result, and check it against a human result. (If computer and human agree, that's a good sign; if they disagree, see why before the document goes where it needs to be deidentified.)

(In some kinds of flight safety reporting, any kind of personnel can submit a report about any observation related to safety. It gets very seriously handled and analyzed. There are also multiple ways in which the reporting parties are protected. There are situations in which some artifacts need to have identifying information redacted.)
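
To make the "take the union, then compare against a human" idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the two regex detectors stand in for whatever approaches you'd really use (an NER model or LLM-based detector would plug in the same way), and the names `union_spans`, `redact`, and `disagreements` are hypothetical.

```python
import re
from typing import Callable, Iterable, List, Set, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive
Detector = Callable[[str], Set[Span]]

# Stand-in detectors (assumptions); real ones could be NER or LLM based.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")


def email_detector(text: str) -> Set[Span]:
    return {m.span() for m in EMAIL_RE.finditer(text)}


def phone_detector(text: str) -> Set[Span]:
    return {m.span() for m in PHONE_RE.finditer(text)}


def union_spans(text: str, detectors: Iterable[Detector]) -> List[Span]:
    """Union every detector's spans and merge the ones that overlap."""
    spans = sorted(set().union(*(d(text) for d in detectors)))
    merged: List[Span] = []
    for start, end in spans:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def redact(text: str, spans: List[Span], placeholder: str = "[REDACTED]") -> str:
    """Replace each (sorted, non-overlapping) span with a placeholder."""
    out, cursor = [], 0
    for start, end in spans:
        out.append(text[cursor:start])
        out.append(placeholder)
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


def disagreements(computer: List[Span], human: List[Span]) -> List[Span]:
    """Spans flagged by only one side; these are the ones to look at."""
    return sorted(set(computer) ^ set(human))
```

The computer result would be `union_spans(text, [email_detector, phone_detector])`, and a non-empty `disagreements(...)` against the human's spans is the signal to hold the document for review.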

dwa3592•3h ago
Oh hey! Good to see this. I built something similar in Python a while ago.

Check it out: https://github.com/deepanwadhwa/zink

The shield functionality fits directly in your LLM workflow.

sbpayne•2h ago
This is great, but it does not “prevent” leaks; it only reduces the chances of one. NER is not 100% accurate. It is very good in many cases, but use with caution!

wombatpm•1h ago
There is an extension for PostgreSQL, https://postgresql-anonymizer.readthedocs.io, that allows you to mask data by user or group at the schema level, with options to return a full mask, a partial mask, or dummy data.
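
For anyone unfamiliar with the three options mentioned here, a rough Python sketch of what full mask, partial mask, and dummy data mean in practice. This is not the extension's API (there you declare masking rules in the database); all function names, the role names, and the dummy list are made up for illustration.

```python
import hashlib

# Hypothetical pool of replacement values (assumption for illustration).
DUMMY_NAMES = ("Alex Doe", "Sam Roe", "Kim Poe")


def full_mask(value: str, char: str = "*") -> str:
    """Hide every character."""
    return char * len(value)


def partial_mask(value: str, keep_prefix: int = 2, keep_suffix: int = 2,
                 char: str = "*") -> str:
    """Keep a few leading/trailing characters and hide the middle."""
    if len(value) <= keep_prefix + keep_suffix:
        return char * len(value)  # too short to reveal anything safely
    hidden = len(value) - keep_prefix - keep_suffix
    return value[:keep_prefix] + char * hidden + value[len(value) - keep_suffix:]


def dummy_value(value: str, choices=DUMMY_NAMES) -> str:
    """Deterministically swap the real value for a fake one."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return choices[digest[0] % len(choices)]


def mask_for_role(value: str, role: str) -> str:
    """Pick a masking level per role (role names are hypothetical)."""
    if role == "admin":
        return value
    if role == "support":
        return partial_mask(value)
    return full_mask(value)
```

Doing this at the schema level, as the extension does, has the advantage that every query from a masked user is covered, rather than relying on each application code path to remember to call a helper.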

A visual history of Visual C++ (2017)

http://www.malsmith.net/blog/visual-c-visual-history/
16•rayanboulares•1h ago•2 comments

Show HN: Creao – Vibe coding product for founders

https://creao.ai/
12•north_creao•3h ago•0 comments

Show HN: JavaScript-free (X)HTML Includes

https://github.com/Evidlo/xsl-website
103•Evidlo•10h ago•43 comments

The theory and practice of selling the Aga cooker (1935) [pdf]

https://comeadwithus.wordpress.com/wp-content/uploads/2012/08/the-theory-and-practice-of-selling-the-aga-cooker.pdf
11•phpnode•2d ago•4 comments

Nitro: A tiny but flexible init system and process supervisor

https://git.vuxu.org/nitro/about/
162•todsacerdoti•9h ago•57 comments

The first Media over QUIC CDN: Cloudflare

https://moq.dev/blog/first-cdn/
187•kixelated•10h ago•89 comments

Google says it dropped the energy cost of AI queries by 33x in one year

https://arstechnica.com/ai/2025/08/google-says-it-dropped-the-energy-cost-of-ai-queries-by-33x-in-one-year/
35•ksec•1h ago•6 comments

Shader Academy: Learn computer graphics by solving challenges

https://shaderacademy.com/
65•pykello•2d ago•7 comments

I Run a Full Linux Desktop in Docker Just Because I Can

https://www.howtogeek.com/i-run-a-full-linux-desktop-in-docker-just-because-i-can/
58•redbell•3d ago•24 comments

Top Secret: Automatically filter sensitive information

https://thoughtbot.com/blog/top-secret
75•thunderbong•1d ago•5 comments

Japan city drafts ordinance to cap smartphone use at 2 hours per day

https://english.kyodonews.net/articles/-/59582
47•Improvement•2h ago•16 comments

Glyn: Type-safe PubSub and Registry for Gleam actors with distributed clustering

https://github.com/mbuhot/glyn
36•TheWiggles•6h ago•3 comments

FFmpeg 8.0

https://ffmpeg.org/index.html#pr8.0
756•gyan•13h ago•173 comments

Computer fraud laws used to prosecute leaking air crash footage to CNN

https://www.techdirt.com/2025/08/22/investigators-used-terrible-computer-fraud-laws-to-ensure-people-were-punished-for-leaking-air-crash-footage-to-cnn/
139•BallsInIt•4h ago•55 comments

Popular Japanese smartphone games have introduced external payment systems

https://english.kyodonews.net/articles/-/59689
102•anigbrowl•5h ago•48 comments

Developer gets 4 years for activating network "kill switch" to avenge his firing

https://arstechnica.com/tech-policy/2025/08/developer-gets-4-years-for-activating-network-kill-switch-to-avenge-his-firing/
15•Volundr•50m ago•6 comments

Why is this hard?

https://programmersstone.blog/posts/why-is-this-hard/
19•Bogdanp•2d ago•5 comments

My tips for using LLM agents to create software

https://efitz-thoughts.blogspot.com/2025/08/my-experience-creating-software-with_22.html
19•efitz•3h ago•4 comments

Bluesky Goes Dark in Mississippi over Age Verification Law

https://www.wired.com/story/bluesky-goes-dark-in-mississippi-age-verification/
101•BallsInIt•6h ago•39 comments

From M1 MacBook to Arch Linux: A month-long experiment that became permanent

https://www.ssp.sh/blog/macbook-to-arch-linux-omarchy/
46•articsputnik•3d ago•60 comments

Transcribe music in abc with syntax highlighting

https://fugue-state.io/app?project=24024aab-22f1-43cc-abef-c1647cc59597
14•jonzudell•6h ago•5 comments

Launch HN: BlankBio (YC S25) - Making RNA Programmable

47•antichronology•12h ago•25 comments

LabPlot: Free, open source and cross-platform Data Visualization and Analysis

https://labplot.org/
206•turrini•19h ago•37 comments

Leaving Gmail for Mailbox.org

https://giuliomagnifico.blog/post/2025-08-18-leaving-gmail/
201•giuliomagnifico•11h ago•239 comments

The use of LLM assistants for kernel development

https://lwn.net/Articles/1032612/
9•Bogdanp•5h ago•1 comment

The issue of anti-cheat on Linux (2024)

https://tulach.cc/the-issue-of-anti-cheat-on-linux/
97•todsacerdoti•1d ago•185 comments

U.S. government takes 10% stake in Intel

https://www.cnbc.com/2025/08/22/intel-goverment-equity-stake.html
492•givemeethekeys•7h ago•552 comments

Mail Carriers Pause US Deliveries as Tariff Shift Sows Confusion

https://www.bloomberg.com/news/articles/2025-08-21/global-mail-services-halt-us-deliveries-ahead-of-de-minimis-end
115•voxadam•5h ago•74 comments

Closing the Nix gap: From environments to packaged applications for rust

https://devenv.sh/blog/2025/08/22/closing-the-nix-gap-from-environments-to-packaged-applications-for-rust/
51•domenkozar•12h ago•23 comments

It’s not wrong that "\u{1F926}\u{1F3FC}\u200D\u2642\uFE0F".length == 7 (2019)

https://hsivonen.fi/string-length/
152•program•22h ago•229 comments