
A Developer Accidentally Found CSAM in AI Data. Google Banned Him for It

https://www.404media.co/a-developer-accidentally-found-csam-in-ai-data-google-banned-him-for-it/
32•markatlarge•59m ago

Comments

bsowl•37m ago
More like "A developer accidentally uploaded child porn to his Google Drive account and Google banned him for it".
jkaplowitz•27m ago
The penalties for unknowingly possessing or transmitting child porn are far too harsh, both in this case and in general (far beyond just Google's corporate policies).

Again, to avoid misunderstandings, I said unknowingly - I'm not defending anything about people who knowingly possess or traffic in child porn, other than for the few appropriate purposes like reporting it to the proper authorities when discovered.

deltoidmaximus•33m ago
Back when the first moat-creation gambit for AI failed (the claim that they were building SkyNet, so the government needed to block anyone else from working on SkyNet, since only OpenAI could be trusted to control it, not just any rando), they moved on to the safety angle with the same idea. I recall seeing an infographic showing that all the major players - Meta, OpenAI, Microsoft, etc. - had signed onto some kind of safety pledge. Basically they didn't want anyone else training on the whole world's data because only they could be trusted not to do nefarious things with it. The infographic had a statement about not training on CSAM and revenge porn and the like but the corpospeak it was worded in made it sound like they were promising not to do it anymore, not that they never did.

I've tried to find this graphic again several times over the years, but it's either been scrubbed from the internet or I just can't remember enough details to find it. Amusingly, it only just occurred to me that maybe I should ask ChatGPT to help me find it.

jsheard•30m ago
> The infographic had a statement about not training on CSAM and revenge porn and the like but the corpospeak it was worded in made it sound like they were promising not to do it anymore, not that they never did.

We know they did: an earlier version of the LAION dataset was found to contain CSAM after everyone had already trained their image generation models on it.

https://www.theverge.com/2023/12/20/24009418/generative-ai-i...

jsnell•29m ago
As a small point of order, they did not get banned for "finding CSAM" as the outrage-bait and clickbait title claims. They got banned for uploading child porn to Google Drive. They did not find it, and their later reporting of the data set to an appropriate organization is not why they got banned.
jeffbee•27m ago
Literally every headline that 404 media has published about subjects I understand first-hand has been false.
jfindper•13m ago
> They got banned for uploading child porn to Google Drive

They uploaded the full "widely-used" training dataset, which happened to include CSAM (child sexual abuse material).

While the title of the article is not great, your wording here implies that they purposefully uploaded some independent CSAM pictures, which is not accurate.

giantg2•21m ago
This raises an interesting point. Do you need to train models using CSAM so that the model can self-enforce restrictions on CSAM? If so, I wonder what moral/ethical questions this brings up.
jsheard•16m ago
It's a delicate subject but not an unprecedented one. Automatic detection of already known CSAM images (as opposed to heuristic detection of unknown images) has been around for much longer than AI, and for that service to exist someone has to handle actual CSAM before it's reduced to a visual hash in a database.
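For context, the "visual hash in a database" step is usually a perceptual hash of each known image compared against new uploads; PhotoDNA is probably the best-known example, though its exact algorithm isn't public. A minimal sketch of the general idea in Python, using a simple average hash and a placeholder hash list rather than any real production system:

    from PIL import Image

    def average_hash(path, hash_size=8):
        # Downscale to a tiny grayscale image and set one bit per pixel above the mean.
        img = Image.open(path).convert("L").resize((hash_size, hash_size))
        pixels = list(img.getdata())
        mean = sum(pixels) / len(pixels)
        return sum(1 << i for i, p in enumerate(pixels) if p > mean)

    def hamming(a, b):
        # Number of bits by which two hashes differ.
        return bin(a ^ b).count("1")

    # Hypothetical database of hashes of already-known images (placeholder value).
    KNOWN_HASHES = {0x3C3E1E1E0E0F0703}

    def matches_known(path, threshold=5):
        # Flag the file if its hash is within a few bits of any known hash.
        h = average_hash(path)
        return any(hamming(h, k) <= threshold for k in KNOWN_HASHES)

Real systems use far more robust hashes and databases maintained by clearinghouses like NCMEC, but the point stands: building and operating that database means someone has to handle the original material at least once.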
amarcheschi•17m ago
Just a few days ago I was doing a low-paid (well, not so low) AI classification task - akin to Mechanical Turk ones - for a very big company, and was involuntarily shown an AI image by the platform depicting a naked man and a naked kid (I guess they don't review them before showing them), though it was more Barbie-like than anything else. I didn't really enjoy the view, tbh; I contacted them but got no answer back.

Show HN: Alzheimer's conversational AI agent (ElevenLabs 3 hours hackathon)

https://github.com/gyoridavid/relief-alzheimer-conversational-agent
1•gyoridavid•59s ago•0 comments

Show HN: File converter with SHA-256 audit trails and downloadable

https://converter-lmxu.onrender.com
1•reman--•1m ago•0 comments

Postgres 18 New Default for Data Checksums and How to Deal with Upgrades

https://www.crunchydata.com/blog/postgres-18-new-default-for-data-checksums-and-how-to-deal-with-...
1•enz•2m ago•0 comments

Gang rape and sexual slavery widely employed by government and rebels in the DRC

https://www.reuters.com/investigates/special-report/congo-security-rapes
1•hashim•3m ago•0 comments

Alivenet

https://vvesh.de
1•pryncevv•3m ago•0 comments

Test Smart Underwear to Build a Human Flatus Atlas

https://redcap.umaryland.edu/surveys/?s=JY4WE33E9XYDCHC4
1•jhfdbkofdchk•5m ago•1 comments

Vote for the web features you want to see

https://web.dev/blog/upvote-features
1•xnx•5m ago•0 comments

Show HN: Git-Scope – A Fast TUI Dashboard for Managing Multiple Git Repos

https://bharath-code.github.io/git-scope/
1•iam_pbk•6m ago•0 comments

Time After Time

https://thinkhuman.com/time-after-time/
1•jamesgill•6m ago•0 comments

The risk of developing cancer and frequency of alcohol consumption behaviors

https://www.sciencedirect.com/science/article/abs/pii/S1877782125002164
2•bikenaga•8m ago•0 comments

Queries

https://www.futilitycloset.com/2025/11/29/queries/
1•surprisetalk•8m ago•0 comments

Benevolent Dictator for Now

https://roc-lang.org/bdfn
1•surprisetalk•8m ago•0 comments

Pop!_OS 24.04 LTS has been released

https://system76.com/pop/download/
1•bitbasher•8m ago•0 comments

Days since last GitHub incident

https://github-incidents.pages.dev/
3•AquiGorka•8m ago•0 comments

Notes on Sorted Data

https://amit.prasad.me/blog/sorted-data
1•surprisetalk•9m ago•0 comments

The Triumph of Logical English

https://worksinprogress.co/issue/the-logical-triumph-of-english/
1•surprisetalk•9m ago•0 comments

Show HN: I made a cozy showcase to find the best game jam games of the month

https://indiehunt.xyz
1•adamgusky•9m ago•0 comments

Show HN: AI Copilot for LibreOffice Writer

https://librethinker.com/
1•mmarian•10m ago•0 comments

Waterfox release 6.6.6 – Privacy hardening

https://www.waterfox.com/releases/6.6.6/
1•tmtvl•11m ago•0 comments

The Plan Is to Make the Internet Worse. Forever [video]

https://www.youtube.com/watch?v=7wE8G-d7SnY
1•tempfile•12m ago•0 comments

The Deadweight Loss of Entertainment

https://moultano.wordpress.com/2025/12/09/the-dead-weight-loss-of-entertainment/
1•moultano•12m ago•0 comments

Show HN: AgentDepot – open-source directory of Cursor rules, Claude, Replit, MCP

https://agentdepot.dev
1•beeruot•13m ago•0 comments

AI Hackers Are Coming Close to Beating Humans

https://www.wsj.com/tech/ai/ai-hackers-are-coming-dangerously-close-to-beating-humans-4afc3ad6
1•julienchastang•14m ago•0 comments

Bob Iger: Disney's OpenAI Deal "Does Not in Any Way" Threaten Creatives

https://www.hollywoodreporter.com/business/business-news/bob-iger-openai-deal-paramount-netflix-w...
1•geox•14m ago•0 comments

Show HN: MemCloud Security Deep Dive – How Devices Safely Share RAM over LAN

https://github.com/vibhanshu2001/memcloud
1•vibhanshugarg•14m ago•0 comments

10 Years Ago Today: Introducing OpenAI

https://openai.com/index/introducing-openai/
1•capitalatrisk•14m ago•0 comments

Disney is investing $1B in OpenAI and licensing its characters for Sora

https://www.cnn.com/2025/12/11/tech/disney-openai-sora-google
1•petethomas•16m ago•0 comments

News.ycombnator.com

https://news.ycombnator.com/
1•seethishat•17m ago•0 comments

Polymorphism, but for Your Database

https://typedb.com/blog/the-case-for-a-polymorphic-database
3•calhem•19m ago•1 comments

Screens in Screens in Screens

https://screenpond.cool
1•postalcoder•19m ago•0 comments