These numbers are mind-boggling, and while I understand that a "few (extremely) bad apples" are probably responsible for an outsized amount of production, AND that AI-generated imagery is flooding the zone disproportionate to the amount of actual human children being physically harmed, it's still absolutely wild to me that we collectively are producing and consuming so much of this content, despite it being largely universally considered essentially the most abhorrent thing possible.
What would fixing this at the root cause even look like? How do we apply whatever combination of therapeutic intervention or societal pressure might work to reduce the incidence of people having these urges, exploring them, feeding them, and sometimes acting on them? We see signs in every airport bathroom telling us to look for signs of trafficking. Trafficking intervention training is a huge deal in the travel industry in general. There are early intervention and detection systems for social workers and case workers.
But has anyone spent any real time looking at this from the other side: the side of the offender? I imagine there's research on the typical chain of how someone gets "onboarded" here: it probably starts with some early abuse, or if not that, early exposure or early curiosity, and then snowballs from there. I'm just thinking out loud about how large the magnitude of the problem is on the offender side if we're talking about this volume of images, and how we might be able to evaluate things from the "ounce of prevention worth a pound of cure" side of things, because damn is this depressing.
I would be interested in statistics related to the percent of adults who would be considered child predators. I have zero scope on how large this issue is by percent of population.
If 3% of everyone is sexually attracted to children, that's one thing, but if it's 0.0000001%, then the issue really is just the producers of content.
Does anyone here know of any studies or statistics? My basic googling hasn't really turned up anything trustworthy.
For some, "child predators" are those who do harmful things to toddlers.
For others, "child predators" are anyone who you want to accuse of it, like in this story: https://www.the-independent.com/news/world/americas/crime/ke...
https://scispace.com/pdf/how-common-is-men-s-self-reported-s...
Ghastly.
> What does it say about us, as a society, or just as _humans_, where the scale and magnitude of this problem is so great and only growing?
That the people in power have too much power and they get away with it often enough that there is actual money to be made supplying them.
i am so sick of AI slop writing..
>Built with love and ~25 000 tokens. Conceived and directed by a human. Written by AI.
I appreciate the transparency, although it is at the bottom.
i should take a break from the internet, the past couple of weeks feel like being stuck in an asylum where everything is written by the same one author, using the same words, same tropes, same idioms. i'm slowly going insane.
If any of the leading AI companies are looking to get back in the good graces of the public, they should seriously think about releasing an open source model that reliably labels media (text, photo or video) with a probability said media is AI generated.
There is a 0% chance they don’t already have models for this to prevent feeding their models AI generated training data. So release it.
Why spend the limited law enforcement budget on giving officers a cushy job of catching people for the crime of using a computer, when the same limited budget can be spent on catching those who actually hurt others?
But the article fails to take its statements to their logical conclusion. In one section, the author writes,
> Every false positive means an innocent person's content was flagged — a family photo, a medical image, a piece of art. It means unnecessary investigation, potential harm to reputation, and erosion of trust in the system. At scale, even a 0.01% false positive rate means thousands of wrongful flags per day.
and,

> In practice, the industry errs heavily toward minimizing false negatives — catching every possible match — and then uses human review to resolve false positives. This means the system flags aggressively but confirms carefully. The cost of a false positive is an investigation. The cost of a false negative is a child.
>
> This is also why the hybrid approach from Chapter VI matters. Perceptual hashing against a verified database has a low false positive rate — but not zero. Certain images (blank, solid-color, simple gradients) produce hashes that collide with database entries by coincidence, not because they depict abuse. Production systems include collision detection to filter these out before matching. Classifiers for unknown material have a higher false positive rate still (the model is making a judgment, not a comparison). By layering them — hashing first, then classifiers, then human review — the system can be both aggressive and precise. But no layer is perfect, and the threshold remains a human decision.
If there is a way to "include collision detection to filter these out before matching," then why do they "then human review?" The author starts the next section with, "Three Steps. No One Sees the Image." But they do human review to eliminate false positives? Both statements can't be simultaneously true: either "no human ever sees it," or "by layering them — hashing first, then classifiers, then human review — the system can be both aggressive and precise."
Secondly, although I'm not a researcher, I think I and a lot of researchers would love to see this "aggressive, but precise" algorithm that eliminates collisions without becoming useless. ("Collision" is an imprecise term here: it means an image of a background or a setting that trips the similarity system, which isn't exactly a collision in the classical sense, since the algorithm is a type of clustering with hashes.) As far as I'm aware, no such algorithm exists without either becoming useless or producing significant false positives. But I might be wrong.
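To make the collision problem concrete, here is a minimal sketch of an "average hash" (aHash), one of the simplest perceptual hashes. This is emphatically not PhotoDNA's algorithm (which is proprietary and far more elaborate); it only illustrates why flat or near-flat images produce degenerate hashes that coincidentally match each other.

```python
def average_hash(pixels):
    """pixels: flat list of grayscale values (0-255), e.g. an 8x8 thumbnail.
    Returns a bit string: '1' where the pixel is above the mean, else '0'."""
    mean = sum(pixels) / len(pixels)
    return "".join("1" if p > mean else "0" for p in pixels)

# Two completely different solid-color "images"...
white = [255] * 64
gray = [128] * 64

# ...hash identically: no pixel exceeds its own mean, so both collapse to
# the all-zero hash. The hash carries no information about the image and
# "matches" anything else that is equally flat — the collision class the
# article says production systems must filter out before matching.
assert average_hash(white) == average_hash(gray) == "0" * 64
```

Filtering these degenerate cases (blank, solid-color, simple gradients) is easy precisely because they are degenerate; it does nothing for the harder collisions between genuinely distinct, detailed images.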
At one point in the article, the author says, "The cost of a false negative is a child." This "aggressive and precise" system diverts resources from actual investigations and prosecution. A few examples,
A very famous case from 2022, https://www.nytimes.com/2022/08/21/technology/google-surveil...
A more specific example, since the author mentions PhotoDNA:
> LinkedIn found 75 accounts that were reported to EU authorities in the second half of 2021, due to files that it matched with known CSAM. But upon manual review, only 31 of those cases involved confirmed CSAM. (LinkedIn uses PhotoDNA, the software product specifically recommended by the U.S. sponsors of the EARN IT Bill.)
PhotoDNA's "aggressive and precise" matching had a 58.6% false positive rate when tested. That means nearly 60% of the cases it generated for investigation wasted investigators' time, leading to fewer investigations overall. (From https://www.eff.org/deeplinks/2022/08/googles-scans-private-... )
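The 58.6% figure follows directly from the LinkedIn numbers quoted above (75 reports, 31 confirmed). A quick sanity check, assuming those counts:

```python
# LinkedIn figures quoted above: 75 accounts reported, 31 confirmed as CSAM
# on manual review.
reported = 75
confirmed = 31
false_positives = reported - confirmed  # 44 reports did not involve CSAM

fp_rate = false_positives / reported
print(f"{fp_rate * 100:.2f}%")  # prints "58.67%", which EFF reports as 58.6%
```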
These systems are also flagging photos of adults,
> In the process of reporting images, the occurrence of false positives—instances where non-CSAM images are mistakenly reported as CSAM—is inevitable. *One officer told us that there are “a lot” of CyberTipline reports that are images of adults.124* More false positives will mean fewer cases going unreported, and platforms must decide what balance they are comfortable with. False positives and false negatives can be minimized with better detection technology. One respondent criticized platforms for relying on their in-house technology. They perceived those as inferior to solutions offered by start-ups, suggesting that this choice might be driven by profit motives.125 Platforms, however, might have reservations about using third-party services for screening potential CSAM due to legal and ethical considerations. An NGO employee highlighted platform concerns, asking, “Can we trust these organizations? What ethical due diligence have they done?”
via https://purl.stanford.edu/pr592kc5483

The uncomfortable truth is that people are trying to use technology to fix a structural problem. Most victims of CSA (including me) know the abuser. In my case and others, at least one adult knew (or suspected) and did nothing. More maddeningly, even when the CSA is reported and discovered and the perpetrator is punished, the victims are re-abused within the foster care system: https://ballardbrief.byu.edu/issue-briefs/sexual-abuse-of-ch... 40% of children in foster care experience some type of abuse. Most never get the help they need.
I think the impulse to create systems that monitor everyone's phones for CSAM comes from a good place. But it's energy misdirected; better investigations into exploitation networks, investment in foster care and in care for abused children and teens, heck, even child AI companions capable of reporting abuse for children suspected of being abused, would lead to better outcomes than scanning everyone's phone.
mystraline•1h ago
You WILL get a CSAM spam issue. It will get caught in your server cache, and you won't catch it until after the fact. And shit admin tools will not properly remove the spammer or the content.
Better yet, if you run Matrix, disable image caching and preloading.