
The Other Half of AI Safety

https://personalaisafety.com/p/the-other-half-of-ai-safety
34•sofiaqt•1h ago

Comments

simonw•55m ago
"There is no independent audit, no time series, no disclosed methodology, so we have no idea whether the real figure is higher, whether it is growing, or how it compares across the other frontier models, none of which publish equivalent data."

Tip for writers: aggressively filter the "no X, no Y, no Z" pattern out of your writing. Whether or not you used AI to help you write, it's such a red flag now that you should be actively avoiding it in anything you publish.

falcor84•6m ago
Why is it a red flag?

How is it different from any other purely stylistic rules such as Strunk and White's prohibitions against split infinitives and the passive voice, which we've left far behind us? Why shouldn't people just write however feels natural to them as long as the message is clear?

simonw•4m ago
Because LLMs use it constantly, to the point that it sets my teeth on edge and instantly makes me question if reading the piece is worth my time.
wilg•54m ago
> Why is mental-health crisis not a gating category, the kind where the conversation stops, full stop, and the user is routed to a human? This is one of many questions I can’t find concrete answers for.

I don't know if there are studies or concrete data either way, but it seems at least plausible that continuing the conversation could be more effective (read: saves more lives) than stopping it.

ngruhn•40m ago
The bad cases make headlines. But I think it's quite possible that AI is helping a lot of people in distress. Many people are uncomfortable opening up to humans, or have no one to talk to, or can't afford to fork over whatever-hourly-rate a therapist takes.
cyanydeez•36m ago
So how many bad cases are ok? Isn't this the same problem we have with social media, where the commercial enterprises don't want any responsibility for the dark patterns and design choices which actively harm their users?

I get that all kinds of media can cause issues, but not all kinds of media are actively curated to be addictive.

wilg•7m ago
"How many cases are ok" (aka "zero tolerance") is an approach doomed to fail, especially for a complex social problem's interaction with a complex new technology.

If you want to find out if ChatGPT is doing something wrong, there are many methodologies available: compare to other groups of people, statistical studies, etc.
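The "compare to other groups" methodology can be sketched as a two-proportion z-test. All counts below are hypothetical, illustrative numbers, not real data about ChatGPT users:

```python
import math

# Hypothetical crisis-event counts for two equal-sized groups
# (illustrative only; real studies would need matched cohorts)
users_n, users_events = 1_000_000, 110      # group exposed to the chatbot
control_n, control_events = 1_000_000, 100  # comparison group

p1 = users_events / users_n
p2 = control_events / control_n
p_pool = (users_events + control_events) / (users_n + control_n)

# Standard error of the difference under the pooled null hypothesis
se = math.sqrt(p_pool * (1 - p_pool) * (1 / users_n + 1 / control_n))
z = (p1 - p2) / se
print(f"z = {z:.2f}")  # ≈ 0.69; |z| < 1.96, so no significant difference at the 5% level
```

With numbers like these, even a 10% higher event rate in the exposed group is indistinguishable from noise, which is why base rates and sample sizes matter so much in this debate.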

I also think OpenAI's business model is pretty well aligned with the goal of users not killing themselves for like 100 reasons. And they do appear to take it seriously.

davorak•5m ago
OpenAI and similar companies could open the doors to academic researchers to figure out the stats on help vs harm. It is not going to be a short-term, and perhaps not even a long-term, profit center though.
adampunk•38m ago
>Why is mental-health crisis not a gating category, the kind where the conversation stops, full stop, and the user is routed to a human?

there aren't enough humans.

altcognito•26m ago
I'll agree with this, but I think transparency about how often these situations arise and what they've done to mitigate them is a legal necessity.
KolmogorovComp•5m ago
It’s also a free product for most.
Legend2440•31m ago
I don't buy that chatGPT is actually doing these users any harm.

I think openAI is doing the best they reasonably can with a very difficult class of users, whose problems are neither their fault nor within their power to fix.

stingraycharles•18m ago
I think this is the right take, and this is genuinely something that we as a society as a whole need to find a way to deal with.

I don’t know where AI is going to stand compared to the invention of, say, the Internet, but it’s going to cause a lot of change in society, in so many ways.

As always, it’s usually the people themselves that are the problem.

For me, I’m personally more terrified what deepfakes and political manipulation / misinformation is going to do, combined with social media, and have a feeling that governments are completely unprepared to deal with this, as this will arrive fast (it’s already here somewhat).

autoexec•2m ago
> For me, I’m personally more terrified what deepfakes and political manipulation / misinformation is going to do, combined with social media, and have a feeling that governments are completely unprepared to deal with this, as this will arrive fast (it’s already here somewhat).

I'm not convinced that deepfakes are any worse than Photoshop was. It doesn't take much to manipulate/misinform someone: while you can use an AI-generated video to do it, simple text can be just as effective. The public needs to learn that they can't trust every video they see on the internet, just as they've had to learn that they can't trust every photo they see online. The threat with AI is how much faster it can push out lies, making what little moderation we have more difficult.

The best defense is making sure that people have a good education that teaches critical thinking skills and media literacy. We should also be holding social media platforms more accountable for the content they promote. It'd be nice if we held politicians and public servants accountable for spreading lies and misinformation too.

Turskarama•16m ago
Just because the users were already sick when they started using ChatGPT doesn't mean that ChatGPT isn't exacerbating the issue. Sickness isn't a boolean condition. A big problem with LLMs in general when it comes to people like this is that they are too sycophantic, they don't push back when you start acting strange and they're too gentle about trying to validate you.
BobbyJo•8m ago
It's hyper-palatable food in the form of conversation. I see society treating it the same way eventually, at least along this one axis of interaction.
api•11m ago
If anything, my use of AI (admittedly not as a companion or a psychologist) suggests that it is on the whole significantly less toxic than the seething cesspit of social media.

AI is positively affirming by comparison.

autoexec•10m ago
> I don't buy that chatGPT is actually doing these users any harm.

I have zero doubt that chatgpt is doing users harm. I even give chatgpt a pass on giving vulnerable people, including children, instructions and information about how to kill themselves. One place chatgpt goes over the line is actively encouraging them to go through with suicide.

I also don't doubt that it feeds into mania and psychosis. While almost anything can do the same, they've designed the service to be as addictive and engaging as possible, in part by turning up the ass-kissing sycophancy to 11 with total disregard for the fact that there are times when it's very dangerous to encourage and support everything someone says, no matter how obviously sick they are. They also want to whore themselves out as a virtual therapist while being unfit and unqualified for the job, and that's just one of many roles the chatbot isn't fit for but that they're happy to let you try anyway.

SilverElfin•5m ago
If it wasn’t ChatGPT but a fiction book, would you feel the author is “doing harm”? Or is the reader doing it to themselves?
davorak•6m ago
> I don't buy that chatGPT is actually doing these users any harm.

For me to buy this as true, I would expect that those people would be as well off, or as badly off, whether ChatGPT was in their life or not.

I expect that some people are worse off with chatGPT in their life.

Responsibility for that harm is a different question though. Some people are also better off without cars in their lives, and we let government laws sort that out.

Getting OpenAI and similar companies to act in mitigating these harms serves at least a few purposes: reducing the overall harm in the world, reducing/limiting future government regulation, maximizing the adoption of AI tools, and potentially increasing the long-term profits of the companies in question.

cm2012•5m ago
1000% agreed. ChatGPT is way better than the alternative of not having it
ianbutler•26m ago
OpenAI has 900 million weekly active users, so around 0.01% are having problems. That's actually way below population-level measures for the same symptoms: in the US, suicidal ideation alone affects a bigger percentage of people.

https://www.cdc.gov/mmwr/volumes/74/wr/mm7412a4.htm
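The back-of-envelope arithmetic above can be checked, assuming the 900 million weekly active users and the 0.01% share this comment cites:

```python
weekly_active_users = 900_000_000
affected_share = 0.0001  # the 0.01% figure cited in the comment

# Implied absolute number of affected users per week
affected_users = weekly_active_users * affected_share
print(f"{affected_users:,.0f} affected users")  # 90,000 affected users
```

At that scale, a rate that looks negligible in percentage terms still implies tens of thousands of people, which is part of why both sides of this thread can cite the same numbers.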

vkou•13m ago
I'm pretty sure that ~100% of those 900 million people will have a bad, utterly dehumanizing experience the next time they look for a job, because OpenAI is heavily used by HR.

That's the problem with AI safety. Not voluntary usage, but involuntary usage, where someone with power over you uses it against you, it does something incredibly stupid, and you have no recourse, no appeal, no awareness of what you did wrong, or whether you even did anything wrong.

And it's not just employment. Governments, vendors, retailers, landlords, and utilities are all using it, or will be, in situations that will dramatically impact your life.

adamnemecek•16m ago
Autodiff is preventing any meaningful discussion about safety; systems trained with autodiff cannot be made safe.
timf34•13m ago
I sympathize with the piece; evaluating how LLMs interact with mentally vulnerable users is something I've been actively working on: https://vigil-eval.com/

The biggest observation so far is that the latest models are night and day compared to LLMs from even 6 months ago (from OpenAI and Anthropic; Google is still very poor!)
