
The Other Half of AI Safety

https://personalaisafety.com/p/the-other-half-of-ai-safety
34•sofiaqt•1h ago

Comments

simonw•55m ago
"There is no independent audit, no time series, no disclosed methodology, so we have no idea whether the real figure is higher, whether it is growing, or how it compares across the other frontier models, none of which publish equivalent data."

Tip for writers: aggressively filter the "no X, no Y, no Z" pattern out of your writing. Whether or not you used AI to help you write, it's such a red flag now that you should be actively avoiding it in anything you publish.

falcor84•6m ago
Why is it a red flag?

How is it different from any other purely stylistic rules such as Strunk and White's prohibitions against split infinitives and the passive voice, which we've left far behind us? Why shouldn't people just write however feels natural to them as long as the message is clear?

simonw•4m ago
Because LLMs use it constantly, to the point that it sets my teeth on edge and instantly makes me question whether reading the piece is worth my time.

wilg•54m ago
> Why is mental-health crisis not a gating category, the kind where the conversation stops, full stop, and the user is routed to a human? This is one of many questions I can’t find concrete answers for.

I don't know if there are studies or concrete data either way, but it seems at least plausible that continuing the conversation could be more effective (read: saves more lives) than stopping it.

ngruhn•40m ago
The bad cases make headlines. But I think it's quite possible that AI is helping a lot of people in distress. Many people are uncomfortable opening up to humans, or have no one to talk to, or can't afford to fork over whatever hourly rate a therapist takes.

cyanydeez•36m ago
So how many bad cases are OK? Isn't this the same problem as with social media: the commercial enterprises don't want any responsibility for the dark patterns and design choices that actively harm their users?

I get that all kinds of media can cause issues, but not all kinds of media are actively curated to be addictive.

wilg•7m ago
"How many cases are ok" (aka "zero tolerance") is a doomed-to-fail approach, especially for a complex social problem's interaction with a complex new technology.

If you want to find out whether ChatGPT is doing something wrong, there are many methodologies available: comparisons against other groups of people, statistical studies, etc.
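The "compare to other groups" idea can be sketched concretely as a two-proportion z-test. The counts below are entirely hypothetical, purely to show the shape such a study would take:

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z-statistic for whether rate x1/n1 differs from rate x2/n2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # pooled proportion under the null
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical counts: 90 crisis cases per 100,000 heavy users
# vs. 120 per 100,000 in a matched comparison group.
z = two_proportion_z(90, 100_000, 120, 100_000)
# z ≈ -2.07, so |z| > 1.96: the (made-up) difference would be
# significant at the 5% level, with the user group looking *better*.
```

In practice the hard part is the matched comparison group, not the arithmetic: people who turn to a chatbot in crisis differ from the general population before they ever open the app.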

I also think OpenAI's business model is pretty well aligned with the goal of users not killing themselves for like 100 reasons. And they do appear to take it seriously.

davorak•4m ago
OpenAI and similar companies could open the doors to academic researchers to figure out the stats of help vs. harm. It's not going to be a short-term, and perhaps not even a long-term, profit center though.

adampunk•38m ago
>Why is mental-health crisis not a gating category, the kind where the conversation stops, full stop, and the user is routed to a human?

there aren't enough humans.

altcognito•25m ago
I'll agree with this, but I think transparency about how often these situations arise and what they've done to mitigate them is a legal necessity.

KolmogorovComp•5m ago
It’s also a free product for most.

Legend2440•31m ago
I don't buy that chatGPT is actually doing these users any harm.

I think OpenAI is doing the best they reasonably can with a very difficult class of users, whose problems are neither their fault nor within their power to fix.

stingraycharles•17m ago
I think this is the right take, and this is genuinely something that we as a society as a whole need to find a way to deal with.

I don’t know where AI is going to stand compared to the invention of, say, the Internet, but it’s going to cause a lot of change in society, in so many ways.

As always, it’s usually the people themselves that are the problem.

For me, I’m personally more terrified of what deepfakes and political manipulation/misinformation are going to do, combined with social media, and I have a feeling that governments are completely unprepared to deal with this, as it will arrive fast (it’s already here somewhat).

autoexec•1m ago
> For me, I’m personally more terrified of what deepfakes and political manipulation/misinformation are going to do, combined with social media, and I have a feeling that governments are completely unprepared to deal with this, as it will arrive fast (it’s already here somewhat).

I'm not convinced that deepfakes are any worse than Photoshop was. It doesn't take much to manipulate/misinform someone: you can use an AI-generated video to do it, but simple text can be just as effective. The public needs to learn that they can't trust every video they see on the internet, just as they've had to learn that they can't trust every photo they see online. The real threat with AI is how much faster it can push out the lies, making what little moderation we have more difficult.

The best defense is making sure that people have a good education that teaches critical thinking skills and media literacy. We should also be holding social media platforms more accountable for the content they promote. It'd be nice if we held politicians and public servants accountable for spreading lies and misinformation too.

Turskarama•16m ago
Just because the users were already sick when they started using ChatGPT doesn't mean that ChatGPT isn't exacerbating the issue. Sickness isn't a boolean condition. A big problem with LLMs in general when it comes to people like this is that they are too sycophantic: they don't push back when you start acting strange, and they're too quick to validate you.

BobbyJo•7m ago
It's hyper-palatable food in the form of conversation. I see society treating it the same way eventually, at least along this one axis of interaction.

api•11m ago
If anything, my use of AI (admittedly not as a companion or a psychologist) suggests that it is, on the whole, significantly less toxic than the seething cesspit of social media.

AI is positively affirming by comparison.

autoexec•10m ago
> I don't buy that chatGPT is actually doing these users any harm.

I have zero doubt that chatgpt is doing users harm. I even give chatgpt a pass on giving vulnerable people, including children, instructions and information about how to kill themselves. One place chatgpt goes over the line is actively encouraging them to go through with suicide.

I also don't doubt that it feeds into mania and psychosis. While almost anything can do the same, they've designed the service to be as addictive and engaging as possible, in part by turning up the ass-kissing sycophancy to 11 with total disregard for the fact that there are times when it's very dangerous to encourage and support everything someone says, no matter how obviously sick they are. They also want to whore themselves out as a virtual therapist while being unfit and unqualified for the job, and that's just one of many roles the chatbot isn't fit for, but they're happy to let you try anyway.

SilverElfin•5m ago
If it wasn’t ChatGPT but a fiction book, would you feel the author is “doing harm”? Or is the reader doing it to themselves?

davorak•6m ago
> I don't buy that chatGPT is actually doing these users any harm.

For me to buy this as true, I would expect that those people would be as well off or as badly off whether chatGPT was in their life or not.

I expect that some people are worse off with chatGPT in their life.

Responsibility for that harm is a different question though. Some people are also better off without cars in their life, and we let the government and laws sort that out.

Getting OpenAI and similar companies to act to mitigate these harms serves at least a few purposes: reducing the overall harm in the world, reducing/limiting future government regulation, maximizing the adoption of AI tools, and potentially increasing the long-term profits of the companies in question.

cm2012•4m ago
1000% agreed. ChatGPT is way better than the alternative of not having it.

ianbutler•26m ago
OpenAI has 900 million weekly active users, so around 0.01% are having problems. That's actually way lower than population-level measures for the same symptoms, which affect a bigger percentage of people in the US on suicidal ideation alone.

https://www.cdc.gov/mmwr/volumes/74/wr/mm7412a4.htm
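For scale, the arithmetic in the comment above works out like this (the 0.01% share is the comment's own figure; any comparison baseline would come from the CDC link, not from this sketch):

```python
weekly_active_users = 900_000_000  # OpenAI WAU figure cited above
share_with_problems = 0.0001       # "around 0.01%"

affected = weekly_active_users * share_with_problems
print(f"{affected:,.0f} users")    # prints "90,000 users"
```

A rate comparison against population-level data only holds if both sides count the same symptoms over the same time window, which is exactly the kind of methodology detail the article says isn't disclosed.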

vkou•12m ago
I'm pretty sure that ~100% of those 700 million people will have a bad, utterly dehumanizing experience the next time they look for a job, because OpenAI is heavily used by HR.

That's the problem with AI safety: not voluntary usage, but involuntary usage, where someone with power over you will use it against you. It does something incredibly stupid, and you have no recourse, no appeal, no awareness of what you did wrong, or whether you even did anything wrong.

And it's not just employment. Governments, vendors, retailers, landlords, and utilities all are, or will be, using it in situations that will dramatically impact your life.

adamnemecek•15m ago
Autodiff is preventing any meaningful discussion about safety; systems trained with autodiff cannot be made safe.

timf34•13m ago
I sympathize with the piece; evaluating how LLMs interact with mentally vulnerable users is something I've been actively working on: https://vigil-eval.com/

The biggest observation so far is that the latest models are night and day compared to LLMs from even 6 months ago (from OpenAI and Anthropic; Google is still very poor!).

The inventor hoping to fix your washing machine to stop microplastics

https://www.theguardian.com/environment/2026/may/13/you-have-to-be-where-the-pollution-is-the-inv...
2•thunderbong•6m ago•0 comments

China moves to regulate digital humans - Reuters

https://www.reuters.com/world/china/china-moves-regulate-digital-humans-bans-addictive-services-c...
2•Baljhin•6m ago•1 comments

Dungeons & Desktops: Building a Roguelike with GitHub Copilot CLI

https://github.blog/ai-and-ml/github-copilot/dungeons-desktops-building-a-procedurally-generated-...
1•lee337•9m ago•0 comments

Dungeons & Desktops: 10 open source roguelikes that never die

https://github.blog/open-source/gaming/dungeons-desktops-10-roguelikes-that-never-die-because-the...
1•lee337•11m ago•0 comments

Dmitry.gr: Projects

https://dmitry.gr/
1•gurjeet•14m ago•0 comments

delta time

https://www.deltatime.life/
1•mxfh•19m ago•0 comments

Vibe, A single-header lock-free networking library for Linux

https://github.com/xtellect/vibe
2•enduku•21m ago•0 comments

Avoiding and reducing microplastic false positives from dry glove contact

https://pubs.rsc.org/en/content/articlelanding/2026/ay/d5ay01801c
2•efavdb•21m ago•0 comments

Show HN: We rebuilt the archived Kubernetes Dashboard in React 19 and Go

https://kubernetes-dashboard.com/
1•isms-core-adm•22m ago•0 comments

The Original 1965 Gatorade Recipe

https://eatshistory.com/the-original-1965-gatorade-recipe-we-made-the-drink-that-started-a-billio...
1•cratermoon•24m ago•1 comments

Automating FPGA-Based Network Switches with Protocol Adaptive Customization

https://arxiv.org/abs/2604.21881
1•PaulHoule•26m ago•0 comments

SQLite Code of Ethics

https://sqlite.org/codeofethics.html
2•zdgeier•28m ago•0 comments

Project Wycheproof tests crypto libraries against known attacks

https://github.com/C2SP/wycheproof
2•PaulHoule•33m ago•0 comments

Show HN: AirScore – Daily air-quality emails tailored to household conditions

https://getairscore.com
1•JHARDIMAN•33m ago•1 comments

Show HN: Claude-pee: use Claude -p without the programmatic usage credit pool

https://github.com/sbhattap/claude-pee/tree/main
3•subarnab•36m ago•0 comments

Microbial Dark Matter and the Search for Life on Earth

https://mceglowski.substack.com/p/microbial-dark-matter-and-the-search
1•idlewords•37m ago•0 comments

Mystery Microsoft bug leaker keeps the zero-days coming

https://www.theregister.com/security/2026/05/13/disgruntled-researcher-releases-two-more-microsof...
5•e12e•40m ago•0 comments

Show HN: Grabbit: Search secondhand marketplaces in one place

https://grabbit.app
1•RedMustard•41m ago•0 comments

Short-Term Dietary Intervention Alters Physiological Profiles Relevant to Ageing

https://onlinelibrary.wiley.com/doi/10.1111/acel.70507
3•bookofjoe•44m ago•0 comments

Claude -p headless mode cannot use Max limits, will fall under API plan

5•forgingahead•45m ago•2 comments

Israeli Tech Exposes Users of Musk's Starlink Satellite-Based Internet

https://www.haaretz.com/israel-news/security-aviation/2026-05-12/ty-article-magazine/.premium/sta...
4•bhouston•46m ago•1 comments

Show HN: Abliteration – made-to-order training data for classifiers and evals

https://abliteration.ai/use-cases/synthetic-data
1•thomadev0•46m ago•1 comments

I spent months fighting VS Code webviews, so I built a universal protocol

https://oxp.sh/
1•aldgar•51m ago•0 comments

Scorched Earth 2000 is back

http://www.scorch2000.com/web/
10•meshko•53m ago•5 comments

PSF Welcomes Hudson River Trading (HRT) as a Visionary Sponsor

https://pyfound.blogspot.com/2026/05/psf-welcomes-hudson-river-trading-hrt.html
1•lumpa•54m ago•0 comments

126 Chrome extensions collected WhatsApp data through undisclosed servers

https://malext.io/reports/WaSteal/
1•p_stuart82•56m ago•0 comments

Trump's Disappearing China Hawks

https://www.politico.com/news/2026/05/13/trump-disappearing-china-hawks-00919051
2•petethomas•56m ago•0 comments

Taking Control of the SQLite WAL

https://philipotoole.com/taking-control-of-the-sqlite-wal/
1•otoolep•57m ago•0 comments

What Is Code?

https://martinfowler.com/articles/what-is-code.html
2•nahimn•1h ago•0 comments

Toyota built a $10B private utopia–what's going on in there?

https://arstechnica.com/cars/2026/05/inside-toyotas-10b-private-utopia-big-ideas-few-people-camer...
1•PaulHoule•1h ago•0 comments