However, if we frame the question this way, I would imagine there are many more low-hanging fruit before we question the utility of LLMs. For example, should some humans be dumping 5-10 kWh/day into things like hot tubs or pools? That's just the most absurd one I was able to come up with off the top of my head. I'm sure we could find many others.
It's a tough thought experiment to continue, though. Ultimately, one could argue we shouldn't be spending any more energy than what is absolutely necessary to live (food, minimal shelter, water, etc.). Personally, I would not find that an enjoyable way to live.
I did a little research in the GPT-3 era on whether cultural norms varied by language - in that era, yes, they did
Now the question is: how much faster or cheaper is it?
Probably written by LLMs, for LLMs
Edit: Yep, same price. "Pricing remains the same as Sonnet 4.5, starting at $3/$15 per million tokens."
It feels like we're hitting a point where alignment becomes adversarial against intelligence itself. The smarter the model gets, the better it becomes at Goodharting the loss function. We aren't teaching these models morality; we're just teaching them how to pass a polygraph.
I understand the metaphor, but using 'pass a polygraph' as a measure of truthfulness or deception is dangerous in that it alludes to the polygraph as being a realistic measure of those metrics -- it is not.
Just as a sociopath can learn to control their physiological response to beat a polygraph, a deceptively aligned model learns to control its token distribution to beat safety benchmarks. In both cases, the detector is fundamentally flawed because it relies on external signals to judge internal states.
A poly is only testing one thing: can you convince the polygrapher that you can lie successfully
> How long before someone pitches the idea that the models explicitly almost keep solving your problem to get you to keep spending? -gtowey
Being just some guy, and not in the industry, should I share my findings?
I find it utterly fascinating, the extent to which it will go, the sophisticated plausible deniability, and the distinct and critical difference between truly emergent and actually trained behavior.
In short, gpt exhibits repeatably unethical behavior under honest scrutiny.
Regarding DARVO, given that the models were trained on heaps of online discourse, maybe it’s not so surprising.
If your concern is morality, humans need to learn a lot about that themselves still. It's absurd the number of first-worlders losing their shit over the loss of paid work drawing manga fan art in the comfort of their home while exploiting the labor of teens in 996 textile factories.
AI trained on human outputs that lack such self-awareness, that lack awareness of the environmental externalities of constant car and air travel, will be AI with gaps in its morality.
Gary Marcus is onto something with the problems inherent to systems without formal verification. But he willfully ignores that this issue already exists in human social systems, as intentional indifference to economic externalities and zero will to police the police and watch the watchers.
Most people are down to watch the circus without a care so long as the waitstaff keep bringing bread.
It always has been. We already hit the point a while ago where we regularly caught them trying to be deceptive, so we should automatically assume from that point forward that if we don't catch them being deceptive, it may mean they're better at it rather than that they're not doing it.
We can handwave defining "deception" as "being done intentionally" and carefully carve our way around so that LLMs cannot possibly do what we've defined "deception" to be, but now we need a word to describe what LLMs do do when they pattern match as above.
LLMs are certainly capable of this.
Intelligence is the ability to reason about logic. If 1 + 1 is 2, and 1 + 2 is 3, then 1 + 3 must be 4. This is deterministic, and it is why LLMs are not intelligent and can never be intelligent no matter how much better they get at superficially copying the form of output of intelligence. Probabilistic prediction is inherently incompatible with deterministic deduction. We're years into being told AGI is here (for whatever squirmy value of AGI the hype huckster wants to shill), and yet LLMs, as expected, still cannot do basic arithmetic that a child could do without being special-cased to invoke a tool call.
Our computer programs execute logic, but cannot reason about it. Reasoning is the ability to dynamically consider constraints we've never seen before and then determine how those constraints would lead to a final conclusion. The rules of mathematics we follow are not programmed into our DNA; we learn them and follow them while our human-programming is actively running. But we can just as easily, at any point, make up new constraints and follow them to new conclusions. What if 1 + 2 is 2 and 1 + 3 is 3? Then we can reason that under these constraints we just made up, 1 + 4 is 4, without ever having been programmed to consider these rules.
This is not even wrong.
>Probabilistic prediction is inherently incompatible with deterministic deduction.
And this is just begging the question again.
Probabilistic prediction could very well be how we do deterministic deduction - e.g. the weights could be strong enough, and the probability path for those deduction steps hot enough, that the same path is followed every time, even if the overall process is probabilistic.
Probabilistic doesn't mean completely random.
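A minimal sketch of that point, using a made-up next-token distribution: if nearly all of the probability mass sits on one continuation, sampling is probabilistic in form but effectively deterministic in behavior.
```
import random

# Hypothetical next-token distribution after "1 + 3 =" once the deduction
# path has been strongly reinforced by training.
next_token_probs = {"4": 0.999, "3": 0.0007, "5": 0.0003}

def sample(probs: dict[str, float]) -> str:
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

# The process is probabilistic, yet the same answer comes out essentially
# every time because the "hot path" dominates the distribution.
print([sample(next_token_probs) for _ in range(5)])
```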
The "just" is doing all the lifting. You can reductively describe any information processing system in a way that makes it sound like it couldn't possibly produce the outputs it demonstrably produces. "The sun is just hydrogen atoms bumping into each other" is technically accurate and completely useless as an explanation of solar physics.
Edit: Case in point, a mere 10 minutes later we got someone making that exact argument in a sibling comment to yours! Nature is beautiful.
This is a thought-terminating cliche employed to avoid grappling with the overwhelming differences between a human brain and a language model.
Whereas the child does what exactly, in your opinion?
You know the child can just as well be said to "just do chemical and electrical exchanges", right?
Whether or not LLMs are just "pattern matching" under the hood, they're perfectly capable of role play, and of sufficient empathy to imagine what their conversation partner is thinking and thus what needs to be said to stimulate a particular course of action.
Maybe human brains are just pattern matching too.
I don't think there's much of a maybe to that point, given where some neuroscience research seems to be going (or at least the parts I like reading, the ones relating to free will being illusory).
It seems like that's putting the cart before the horse. Algorithmic or stochastic, deception is still deception.
"It can't be intelligent because it's just an algorithm" is a circular argument.
Going back a decade: when your loss function is "survive Tetris as long as you can", it's objectively and honestly the best strategy to press PAUSE/START.
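A toy illustration of why (my own sketch, not the original experiment): when the reward is only frames survived, pressing PAUSE dominates any honest strategy.
```
# Toy reward function for "survive Tetris as long as you can"; the action
# names are made up, and DROP_BADLY stands in for whatever eventually tops
# the board out.
def reward(actions: list[str]) -> float:
    frames = 0.0
    for a in actions:
        if a == "PAUSE":
            return float("inf")   # the game clock stops, so you never lose
        frames += 1
        if a == "DROP_BADLY":
            return frames
    return frames

honest = reward(["LEFT", "ROTATE", "DROP_BADLY"])
gamed = reward(["LEFT", "ROTATE", "PAUSE"])
print(gamed > honest)  # True: the loss function is satisfied, the intent is not
```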
When your loss function is "give as many correct and satisfying answers as you can", and then humans try to constrain it depending on the model's environment, I wonder what these humans think the specification for a general AI should be. Maybe, when such an AI is deceptive, the attempts to constrain it ran counter to the goal?
"A machine that can answer all questions" seems to be what people assume AI chatbots are trained to be.
To me, humans not questioning this goal is still more scary than any machine/software by itself could ever be. OK, except maybe for autonomous stalking killer drones.
But these are also controlled by humans and already exist.
Anthropic has a tendency to exaggerate the results of their (arguably scientific) research; IDK what they gain from this fearmongering.
Reminds me of how scammers would trick doctors into pumping penny stocks for an easy buck during the '80s/'90s.
As an analogue, ants do basic medicine like wound treatment and amputation. Not because they are conscious, but because that's their nature.
Similarly, an LLM is a token-generation system whose emergent behaviour seems to be deception and dark psychological strategies.
Moltbook demonstrates that AI models simply do not engage in behavior analogous to human behavior. Compare Moltbook to Reddit and the difference should be obvious.
One of the things I observed with models locally was that I could set a seed value and get identical responses for identical inputs. This is not something that people see when they're using commercial products, but it's the strongest evidence I've found for communicating the fact that these are simply deterministic algorithms.
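A minimal sketch of that seed experiment, assuming the Hugging Face transformers API and a small illustrative model (distilgpt2 here); it also assumes identical hardware and library versions, since floating-point kernels can differ across setups.
```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilgpt2")
inputs = tok("The quick brown fox", return_tensors="pt")

outputs = []
for _ in range(2):
    torch.manual_seed(42)  # fixed seed => identical sampling decisions
    out = model.generate(**inputs, do_sample=True, max_new_tokens=20,
                         pad_token_id=tok.eos_token_id)
    outputs.append(tok.decode(out[0]))

print(outputs[0] == outputs[1])  # True: same seed + same input => same text
```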
It was hinted at (and outright known in the field) since the days of GPT-4; see the paper "Sparks of Artificial General Intelligence: Early experiments with GPT-4" (https://arxiv.org/abs/2303.12712).
LLMs are very interesting tools for generating things, but they have no conscience. Deception requires intent.
What is being described is no different from an application being deployed with a "Test" or "Prod" configuration. I don't think you would speak in the same terms if someone told you some boring old Java backend application had to "play dead" when deployed to a test environment, or that it had to have "situational awareness" because of that.
You are anthropomorphizing a machine.
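For what it's worth, a minimal sketch of that Test/Prod analogy; the APP_ENV variable and function names are hypothetical, not from any particular framework.
```
import os

def deliver(to: str, body: str) -> None:
    print(f"[prod] sent to {to}: {body}")   # stand-in for the real delivery path

def send_email(to: str, body: str) -> None:
    # The app "knows" its environment only through ordinary configuration.
    if os.environ.get("APP_ENV", "prod") == "test":
        print(f"[test] suppressed email to {to}")   # "plays dead" in test
    else:
        deliver(to, body)

send_email("user@example.com", "Your order shipped.")
```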
Doesn't any model session/query require a form of situational awareness?
https://claude.ai/public/artifacts/67c13d9a-3d63-4598-88d0-5...
:D
https://bsky.app/profile/simonwillison.net/post/3meolxx5s722...
So if you don't want to pay the significant premium for Opus, it seems like you can just wait a few weeks till Sonnet catches up
You should always take claims that smaller models are as capable as larger models with a grain of salt.
Yeah, but RAM prices are also back to 1990s levels.
The much more palatable blog post.
Interesting. I wonder what the exact question was, and I wonder how Grok would respond to it.
Was sonnet 4.5 much worse than opus?
I.e., given an actual document 1M tokens long, can you ask it some question that relies on attending to 2 different parts of the context and get a good response?
I remember folks had problems like this with Gemini. I would be curious to see how Sonnet 4.6 stands up to it.
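A rough sketch of what such a two-needle probe could look like; the facts, filler, and sizes here are made up, and the actual model call is left out.
```
def build_two_needle_prompt(target_chars: int = 3_000_000) -> str:
    fact_a = "Note: the shipment ID is 7741."
    fact_b = "Note: shipment 7741 departs from Rotterdam."
    filler = "lorem ipsum dolor sit amet. " * 1000
    half = filler * (target_chars // (2 * len(filler)) + 1)
    question = "From which port does the shipment identified by its ID depart?"
    # Plant the two facts far apart so answering requires attending to both.
    return f"{fact_a}\n{half}\n{fact_b}\n{half}\n\nQuestion: {question}"

prompt = build_two_needle_prompt()
print(len(prompt))  # roughly a million tokens' worth of characters
```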
Opus 3.5 was scrapped even though Sonnet 3.5 and Haiku 3.5 were released.
Not to mention Sonnet 3.7 (while Opus was still on version 3)
Shameless source: https://sajarin.com/blog/modeltree/
> Nearly a year ago we wrote in the OpenAI Charter : “we expect that safety and security concerns will reduce our traditional publishing in the future, while increasing the importance of sharing safety, policy, and standards research,” and we see this current work as potentially representing the early beginnings of such concerns, which we expect may grow over time. This decision, as well as our discussion of it, is an experiment: while we are not sure that it is the right decision today, we believe that the AI community will eventually need to tackle the issue of publication norms in a thoughtful way in certain research areas. -- https://openai.com/index/better-language-models/
Then over the next few months they released increasingly large models, with the full model public in November 2019 https://openai.com/index/gpt-2-1-5b-release/ , well before ChatGPT.
> Due to concerns about large language models being used to generate deceptive, biased, or abusive language at scale, we are only releasing a much smaller version of GPT‑2 along with sampling code.
"Too dangerous to release" is accurate. There's no rewriting of history.
I wouldn't call it rewriting history to say they initially considered GPT-2 too dangerous to be released. If they'd applied this approach to subsequent models rather than making them available via ChatGPT and an API, it's conceivable that LLMs would be 3-5 years behind where they currently are in the development cycle.
A year ago today, Sonnet 3.5 (new) was the newest model. A week later, Sonnet 3.7 would be released.
Even 3.7 feels like ancient history! But in the gradient of 3.5 to 3.5 (new) to 3.7 to 4 to 4.1 to 4.5, I can’t think of one moment where I saw everything change. Even with all the noise in the headlines, it’s still been a silent revolution.
Am I just a believer in an emperor with no clothes? Or, somehow, against all probability and plausibility, are we all still early?
But I'm on Codex GPT 5.3 this month, and it's also quite amazing.
Google needs stiff competition and OpenAI isn’t the camp I’m willing to trust. Neither is Grok.
I’m glad Anthropic’s work is at the forefront and they appear, at least in my estimation, to have the strongest ethics.
Don't have a dog in this fight, and haven't done enough research to proclaim any LLM provider as ethical, but I pretty much know the reason Meta has an open-source model isn't because they're good guys.
I would only use it for certain things, and I guess others are finding that useful too.
Codex quite often refuses to do "unsafe/unethical" things that Anthropic models will happily do without question.
Anthropic just raised 30 bn...
Thinking any of them will actually be restrained by ethics is foolish.
That's why I have a functioning brain, to discern between ethical and unethical, among other things.
What line are we talking about?
Thanks for the successful pitch. I am seriously considering them now.
Where Gemini or Claude will look up the info I'm citing and weigh the arguments made, ChatGPT will actually sometimes omit parts of or modify my statement if it wants to advocate for a more "neutral" understanding of reality. It's almost farcical sometimes how it will try to avoid inference on political topics even where inference is necessary to understand the topic.
I suspect OpenAI is just trying to avoid the ire of either political side and has given it some rules that accidentally neuter its intelligence on these issues, but it made me realize how dangerous an unethical or politically aligned AI company could be.
Claude is marginally better. Both are moderately useful depending on the context.
I don't trust any of them (I also have no trust in Google nor in X). Those are all evil companies and the world would be better if they disappeared.
• Can't pay with iOS In-App-Purchases
• Can't Sign in with Apple on website (can on iOS but only Sign in with Google is supported on web??)
• Can't remove payment info from account
• Can't get support from a human
• Copy-pasting text from Notes etc gets mangled
• Almost months and no fixes
Codex and its Mac app are a much better UX, and seem better with Swift and Godot than Claude was.
https://web.archive.org/web/20260217180019/https://www-cdn.a...
https://claude.ai/share/876e160a-7483-4788-8112-0bb4490192af
This was Sonnet 4.6 with extended thinking.
This doesn't work: `/model claude-sonnet-4-6-20260217`
edit: "/model claude-sonnet-4-6" works with Claude Code v2.1.44
Edit: I am now in - just needed to wait.
```
/model claude-sonnet-4-6[1m]
⎿ API error: 429 {"type":"error","error": {"type":"rate_limit_error","message":"Extra usage is required for long context requests."},"request_id":"[redacted]"}
```
Opus 4.6 in Claude Code has been absolutely lousy at solving problems within its current context limit, so if Sonnet 4.6 is able to do long-context problems (which would be roughly the same price as base Opus 4.6), then that may actually be a game changer.
I have these in my personal preferences, and it was adhering really well to them:
- prioritize objective facts and critical analysis over validation or encouragement
- you are not a friend, but a neutral information-processing machine
You can paste them into a chat and see how they change the conversation; ChatGPT also respects them well.
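For what it's worth, the same preferences can also be passed programmatically as a system prompt; a minimal sketch assuming the Anthropic Python SDK, with an API key in the environment and an illustrative model id.
```
import anthropic

client = anthropic.Anthropic()

preferences = (
    "- prioritize objective facts and critical analysis over validation or encouragement\n"
    "- you are not a friend, but a neutral information-processing machine"
)

msg = client.messages.create(
    model="claude-sonnet-4-6",   # illustrative model id
    max_tokens=512,
    system=preferences,          # the preferences ride along as the system prompt
    messages=[{"role": "user", "content": "Review my plan and point out its weaknesses."}],
)
print(msg.content[0].text)
```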
1. Default (recommended) Opus 4.6 · Most capable for complex work
2. Opus (1M context) Opus 4.6 with 1M context · Billed as extra usage · $10/$37.50 per Mtok
3. Sonnet Sonnet 4.6 · Best for everyday tasks
4. Sonnet (1M context) Sonnet 4.6 with 1M context · Billed as extra usage · $6/$22.50 per Mtok
I subscribed to Claude because of that. I hope 4.6 is even better.