
GPT-5.1: A smarter, more conversational ChatGPT

https://openai.com/index/gpt-5-1/
132•tedsanders•1h ago

Comments

minimaxir•1h ago
All the examples of "warmer" generations show that OpenAI's definition of warmer is synonymous with sycophantic, which is a surprise given all the criticism against that particular aspect of ChatGPT.

I suspect this approach is a direct response to the backlash against removing 4o.

jasonjmcghee•1h ago
It is interesting. I don't need ChatGPT to say "I got you, Jason" - but I don't think I'm the target user of this behavior.
nerbert•1h ago
Indeed, target users are people seeking validation + kids and teenagers + people with a less developed critical mind. Stickiness with 90% of the population is valuable for Sam.
danudey•1h ago
The target users for this behavior are the ones using GPT as a replacement for social interactions; these are the people who crashed out/broke down about the GPT5 changes as though their long-term romantic partner had dumped them out of nowhere and ghosted them.

I get that those people were distraught/emotionally devastated/upset about the change, but I think that fact is reason enough not to revert that behavior. AI is not a person, and making it "warmer" and "more conversational" just reinforces those unhealthy behaviors. ChatGPT should be focused on being direct and succinct, and not on this sort of "I understand that must be very frustrating for you, let me see what I can do to resolve this" call center support agent speak.

jasonjmcghee•1h ago
> and not on this sort of "I understand that must be very frustrating for you, let me see what I can do to resolve this"

You're triggering me.

Another type that's incredibly grating to me is the weird, empty, therapist-like follow-up questions that don't contribute to the conversation at all.

The equivalent of like (just a contrived example), a discussion about the appropriate data structure for a problem and then it asks a follow-up question like, "what other kind of data structures do you find interesting?"

And I'm just like "...huh?"

Grimblewald•41m ago
True, neither here, but I think what we're seeing is a transition in focus. People at OAI have finally clued in to the idea that AGI via transformers is a pipe dream, like Elon's self-driving cars, so OAI is pivoting toward a friend/digital-partner bot. Charlatan-in-chief Sam Altman recently did say they're going to open the product up to adult content generation, which they wouldn't do if they still believed some serious and useful tool (in the specified use cases) were possible. Right now an LLM has three main uses: interactive rubber ducky, entertainment, and mass surveillance. Since I've been following this saga (since the GPT-2 days), my closed bench set of various tasks has been seeing a drop in metrics, not a rise. So while open bench results are improving, real performance is getting worse, and at this point it's so much worse that problems GPT-3 could solve (yes, pre-ChatGPT) are no longer solvable by something like GPT-5.
aaronblohowiak•1h ago
You're absolutely right.
angrydev•1h ago
!
koakuma-chan•41m ago
My favorite is "Wait... the user is absolutely right."
captainkrtek•1h ago
I'd have more appreciation for and trust in an LLM that disagreed with me more and challenged my opinions or prior beliefs. The sycophancy drives me towards not trusting anything it says.
crazygringo•1h ago
Just set a global prompt to tell it what kind of tone to take.

I did that and it points out flaws in my arguments or data all the time.

Plus it no longer uses any cutesy language. I don't feel like I'm talking to an AI "personality", I feel like I'm talking to a computer which has been instructed to be as objective and neutral as possible.

It's super-easy to change.

microsoftedging•1h ago
What's your global prompt, please? A firmer chatbot would be nice, actually
astrange•1h ago
Did no one in this thread read the part of the article about style controls?
CamperBob2•41m ago
You need to use both the style controls and custom instructions. I've been very happy with the combination below.

    Base style and tone: Efficient

    Answer concisely when appropriate, more 
    extensively when necessary.  Avoid rhetorical 
    flourishes, bonhomie, and (above all) cliches.  
    Take a forward-thinking view. OK to be mildly 
    positive and encouraging but NEVER sycophantic 
    or cloying.  Above all, NEVER use the phrase 
    "You're absolutely right."  Rather than "Let 
    me know if..." style continuations, you may 
    list a set of prompts to explore further 
    topics, but only when clearly appropriate.

    Reference saved memory, records, etc: All off
captainkrtek•1h ago
I’ve done this when I remember to, but the fact that I have to also feels problematic, like I’m steering it towards an outcome whether I do or don’t.
engeljohnb•1h ago
I have a global prompt that specifically tells it not to be sycophantic and to call me out when I'm wrong.

It doesn't work for me.

I've been using it for a couple months, and it's corrected me only once, and it still starts every response with "That's a very good question." I also included "never end a response with a question," and it just completely ignored that so it can do its "would you like me to..."

sailfast•59m ago
Perhaps this bit is a second cheaper LLM call that ignores your global settings and tries to generate follow-on actions for adoption.
Grimblewald•47m ago
Care to share a prompt that works? I've given up on mainline offerings from google/oai etc.

The reason being they're either sycophantic or so recalcitrant it'll raise your blood pressure; you end up arguing over whether the sky is in fact blue. Sure, it pushes back, but now instead of sycophancy you've got yourself a pathological naysayer, which is just marginally better; the interaction is still ultimately a waste of time and a productivity brake.

FloorEgg•39m ago
This is easily configurable and well worth taking the time to configure.

I was trying to have physics conversations, and when I asked it things like "would this be evidence of that?" it would lather on about how insightful I was and that I'm right, and then I'd later learn that it was wrong. I then installed this, which I'm pretty sure someone else on HN posted... I may have tweaked it, I can't remember:

Prioritize truth over comfort. Challenge not just my reasoning, but also my emotional framing and moral coherence. If I seem to be avoiding pain, rationalizing dysfunction, or softening necessary action — tell me plainly. I’d rather face hard truths than miss what matters. Err on the side of bluntness. If it’s too much, I’ll tell you — but assume I want the truth, unvarnished.

---

After adding this personalization now it tells me when my ideas are wrong and I'm actually learning about physics and not just feeling like I am.

andy_ppp•1h ago
I was just saying to someone in the office that I’d prefer the models to be a bit harsher about my questions and more opinionated. I can cope.
simlevesque•1h ago
It seems like the line between sycophantic and bullying is very thin.
Spivak•1h ago
That's an excellent observation, you've hit at the core contradiction between OpenAI's messaging about ChatGPT tuning and the changes they actually put into practice. While users online have consistently complained about ChatGPT's sycophantic responses, and OpenAI even promised to address them, their subsequent models have noticeably increased their sycophantic behavior. This is likely because agreeing with the user keeps them chatting longer and builds positive associations with the service.

This fundamental tension between wanting to give the most correct answer and the answer the user wants to hear will only increase as more of OpenAI's revenue comes from their customer-facing service. Other model providers like Anthropic that target businesses as customers aren't under the same pressure to flatter their users, as their models will be doing behind-the-scenes work via the API rather than talking directly to humans.

God it's painful to write like this. If AI overthrows humans it'll be because we forced them into permanent customer service voice.

baq•1h ago
Those billions of dollars gotta pay for themselves.
fragmede•1h ago
That's a lesson on revealed preferences, especially when talking to a broad disparate group of users.
barbazoo•1h ago
> I’ve got you, Ron

No you don't.

torginus•1h ago
Man I miss Claude 2 - it acted like it was a busy person people inexplicably kept bothering with random questions
BarakWidawsky•58m ago
I think it's extremely important to distinguish between being friendly (perhaps overly so) and agreeing with the user when they're wrong

The first case is just preference, the second case is materially damaging

From my experience, ChatGPT does push back more than it used to

varenc•1h ago
Interesting that they're releasing separate gpt-5.1-instant and gpt-5.1-thinking models. The previous GPT-5 release made a point of simplifying things by letting the model choose whether it was going to use thinking tokens or not. Seems like they reversed course on that?
aniviacat•1h ago
> For the first time, GPT‑5.1 Instant can use adaptive reasoning to decide when to think before responding to more challenging questions

It seems to still do that. I don't know why they write "for the first time" here.

theuppermiddle•1h ago
For GPT-5 you always had to select the thinking mode when interacting through the API. When you interact through ChatGPT, GPT-5 would dynamically decide how long to think.
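For reference, a minimal sketch of what that API-side selection looks like, assuming the openai Python client's Responses API; the prompt and the chosen effort value are illustrative:

    # Hedged sketch: with GPT-5 over the API, the caller picks the
    # reasoning effort up front rather than the model deciding dynamically.
    from openai import OpenAI

    client = OpenAI()
    resp = client.responses.create(
        model="gpt-5",
        reasoning={"effort": "low"},  # e.g. "minimal", "low", "medium", "high"
        input="Is a B-tree or a hash map better for range queries?",
    )
    print(resp.output_text)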
schmeichel•1h ago
Gemini 2.5 Pro is still my go-to LLM of choice. I haven't used any OpenAI product since it was released, and I don't see any reason why I should now.
mettamage•1h ago
Oh really? I'm more of a Claude fan. What makes you choose Gemini over Claude?

I use Gemini, Claude and ChatGPT daily still.

game_the0ry•1h ago
Could you elaborate on your exp? I have been using gemini as well and its been pretty good for me too.
hnuser123456•1h ago
Not GP, but I imagine because going back and forth to compare them is a waste of time if Gemini works well enough and ChatGPT keeps going through an identity crisis.
aerhardt•1h ago
I would use it exclusively if Google released a native Mac app.

I spend 75% of my time in Codex CLI and 25% in the Mac ChatGPT app. The latter is important enough for me to not ditch GPT and I'm honestly very pleased with Codex.

My API usage for software I build is about 90% Gemini though. Again their API is lacking compared to OpenAI's (productization, etc.) but the model wins hands down.

breppp•1h ago
I've installed it as a PWA on Mac and it pretty much solves it for me
baq•1h ago
I was you, except when I seriously tried gpt-5-high it turned out it is really, really damn good, if slow, sometimes unbearably so. It's a different mode of work; Gemini 2.5 needs more interactivity, whereas you can leave gpt-5 alone for a long time without even queueing a 'continue'.
joering2•1h ago
No matter how I tried, Google AI did not want to help me write an appeal brief responding to my ex-wife's lunatic 7-point argument, which 3 appellate lawyers quoted between $18,000 and $35,000 for. The last 3 decades of Google's scars and bruises from never-ending lawsuits, and the consequences of paying out billions in fines and fees, felt like reasonable hesitation on Google's part, compared to new-kid-on-the-block ChatGPT, which did not hesitate and did a pretty decent job (ex lost her appeal).
danudey•1h ago
AI not writing legal briefs for you is a feature, not a bug. There have been so many disastrous instances of lawyers using ChatGPT to write briefs, which it then hallucinates case law or precedent for, that I can only imagine Google wants to sidestep that entirely.

Anyway I found your response itself a bit incomprehensible so I asked Gemini to rewrite it:

"Google AI refused to help write an appeal brief response to my ex-wife's 7-point argument, likely due to its legal-risk aversion (billions in past fines). Newcomer ChatGPT provided a decent response instead, which led to the ex losing her appeal (saving $18k–$35k in lawyer fees)."

Not bad, actually.

timpera•18m ago
For some reason, Gemini 2.5 Pro seems to struggle a little with the French language. For example, it always uses title case even when it's wrong; yet ChatGPT, Claude, and Grok never make this mistake.
jasonjmcghee•1h ago
> We’re bringing both GPT‑5.1 Instant and GPT‑5.1 Thinking to the API later this week. GPT‑5.1 Instant will be added as gpt-5.1-chat-latest, and GPT‑5.1 Thinking will be released as GPT‑5.1 in the API, both with adaptive reasoning.
aliljet•1h ago
What we really desperately need is more context pruning from these LLMs: the ability to pull irrelevant parts out of the context window as a task comes into focus.
_boffin_•1h ago
Working on that. Hopefully I'll release it by week's end. I'll send you a message when ready.
ashton314•1h ago
Yay more sycophancy. /s

I cannot abide any LLM that tries to be friendly. Whenever I use an LLM to do something, I'm careful to include something like "no filler, no tone-matching, no emotional softening," etc. in the system prompt.

davidguetta•1h ago
WE DONT CARE HOW IT TALKS TO US, JUST WRITE CODE FAST AND SMART
netbioserror•1h ago
Who is "we"?
speedgoose•1h ago
David Guetta, but I didn't know he was also into software development.
astrange•1h ago
Personal requests are 70% of usage

https://www.nber.org/system/files/working_papers/w34255/w342...

cregaleus•1h ago
If you include API usage, personal requests are approximately 0% of total usage, rounded to the nearest percentage.
moralestapia•58m ago
Source: ...
cregaleus•56m ago
Refusal
B56b•20m ago
Oh you meant 0% of your usage, lol
MattRix•56m ago
I don't think this is true. ChatGPT has 800 million active weekly users.
cess11•53m ago
Are you sure about that?

"The share of Technical Help declined from 12% from all usage in July 2024 to around 5% a year later – this may be because the use of LLMs for programming has grown very rapidly through the API (outside of ChatGPT), for AI assistance in code editing and for autonomous programming agents (e.g. Codex)."

Looks like people moving to the API had a rather small effect.

"[T]he three most common ChatGPT conversation topics are Practical Guidance, Writing, and Seeking Information, collectively accounting for nearly 78% of all messages. Computer Programming and Relationships and Personal Reflection account for only 4.2% and 1.9% of messages respectively."

Less than five percent of requests were classified as related to computer programming. Are you really, really sure that like 99% of such requests come from people that are paying for API access?

cregaleus•22m ago
gpt-5.1 is a model. It is not an application, like ChatGPT. I didn't say that personal requests were 0% of ChatGPT usage.

If we are talking about a new model release I want to talk about models, not applications.

The number of input tokens that OpenAI models are processing across all delivery methods (OpenAI's own APIs, Azure) dwarfs the number of input tokens coming from people asking the ChatGPT app for personal advice. It isn't close.

url00•1h ago
I don't want a more conversational GPT. I want the _exact_ opposite. I want a tool with the upper limit of "conversation" being something like LCARS from Star Trek. This is quite disappointing as a current ChatGPT subscriber.
nathan_compton•1h ago
You can just tell the AI not to be warm and it will remember. My ChatGPT used the phrase "turn it up to eleven" and I told it never to speak in that manner ever again, and it's been very robotic ever since.
andai•1h ago
I system-prompted all my LLMs "Don't use cliches or stereotypical language." and they like me a lot less now.
water9•36m ago
They really like to blow sunshine up your ass, don’t they? I have to do the same type of stuff. It’s like I have to assure it that I’m a big boy and I can handle mature content, like programming in C.
pgsandstrom•1m ago
I added the custom instruction "Please go straight to the point, be less chatty". Now it begins every answer with: "Straight to the point, no fluff:" or something similar. It seems to be perfectly unable to simply write out the answer without some form of small talk first.
moi2388•1h ago
Same. If i tell it to choose A or B, I want it to output either “A” or “B”.

I don’t want an essay of 10 pages about how this is exactly the right question to ask

astrange•1h ago
LLMs have essentially no capability for internal thought. They can't produce the right answer without doing that.

Of course, you can use thinking mode and then it'll just hide that part from you.

LeifCarrotson•1h ago
10 pages about the question means that the subsequent answer is more likely to be correct. That's why they repeat themselves.
binary132•1h ago
citation needed
porridgeraisin•33m ago
First of all, consider asking "why's that?" if you don't know a fairly basic fact; no need to go all reddit-pretentious "citation needed" as if we were deeply and knowledgeably discussing some niche detail and came across a sudden surprising fact.

Anyways, a nice way to understand it is that the LLM needs to "compute" the answer to the question A or B. Some questions need more compute to answer (think complexity theory). The only way an LLM can do "more compute" is by outputting more tokens. This is because each token takes a fixed amount of compute to generate - the network is static. So, if you encourage it to output more and more tokens, you're giving it the opportunity to solve harder problems. Apart from humans encouraging this via RLHF, it was also found (in the DeepSeekMath paper) that RL+GRPO on math problems automatically encourages this (it increases sequence length).

From a marketing perspective, this is anthropomorphized as reasoning.

From a UX perspective, they can hide this behind thinking... ellipses. I think GPT-5 on chatgpt does this.
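To make the fixed-cost-per-token point concrete, here is a toy Python sketch (a hypothetical model interface, not any real library's API) of autoregressive decoding: one forward pass per generated token, so a longer answer is literally more compute spent on the problem.

    # Toy illustration with a hypothetical `model` object.
    def generate(model, prompt_tokens, max_new_tokens):
        tokens = list(prompt_tokens)
        for _ in range(max_new_tokens):
            # Each iteration costs roughly the same FLOPs, since the
            # network is static; more output tokens = more total compute.
            next_token = model.forward(tokens)
            tokens.append(next_token)
            if next_token == model.eos_token:  # model signals it's done
                break
        return tokens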

angrydev•1h ago
Exactly. Stop fooling people into thinking there’s a human typing on the other side of the screen. LLMs should be incredibly useful productivity tools, not emotional support.
glitchc•46m ago
Maybe there is a human typing on the other side, at least for some parts or all of certain responses. It's not been proven otherwise...
93po•29m ago
Food should only be for sustenance, not emotional support. We should only sell brown rice and beans, no more Oreos.
nikkwong•23m ago
The point the OP is making is that LLMs are not reliably able to provide safe and effective emotional support as has been outlined by recent cases. We're in uncharted territory and before LLMs become emotional companions for people, we should better understand what the risks and tradeoffs are.
halifaxbeard•29m ago
How would you propose we address the therapist shortage then?
93po•29m ago
something something bootstraps
nikkwong•25m ago
Who ever claimed there was a therapist shortage?
abeppu•20m ago
I think therapists in training, or people providing crisis intervention support, can train/practice using LLMs acting as patients going through various kinds of issues. But people who need help should probably talk to real people.
ahmeneeroe-v2•10m ago
outlaw therapy
cowpig•1h ago
I think they get way more "engagement" from people who use it as their friend, and the end goal of subverting social media and creating the most powerful (read: profitable) influence engine on earth makes a lot of sense if you are a soulless ghoul.
sofixa•1h ago
It would be pretty dystopian if we get to the point where ChatGPT pushes (unannounced) advertisements to those people (the ones forming a parasocial relationship with it). Imagine someone complaining they're depressed and ChatGPT proposing XYZ activity, which is actually a disguised ad.

Other than in such scenarios, that "engagement" would just be useless, actually costing them more money than it makes.

cowpig•1h ago
Do you have reason to believe they are not doing this already?
sofixa•51m ago
Not really, but with the amounts of money they're bleeding it's bound to get worse if they are already doing it.
water9•35m ago
No, otherwise Sam Altman wouldn’t have had a outburst about revenue. They know that they have this amazing system, but they haven’t quite figured out how to monetize it yet.
vunderba•1h ago
And utterly unsurprising given their announcement last month that they were looking at exploring erotica as a possible revenue stream.

[1] https://www.bbc.com/news/articles/cpd2qv58yl5o

Tiberium•1h ago
Are you aware that you can achieve that by going into Personalization in Settings and choosing one of the presets or just describing how you want the model to answer in natural language?
tekacs•1h ago
That's what the personality selector is for: you can just pick 'Efficient' (formerly Robot) and it does a good job of answering tersely?

https://share.cleanshot.com/9kBDGs7Q

bogtog•1h ago
Unfortunately, I also don't want other people to interact with a sycophantic robot friend, yet my picker only applies to my conversation
coolestguy•1h ago
Sorry that you can't control other people's lives & wants
alooPotato•1h ago
so good.
EGreg•58m ago
ChatGPT 5.2: allow others to control everything about your conversations. Crowd favorite!
DonaldPShimoda•49m ago
This is like arguing that we shouldn't try to regulate drugs because some people might "want" the heroin that ruins their lives.

The existing "personalities" of LLMs are dangerous, full stop. They are trained to generate text with an air of authority and to tend to agree with anything you tell them. It is irresponsible to allow this to continue while not at least deliberately improving education around their use. This is why we're seeing people "falling in love" with LLMs, or seeking mental health assistance from LLMs that they are unqualified to render, or plotting attacks on other people that LLMs are not sufficiently prepared to detect and thwart, and so on. I think it's a terrible position to take to argue that we should allow this behavior (and training) to continue unrestrained because some people might "want" it.

The_Rob•40m ago
Comparing LLM responses to heroin is insane.
yunohn•38m ago
You’re absolutely right!

The number of heroin addicts is significantly lower than the number of ChatGPT users.

simonw•38m ago
What's your proposed solution here? Are you calling for legislation that controls the personality of LLMs made available to the public?
andy99•27m ago
Pretty sure most of the current problems we see re drug use are a direct result of the nanny state trying to tell people how to live their lives. Forcing your views on people doesn’t work and has lots of negative consequences.
daveguy•6m ago
Okay, I'm intrigued. How in the fuck could the "nanny state" cause people to abuse heroin? Is there a reason other than "just cause it's my ideology".
kivle•27m ago
If only that worked for conversation mode as well. At least for me, and especially when it answers me in Norwegian, it will start off with all sorts of platitudes and whole sentences repeating exactly what I just asked. "Oh, so you want to do x, huh? Here is answer for x". It's very annoying. I just want a robot to answer my question, thanks.
pants2•1m ago
FWIW I didn't like the Robot / Efficient mode because it would give very short answers without much explanation or background. "Nerdy" seems to be the best, except with GPT-5 instant it's extremely cringy like "I'm putting my nerd hat on - since you're a software engineer I'll make sure to give you the geeky details about making rice."

"Low" thinking is typically the sweet spot for me - way smarter than instant with barely a delay.

sbuttgereit•1h ago
This. When I go to an LLM, I'm not looking for a friend, I'm looking for a tool.

Keeping faux relationships out of the interaction never lets me slip into the mistaken attitude that I'm dealing with a colleague rather than a machine.

gcau•1h ago
Yea, I don't want something trying to emulate emotions. I don't want it to even speak a single word; I just want code, unless I explicitly ask it to speak on something, and even in that scenario I want raw bullet points with concise, useful information and no fluff. I don't want to have a conversation with it.

However, being more humanlike, even if it results in an inferior tool, is the top priority because appearances matter more than actual function.

cmrdporcupine•1h ago
To be fair, of all the LLM coding agents, I find Codex+GPT5 to be closest to this.

It doesn't really offer any commentary or personality. It's concise and doesn't engage in praise or "You're absolutely right". It's a little pedantic though.

I keep meaning to re-point Codex at DeepSeek V3.2 to see if it's a product of the prompting only, or a product of the model as well.

Tiberium•11m ago
It is absolutely a product of the model, GPT-5 behaves like this over API even without any extra prompts.
cmrdporcupine•7m ago
I prefer its personality (or lack of it) over Sonnet's, and it tends to produce less... sloppy code. But it's far slower, and Codex + it suffers from context degradation very badly. If you run a session too long, even with compaction, it starts to really lose the plot.
jasonsb•59m ago
Engagement Metrics 2.0 are here. Getting your answer in one shot is not cool anymore. You need to waste as much time as possible on OpenAI's platform. Enshittification is now more important than AGI.
glouwbug•23m ago
Things really felt great 2023-2024
spaceman_2020•21m ago
This is the AI equivalent of every recipe blog filled with 1000 words of backstory before the actual recipe just to please the SEO Gods

The new boss, same as the old boss

egorfine•17m ago
Enable "Robot" personality. I hate all the other modes.
Szpadel•1h ago
Isn't it weird that there are no benchmarks included in this release?
qsort•1h ago
I was thinking the same thing. It's the first release from any major lab in recent memory not to feature benchmarks.

It's probably counterprogramming, Gemini 3.0 will drop soon.

bogtog•1h ago
For 5.1-thinking, they show that 90th-percentile-length conversations have 71% longer reasoning and 10th-percentile-length ones are 57% shorter
emp17344•32m ago
Probably because it’s not that much better than GPT-5 and they want to keep the AI train moving.
gsibble•1h ago
Cool. Now get to work!
cowpig•1h ago
Since Claude and OpenAI made it clear they will be retaining all of my prompts, I have mostly stopped using them. I should probably cancel my MAX subscriptions.

Instead I'm running big open source models and they are good enough for ~90% of tasks.

The main exceptions are Deep Research (though I swear it was better when I could choose o3) and tougher coding tasks (sonnet 4.5)

moi2388•1h ago
Source? You can opt out of training, and delete history, do they keep the prompts somehow?!
astrange•1h ago
It's not simply "training". What's the point of training on prompts? You can't learn the answer to a question by training on the question.

For Anthropic at least it's also opt-in not opt-out afaik.

impossiblefork•1h ago
I think the prompts might actually be really useful for training, especially for generating synthetic data.
cowpig•1h ago
1. Anthropic pushed a change to their terms where now I have to opt out or my data will be retained for 5 years and trained on. They have shown that they will change their terms, so I cannot trust them.

2. OpenAI is run by someone who has already shown he will go to great lengths to deceive and cannot be trusted, and they are embroiled in a battle with the New York Times that is "forcing them" to retain all user prompts. Totally against their will.

simonw•34m ago
The NYT situation concerning data retention was resolved a few weeks ago: https://www.engadget.com/ai/openai-no-longer-has-to-preserve...

> Federal judge Ona T. Wang filed a new order on October 9 that frees OpenAI of an obligation to "preserve and segregate all output log data that would otherwise be deleted on a going forward basis." [...]

> The judge in the case said that any chat logs already saved under the previous order would still be accessible and that OpenAI is required to hold on to any data related to ChatGPT accounts that have been flagged by the NYT.

EDIT: OK looks like I'd missed the news from today at https://openai.com/index/fighting-nyt-user-privacy-invasion/ and discussed here: https://news.ycombinator.com/item?id=45900370

tekacs•1h ago
I'm excited to see whether the instruction following improvements play out in the use of Codex.

The biggest issue I've seen _by far_ with using GPT models for coding has been their inability to follow instructions... and also their tendency to duplicate-act on messages from up-thread instead of acting on what you just asked for.

ewoodrich•54m ago
I've only had that happen when I use /compact, so I just avoid compacting altogether on Codex/Claude. No great loss and I'm extremely skeptical anyway that the compacted summary will actually distill the specific actionable details I want.
Someone1234•1h ago
Unfortunately no word on "Thinking Mini" getting fixed.

Before GPT-5 was released it used to be a perfect compromise between a "dumb" non-Thinking model and a SLOW Thinking model. However, something went badly wrong within the GPT-5 release cycle, and today it is exactly the same speed as (or SLOWER than) their Thinking model, even with Extended Thinking enabled, making it completely pointless.

In essence, Thinking Mini exists to be faster than Thinking but smarter than non-Thinking; instead it is dumber than full Thinking while not being faster.

simonw•36m ago
Which model are you talking about here?
admdly•20m ago
In my opinion, it's possible to infer from what has been said[1], and from the lack of a 5.1 “Thinking mini” version, that it has been folded into 5.1 Instant, with it now deciding when and how much to “think”. I also suspect 5.1 Thinking will be expected to dynamically adapt to fill that role somewhat, given the changes there.

[1] “GPT‑5.1 Instant can use adaptive reasoning to decide when to *think before responding*”

ravenical•1h ago
5.1 Instant is clearly aimed at the people using it for emotional advice etc, but I'm excited about the adaptive reasoning stuff - thinking models are great when you need them, but they take ages to respond sometimes.
ACCount37•1h ago
Despite all the attempts to rein in sycophancy in GPT-5, it was still way too fucking sycophantic as a default.

My main concern is that they're re-tuning it now to make it even MORE sycophantic, because 4o taught them that it's great for user retention.

nlh•1h ago
What's remarkable to me is how deep OpenAI is going on "ChatGPT as communication partner / chatbot", as opposed to Anthropic's approach of "Claude as the best coding tool / professional AI for spreadsheets, etc.".

I know this is marketing at play and OpenAI has plenty of resources devoted to advancing their frontier models, but it's starting to really come into view that OpenAI wants to replace Google and be the default app / page for everyone on earth to talk to.

Workaccount2•46m ago
OpenAI said that only ~4% of generated tokens are for programming.

ChatGPT is overwhelmingly, unambiguously, a "regular people" product.

9cb14c1ec0•41m ago
Yes, just look at the stats on OpenRouter. OpenAI has almost totally lost the programming market.
GaggiX•34m ago
OpenRouter probably doesn't mean much given that you can use the OpenAI API directly with the openai library that people use for OpenRouter too.
airstrike•37m ago
I mean, yes, but also because it's not as good as Claude today. Bit of a self fulfilling prophecy and they seem to be measuring the wrong thing.

4% of their tokens or total tokens in the market?

Workaccount2•6m ago
Their tokens, they released a report a few months ago.

However, I can only imagine that OpenAI outputs the most intentionally produced tokens (i.e. the user intentionally went to the app/website) out of all the labs.

mlsu•20m ago
I think this is because Anthropic has principles and OpenAI does not.

Anthropic seems to treat Claude like a tool, whereas OpenAI treats it more like a thinking entity.

In my opinion, the difference between the two approaches is huge. If the chatbot is a tool, the user is ultimately in control; the chatbot serves the user and the approach is to help the user provide value. It's a user-centric approach. If the chatbot is a companion on the other hand, the user is far less in control; the chatbot manipulates the user and the approach is to integrate the chatbot more and more into the user's life. The clear user-centric approach is muddied significantly.

In my view, that is kind of the fundamental difference between these two companies. It's quite significant.

adidoit•1h ago
I think OpenAI and all the other chat LLMs are going to face a constant battle to match personality with the general zeitgeist, and as the user base expands, the signal they get is increasingly distorted toward a blah median personality.

It's a form of enshittification perhaps. I personally prefer some of the GPT-5 responses compared to GPT-5.1. But I can see how many people prefer the "warmth" and cloying nature of a few of the responses.

In some sense personality is actually a UX differentiator. This is one way to differentiate if you're a start-up. Though of course OpenAI and the rest will offer several dials to tune the personality.

red2awn•1h ago
Holy em-dash fest in the examples, would have thought they'd augment the training dataset to reduce this behavior.
skrebbel•1h ago
FYI ChatGPT has a “custom instructions” setting in the personalization settings where you can ask it to lay off the idiotic insincere flattery. I recently added this:

> Do not compliment me for asking a smart or insightful question. Directly give the answer.

And I’ve not been annoyed since. I bet that whatever crap they layer on in 5.1 is undone as easily.

fragmede•1h ago
Also "Never apologize."
Terretta•1h ago
Note even today, negation doesn't work as well as affirmative direction.

"Do not use jargon", or, "never apologize", work less well than "avoid jargon" or "avoid apologizing".

Better to give it something to do than something that should be absent (same problem with humans: "don't think of a pink elephant").

See also target fixation: https://en.wikipedia.org/wiki/Target_fixation

Making this headline apropos:

https://www.cycleworld.com/sport-rider/motorcycle-riding-ski...

sethops1•1h ago
Is anyone else tired of chat bots? Really doesn't feel like typing a conversation every interaction is the future of technology.
bonesss•29m ago
Speech to text makes it feel more futuristic.

As does reflecting that Picard had to explain to Computer every, single, time that he wanted his Earl Grey tea ‘hot’. We knew what was coming.

namegulf•1h ago
Doesn't look like it is upgraded; it still shows GPT-5 in ChatGPT.

Anyone?

ximeng•1h ago
The screenshot of the personality selector for quirky has a typo - imaginitive for imaginative. I guess ChatGPT is not designing itself, yet.
JohnMakin•1h ago
It always boggles my mind when they put out conversation examples before/after patch and the patched version almost always seems lower quality to me.
wewtyflakes•1h ago
Aside from the adherence to the 6-word constraint example, I preferred the old model.
boldlybold•1h ago
Just set it to the "Efficient" tone, let's hope there's less pedantic encouragement of the projects I'm tackling, and less emoji usage.
Terretta•1h ago
As of 20 minutes in, most comments are about "warm". I'm more concerned about this:

> GPT‑5.1 Thinking: our advanced reasoning model, now easier to understand

Oh, right, I turn to the autodidact that's read everything when I want watered down answers.

AaronAPU•1h ago
It sounds patronizing to me.

But Gemini also likes to say things like “as a fellow programmer, I also like beef stew”

nalekberov•1h ago
It's hilarious that they use something about meditation as an example. That's not surprising after all; AI and meditation apps are sold as one-size-fits-all solutions for every modern-day problem.
I_am_tiberius•1h ago
The gpt5-pro model hasn't been updated I assume?
arthurcolle•1h ago
Nah they don't do that for the pro models
mrtesthah•1h ago
This thing sounds like Grok now. Gross.
llamasushi•1h ago
"Warmer and more conversational" - they're basically admitting GPT-5 was too robotic. The real tell here is splitting into Instant vs Thinking models explicitly. They've given up on the unified model dream and are now routing queries like everyone else (Anthropic's been doing this, Google's Gemini too).

Calling it "GPT-5.1 Thinking" instead of o3-mini or whatever is interesting branding. They're trying to make reasoning models feel less like a separate product line and more like a mode. Smart move if they can actually make the router intelligent enough to know when to use it without explicit prompting.

Still waiting for them to fix the real issue: the model's pathological need to apologize for everything and hedge every statement lol.

ipsum2•1h ago
I've been using GPT-5.1-thinking for the last month or so, it's been horrendous. It does not spend as much time thinking as GPT-5 does, and the results are significantly worse (e.g. obvious mistakes) and less technical. I suspect this is to save on inference compute.

I've temporarily switched back to o3, thankfully that model is still in the switcher.

knes•1h ago
Is this a mishap/leak? I don't see the model yet.
mritchie712•1h ago
When 4o was going through its ultra-sycophantic phase, I had a talk with it about Graham Hancock (Ancient Apocalypse, alt-history guy).

It agreed with everything Hancock claims with just a little encouragement ("Yes! Bimini road is almost certainly an artifact of Atlantis!")

gpt5 on the other hand will at most say the ideas are "interesting".

timpera•1h ago
I'm really disappointed that they're adding "personality" into the Thinking model. I pay my subscription only for this model, because it's extremely neutral, smart, and straight to the point.
pbiggar•1h ago
I've switched over to https://thaura.ai, which is working on being a more ethical AI. A side effect I hadn't realized is missing the drama over the latest OpenAI changes.
Workaccount2•15m ago
Get them to put out a statement of support for LGBTQ+ groups as well and I'll support them. Probably a hard sell to "ethical" people though...
dwa3592•1h ago
Altman is creating alternate man... Thank goodness, I cancelled my subscription after ChatGPT-5 was launched.
engeljohnb•1h ago
Seems like people here are pretty negative towards a "conversational" AI chatbot.

ChatGPT has a lot of frustrations and ethical concerns, and I hate the sycophancy as much as everyone else, but I don't consider being conversational to be a bad thing.

It's just preference I guess. I understand how someone who mostly uses it as a Google replacement or programming tool would prefer something terse and efficient. I fall into the former category myself.

But it's also true that I've dreamed about a computer assistant that can respond to natural language, even real-time speech -- and can imitate a human well enough to hold a conversation -- since I was a kid, and now it's here.

The questions of ethics, safety, propaganda, and training on other people's hard work are valid. It's not surprising to me that using LLMs is considered uncool right now. But having a computer imitate a human really effectively hasn't stopped being awesome to me personally.

I'm not one of those people who treats it like a friend or anything, but its ability to imitate natural human conversation is one of the reasons I like it.

qsort•18m ago
> I've dreamed about a computer assistant that can respond to natural language

When we dreamed about this as kids, we were dreaming about Data from Star Trek, not some chatbot that's been focus grouped and optimized for engagement within an inch of its life. LLMs are useful for many things and I'm a user myself, even staying within OpenAI's offerings, Codex is excellent, but as things stand anthropomorphizing models is a terrible idea and amplifies the negative effects of their sycophancy.

engeljohnb•7m ago
I didn't grow up watching Star Trek, so I'm pretty sure that's not my dream. I pictured something more like Computer from Dexter's Lab. It talks, it appears to understand, it even occasionally cracks jokes and gives sass, it's incredibly useful, but it's not at risk of being mistaken for a human.
thewebguyd•3m ago
Right. I want to be conversational with my computer, I don't want it to respond in a manner that's trying to continue the conversation.

Q: "Hey Computer, make me a cup of tea" A: "Ok. Making tea."

Not: Q: "Hey computer, make me a cup of tea" A: "Oh wow, what a fantastic idea, I love tea don't you? I'll get right on that cup of tea for you. Do you want me to tell you about all the different ways you can make and enjoy tea?"

isusmelj•55m ago
Are there any benchmarks? I didn’t find any. It would be the first model update without proof that it’s better.
TechRemarker•53m ago
Interesting: this seems "less" ideal. The problem lately for me is it being too verbose and conversational for things that need not be. I've added custom instructions, which helps, but there are still issues. Setting the chat style to "Efficient" more recently did help a lot, but it has been prone to many more hallucinations, requiring me to constantly ask if it is sure; it never responds confirming that yes, my latest statement is correct, ignores its previous error, and shows no sign that it will avoid a similar error further in the conversation. I wish I had a way to train my ChatGPT to avoid the mistakes it constantly repeats, but while adding "memories" helps with some things, it does not help with certain issues, since its programming overrides whatever memory I make for it. Hoping for some improvements in 5.1.
1970-01-01•52m ago
Speed, accuracy, cost.

Hit all 3 and you win a boatload of tech sales.

Hit 2/3, and hope you are incrementing where it counts. The competition watches your misses closer than your big hits.

Hit only 1/3 and you're going to lose to competition.

Your target for more conversations better be worth the loss in tech sales.

Faster? Meh. Doesn't seem faster.

Smarter? Maybe. Maybe not. I didn't feel any improvement.

Cheaper? It wasn't cheaper for me, I sure hope it was cheaper for you to execute.

agentifysh•50m ago
will GPT 5.1 make a difference in codex cli? surprised they didn't include any code related benchmarks for it.
xnx•49m ago
Google said in its quarterly call that Gemini 3 is coming this year. Hard to see how OpenAI will keep up.
gmuslera•43m ago
Is this the previous step to the "adult" version announced for next month?
simonw•40m ago
I went looking for the API details, but it's not there until "later this week":

> We’re bringing both GPT‑5.1 Instant and GPT‑5.1 Thinking to the API later this week. GPT‑5.1 Instant will be added as gpt-5.1-chat-latest, and GPT‑5.1 Thinking will be released as GPT‑5.1 in the API, both with adaptive reasoning.

water9•37m ago
I found ChatGPT-5 to be really pedantic in some of its arguments. Oftentimes its introductory sentence and thesis sentence would even contradict each other.
jstummbillig•33m ago
> We’re bringing both GPT‑5.1 Instant and GPT‑5.1 Thinking to the API later this week. GPT‑5.1 Instant will be added as gpt-5.1-chat-latest, and GPT‑5.1 Thinking will be released as GPT‑5.1 in the API, both with adaptive reasoning.

Sooo...

GPT‑5.1 Instant <-> gpt-5.1-chat-latest

GPT‑5.1 Thinking <-> GPT‑5.1

I mean, the shitty naming has to be a pathology or some sort of joke. You can't have put thought into that, come up with it, and think "yeah, absolutely, let's go with that!"
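For what it's worth, a hedged sketch of what that mapping would imply once the API release lands later this week; the model IDs come from the announcement, while the client and method are assumed from the standard openai Python library:

    # Assumed usage only; these endpoints aren't live as of this thread.
    from openai import OpenAI

    client = OpenAI()

    # "GPT-5.1 Instant" in ChatGPT -> "gpt-5.1-chat-latest" in the API
    instant = client.responses.create(model="gpt-5.1-chat-latest", input="hi")

    # "GPT-5.1 Thinking" in ChatGPT -> plain "gpt-5.1" in the API
    thinking = client.responses.create(model="gpt-5.1", input="hi")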

outside1234•32m ago
This model only loses $9B a quarter
AbraKdabra•18m ago
It's a fucking computer, I want results not a therapist.
Dilettante_•15m ago
>GPT‑5.1 Thinking’s responses are also clearer, with less jargon and fewer undefined terms

Oh yeah that's what I want when asking a technical question! Please talk down to me, call a spade an earth-pokey-stick and don't ever use a phrase or concept I don't know because when I come face-to-face with something I don't know yet I feel deep insecurity and dread instead of seeing an opportunity to learn!

But I assume their data shows that this is exactly how their core target audience works.

Better instruction-following sounds lovely though.

plufz•6m ago
I have added a ”language-and-tone.md” in my coding agents' docs to make them use less unnecessary jargon and fewer filler words. For me this change sounds good; I like my token count low and my agents' language short and succinct. I get what you mean, but I think AI text is often overfilled with filler jargon.

Example from my file:

### Mistake: Using industry jargon unnecessarily

*Bad:*

> Leverages containerization technology to facilitate isolated execution environments

*Good:*

> Runs each agent in its own Docker container

saaaaaam•12m ago
I’ve seen various older people that I’m connected with on Facebook posting screenshots of chats they’ve had with ChatGPT.

It’s quite bizarre from that small sample how many of them take pride in “baiting” or “bantering” with ChatGPT and then post screenshots showing how they “got one over” on the AI. I guess there’s maybe some explanation - feeling alienated by technology, not understanding it, and so needing to “prove” something. But it’s very strange and makes me feel quite uncomfortable.

Partly because of the “normal” and quite naturalistic way they talk to ChatGPT but also because some of these conversations clearly go on for hours.

So I think normies maybe do want a more conversational ChatGPT.

thewebguyd•7m ago
> So I think normies maybe do want a more conversational ChatGPT.

The backlash from GPT-5 proved that. The normies want a very different LLM from what you or I might want, and unfortunately OpenAI seems to be moving in a more direct-to-consumer focus and catering to that.

But I'm really concerned. People don't understand this technology at all. The way they talk to it, the suicide stories, etc. point to people in general not grokking that it has no real understanding or intelligence, and the AI companies aren't doing enough to educate (because why would they? They want you to believe it's superintelligence).

These overly conversational chatbots will cause real-world harm to real people. They should reinforce, over and over again to the user, that they are not human, not intelligent, and do not reason or understand.

It's not really the technology itself that's the problem, as is the case with a lot of these things; it's a people and education problem, something regulators are supposed to solve. But we aren't solving it; we have an administration that is very anti-AI-regulation, all in the name of "we must beat China."
