"Sam Altman predicts AGI in 2025" - https://christiankromme.nl/sam-altman-predicts-agi-in-2025/
"Sam Altman says ‘we know how to build AGI’" - https://www.theverge.com/2025/1/6/24337106/sam-altman-says-o...
I had no idea this many people were so attached to an LLM. This sounds absolutely terrible.
Those decisions are steered by costs, so it will choose the cheapest (worst) model unless compelled to do otherwise.
Cursor quietly added this type of routing months ago, referring to it only as "automatic" model selection. At the same time, they moved that product to price tiers much more in line with those announced for GPT-5.
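Roughly, that kind of cost-biased selection might look like the sketch below. The model names, prices, and capability scores are invented for illustration; neither Cursor nor OpenAI has published how their selection actually works.

    # Toy sketch of cost-biased model selection: pick the cheapest model whose
    # (assumed) capability clears the task's bar, unless the user forces one.
    # All names and numbers here are hypothetical.
    MODELS = [
        {"name": "mini", "usd_per_mtok": 1.0,  "capability": 0.60},
        {"name": "base", "usd_per_mtok": 10.0, "capability": 0.80},
        {"name": "pro",  "usd_per_mtok": 60.0, "capability": 0.95},
    ]

    def pick_model(required_capability: float, force: str = "") -> str:
        """Cheapest model that clears the bar, unless the caller compels one."""
        if force:
            return force
        eligible = [m for m in MODELS if m["capability"] >= required_capability]
        return min(eligible, key=lambda m: m["usd_per_mtok"])["name"]

    print(pick_model(0.7))               # -> "base" (cheapest that qualifies)
    print(pick_model(0.7, force="pro"))  # -> "pro"  (user override)

The point is just that, absent an override, the optimization target is cost, not quality.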
A_D_E_P_T•6mo ago
Also, GPT-5 pro is a lot slower than o3-pro. My two most recent queries took 17 and 18 minutes, whereas o3-pro would probably have taken 4-5.
Surprisingly, at generic prose-writing tasks (e.g. composing a poem in the style of X), GPT-5 is still noticeably inferior to DeepSeek and Kimi.
Honestly, I'm tempted to cancel my $200/month subscription.
This probably means that we're in the "slow AI / stagnation" timeline, so at least we're not going to get paperclipped by 2027.
NitpickLawyer•6mo ago
It doesn't have to be better, it has to be "as close to those as possible", while being cost-efficient to run and serve at scale.
> This probably means that we're in the "slow AI / stagnation" timeline
I'd say we're more in the "ok, we got the capabilities in huge models, now we need to make them smaller, faster and more scalable" timeline. If they capture ~80-90% of the capabilities of their strongest models while reducing costs a lot (they've gone from $40-60/Mtok to $10/Mtok), then they're starting to approach a break-even point and can slowly make money off serving tokens.
There's also a move towards having specialised models (code w/ claude, long context w/ gemini, etc), and oAI seem to have gone in this direction as well. They've said for a long time that gpt5 would be a "systems" update and not necessarily a core model update. That is, they have a routing model that takes a query and routes it towards the best model for the task. Once devs figure out how to use this to their advantage, the "vibes" will improve.
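As a toy sketch of what such a router could look like (the model names and heuristics below are made up; OpenAI hasn't published how GPT-5's routing actually decides):

    # Hypothetical query router: classify the task, then dispatch to a
    # specialised model. A real system would use a learned classifier;
    # this keyword version is only for illustration.
    ROUTES = {
        "code":         "model-code",       # hypothetical code-tuned model
        "long_context": "model-long-ctx",   # hypothetical long-context model
        "reasoning":    "model-reasoning",  # hypothetical slow/strong model
        "default":      "model-fast",       # hypothetical cheap, fast model
    }

    def classify(query: str, context_tokens: int) -> str:
        """Crude stand-in for a learned routing model."""
        if context_tokens > 100_000:
            return "long_context"
        if "```" in query or "def " in query or "stack trace" in query.lower():
            return "code"
        if any(w in query.lower() for w in ("prove", "step by step", "why")):
            return "reasoning"
        return "default"

    def route(query: str, context_tokens: int = 0) -> str:
        return ROUTES[classify(query, context_tokens)]

    print(route("Why does this def foo() raise KeyError?"))  # -> model-code

The interesting part for devs is that the router, not the user, ends up deciding which model answers, which is why prompt phrasing can swing the "vibes" so much.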
energy123•6mo ago
A_D_E_P_T•6mo ago
I can confirm in head-to-head tests that Kimi is a far better prose stylist. I asked both models to write me a poem in the style of Ezra Pound's Canto I. Kimi's one-shot result was excellent; an "A+" effort at any school in the world, and genuinely professional-poet-tier. (Rather frightening, that.) GPT-5-Pro's result was a disaster so bad it verged on parody, and, to add insult to injury, it sometimes plagiarized Pound's Canto I word-for-word.
I will say that GPT-5 seems a little bit more imaginative and inventive than previous models. It seems slightly better than all other models at formal logic, to the extent that it's a superhuman analytic philosopher. It's also better and faster at searching the web, and it's a little bit more circumspect than usual about the results it shares. (It doesn't blindly promote every product.)
Ultimately this seems like a very incremental upgrade, but it's an upgrade nonetheless.
energy123•6mo ago