"Sam Altman predicts AGI in 2025" - https://christiankromme.nl/sam-altman-predicts-agi-in-2025/
"Sam Altman says ‘we know how to build AGI’" - https://www.theverge.com/2025/1/6/24337106/sam-altman-says-o...
I had no idea this many people were so attached to an LLM. This sounds absolutely terrible.
Those decisions are steered by costs, so it will choose the cheapest (worst) model unless compelled to do otherwise.
Cursor quietly added this type of routing months ago, referring to it only as "automatic" model selection. At the same time, they moved that product to price tiers much more in line with those announced for GPT-5.
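Roughly, that kind of cost-biased selection might look like the sketch below. The model names, prices, and capability scores are invented for illustration; neither Cursor nor OpenAI has published how their selection actually works.

    # Toy sketch of cost-biased model selection: pick the cheapest model whose
    # (assumed) capability clears the task's bar, unless the user forces one.
    # All names and numbers here are hypothetical.
    MODELS = [
        {"name": "mini", "usd_per_mtok": 1.0,  "capability": 0.60},
        {"name": "base", "usd_per_mtok": 10.0, "capability": 0.80},
        {"name": "pro",  "usd_per_mtok": 60.0, "capability": 0.95},
    ]

    def pick_model(required_capability: float, force: str = "") -> str:
        """Cheapest model that clears the bar, unless the caller compels one."""
        if force:
            return force
        eligible = [m for m in MODELS if m["capability"] >= required_capability]
        return min(eligible, key=lambda m: m["usd_per_mtok"])["name"]

    print(pick_model(0.7))               # -> "base" (cheapest that qualifies)
    print(pick_model(0.7, force="pro"))  # -> "pro"  (user override)

The point is just that, absent an override, the optimization target is cost, not quality.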
A_D_E_P_T•6mo ago
Also, GPT-5 pro is a lot slower than o3-pro. My two most recent queries took 17 and 18 minutes, whereas o3-pro would probably have taken 4-5.
Surprisingly, at generic prose-writing tasks (e.g. composing a poem in the style of X), GPT-5 is still noticeably inferior to DeepSeek and Kimi.
Honestly, I'm tempted to cancel my $200/month subscription.
This probably means that we're in the "slow AI / stagnation" timeline, so at least we're not going to get paperclipped by 2027.
NitpickLawyer•6mo ago
It doesn't have to be better, it has to be "as close to those as possible", while being cost-efficient to run and serve at scale.
> This probably means that we're in the "slow AI / stagnation" timeline
I'd say we're more in the "ok, we got the capabilities in huge models, now we need to make them smaller, faster and more scalable" timeline. If they capture ~80-90% of the capabilities of their strongest models while reducing costs a lot (they've gone from $40-60/Mtok to $10/Mtok), then they're starting to approach a break-even point and can slowly make money off serving tokens.
There's also a move towards having specialised models (code w/ claude, long context w/ gemini, etc), and oAI seem to have gone in this direction as well. They've said for a long time that gpt5 would be a "systems" update and not necessarily a core model update. That is, they have a routing model that takes a query and routes it towards the best model for the task. Once devs figure out how to use this to their advantage, the "vibes" will improve.
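As a toy sketch of what such a router could look like (the model names and heuristics below are made up; OpenAI hasn't published how GPT-5's routing actually decides):

    # Hypothetical query router: classify the task, then dispatch to a
    # specialised model. A real system would use a learned classifier;
    # this keyword version is only for illustration.
    ROUTES = {
        "code":         "model-code",       # hypothetical code-tuned model
        "long_context": "model-long-ctx",   # hypothetical long-context model
        "reasoning":    "model-reasoning",  # hypothetical slow/strong model
        "default":      "model-fast",       # hypothetical cheap, fast model
    }

    def classify(query: str, context_tokens: int) -> str:
        """Crude stand-in for a learned routing model."""
        if context_tokens > 100_000:
            return "long_context"
        if "```" in query or "def " in query or "stack trace" in query.lower():
            return "code"
        if any(w in query.lower() for w in ("prove", "step by step", "why")):
            return "reasoning"
        return "default"

    def route(query: str, context_tokens: int = 0) -> str:
        return ROUTES[classify(query, context_tokens)]

    print(route("Why does this def foo() raise KeyError?"))  # -> model-code

The interesting part for devs is that the router, not the user, ends up deciding which model answers, which is why prompt phrasing can swing the "vibes" so much.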
energy123•6mo ago
A_D_E_P_T•6mo ago
I can confirm in head-to-head tests that Kimi is a far better prose stylist. I asked both models to write me a poem in the style of Ezra Pound's Canto I. Kimi's one-shot result was excellent; an "A+" effort at any school in the world, and genuinely professional-poet-tier. (Rather frightening, that.) GPT-5-Pro's result was a disaster so bad it verged on parody, and, to add insult to injury, it sometimes plagiarized Pound's Canto I word-for-word.
I will say that GPT-5 seems a little bit more imaginative and inventive than previous models. It seems slightly better than all other models at formal logic, to the extent that it's a superhuman analytic philosopher. It's also better and faster at searching the web, and it's a little bit more circumspect than usual about the results it shares. (It doesn't blindly promote every product.)
Ultimately this seems like a very incremental upgrade, but it's an upgrade nonetheless.
energy123•6mo ago