OpenAI researcher announced GPT-5 math breakthrough that never happened

https://the-decoder.com/leading-openai-researcher-announced-a-gpt-5-math-breakthrough-that-never-happened/

79•Topfi•1h ago

Comments

amelius•37m ago

> Summary (from the article)

* OpenAI researchers claimed or suggested that GPT-5 had solved unsolved math problems, but in reality, the model only found known results that were unfamiliar to the operator of erdosproblems.com.

* Mathematician Thomas Bloom and Deepmind CEO Demis Hassabis criticized the announcement as misleading, leading the researchers to retract or amend their original claims.

* According to mathematician Terence Tao, AI models like GPT-5 are currently most helpful for speeding up basic research tasks such as literature review, rather than independently solving complex mathematical problems.

jgalt212•35m ago

After the circular financing schemes involving hundreds of billions of dollars were uncovered, nothing I read about the AI business and its artificial hype machine surprises me anymore.

bbor•34m ago

This is just tit-for-tat clickbait. The researcher’s wording was a bit unclear for sure, but far from incorrect.

resoluteteeth•28m ago

I disagree. There is no way to interpret "GPT-5 just found solutions to 10 (!) previously unsolved Erdos problems" as saying something other than GPT-5 having solved them.

If it just found existing solutions then they obviously weren't "previously unsolved" so the tweet is wrong.

He clearly misunderstood the situation and jumped to the conclusion that GPT-5 had actually solved the problems because that's what he wanted to believe.

That said, the misunderstanding is understandable because the tweet he was responding to said they had been listed as "open", but solving unsolved erdos problems by itself would be such a big deal that he probably should have double checked it.

ayhanfuat•28m ago

How is “GPT-5 found solutions to 10 previously unsolved Erdös problems” far from incorrect?

andrewstuart•30m ago

Humans hallucinating about AI.

MattGaiser•2m ago

Humans "hallucinate" in the AI way constantly, which is why I don't see them as a barrier to LLMs replacing humans in many contexts. It really isn't unusual for a human to make stuff up or be unaware of stuff.

random9749832•2m ago

Best case: Hallucination

Worst case (more probable): Lying

mentalgear•27m ago

Another instance of openAI manipulating results to prolong their unsustainable circular hype bubble.

The inevitable collapse could be even more devastating than the 2008 financial crisis.

All while so vast resources are being wasted on non-verifiable gen AI slob, while real approaches (neuro-symbolic like DeepMind's AlphaFold) are mostly ignored financially because they don't generate the quick stock market increases that hype does.

the_duke•16m ago

People keep spouting this, but I don't see how the AI bubble bursting would be all that devastating.

2008 was a systemic breakdown rippling through the foundations of the financial system.

It would lead to a market crash (80% of gains this year were big tech/AI) and likely a full recession in the US, but nothing nearly as dramatic as a global systemic crisis.

In contrast to the dot com bubble, the huge AI spending is also concentrated on relatively few companies, many with deep pockets from other revenue sources (Google, Meta, Microsoft, Oracle), and the others are mostly private companies that won't have massive impact on the stock market.

A sudden stop in AI craze would be hard for hardware companies and a few big AI only startups , but the financial fallout would be much more contained than either dot com or 2008.

Topfi•7m ago

Isn't the dot com bubble a far better proxy? Notably, todays spending is both higher and more concentrated in a few companies that a large part of the population has exposure to (most dot com companies weren't publicly traded and far smaller vs MSFT, Alphabet, Meta, Oracle, NVDA making up most investment today) by way of pension funds, ETFs, etc.?

kif•26m ago

This honestly doesn’t surprise me. We have reached a point where it’s becoming clearer and clearer that AGI is nowhere to be seen, whereas advances in LLM ability to ‘reason’ have slowed down to (almost?) a halt.

dawnerd•8m ago

But if you ask an AI hype person they’ll say we’re almost there we just need a bit more gigawatts of compute!

vbezhenar•7m ago

In my book, chat-based AGI has been reached years ago, when I couldn't reliably distinguish computer from human.

Solving problems that humanity couldn't solve is super-AGI or something like that. It's not there indeed.

steveBK123•2m ago

Hence the pivot into ads, shop-in-chat and umm.. adult content.

Analemma_•10m ago

“AGI achieved internally”

Another case of culture flowing from the top I guess.

strangescript•6m ago

This entire thing has been pretty disingenuous on both sides of the fence. All the anti-AI (or anti OpenAI) people are doing victory laps, but what GPT-5 Pro did is still very valuable.

1) What good is your open problem set if really its a trivial "google search" away from being solved. Why are they not catching any blame here?

2) These answers still weren't perfectly laid out for the most part. GPT-5 was still doing some cognitive lifting to piece it together.

If a human would have done this by hand it would have made news and instead the narrative would have been inverted to ask serious questions about the validity of some these style problem sets and/or ask the question how many other solutions are out there that just need pieced together from pre-existing research.

But, you know, AI Bad.

nurettin•1m ago

AI great, but AI not creative, yet.

random9749832•3m ago

You are telling me a language model trained on Reddit can't solve novel problems? Shocking.

Teen sues to destroy the nudify app that left her in constant fear

Pathmetrics

Feed me up, Scotty – custom RSS feed generation using CSS selectors

Cyborgs vs. rooms, two visions for the future of computing

Geometric Kernel for Digital Manufacturing

Warning: Gmail client Show Original can omit lines of the original

A statistic is as a statistic does

Warning: Gmail client can misrepresent bounce message content

I invited strangers to message me through a receipt printer

Extrapolating Quantum Factoring

Prefer TF Module Composition over Inheritance

'How Life Works' by Philip Ball: Beware metaphorical systems thinking in biology

I Tried to Toughen Up My Son. Things Didn't Go as Planned

The Planets, Personified

Update on Plans for Privacy Sandbox Technologies (Google Chrome)

The MCP Server?

Deck Dialogues: How Alex Transformed His UConsole into a Botnet Simulator

The macOS LC_COLLATE hunt: Or why does sort order differently on macOS and Linux

Single-Step Electrochemical Battery Recycling

Jason Wei on 3 Key Ideas in AI in 2025 [video]

Is MCP authentication that complicated?

Ask HN: Will Biotech Have Its "Arduino Moment"?

Show HN: Syna – Minimal ML and RL Framework Built from Scratch with NumPy

Fixing AWS Architecture Diagrams: AI Document Processing

The Beauty of Batteries

A Tower on Billionaires' Row Is Full of Cracks. Who's to Blame?

Finland proposes €200B Russian asset transfer to Ukraine

Ask HN: Why put a HAProxy in front of Nginx?

Anthropic's Jack Clark is drawing White House ire

Creating an Igcse Pseudocode Interpreter