frontpage.

No @here on Slack

https://noathere.org/
1•jcmuller•1m ago•0 comments

Humans as Constancy Anchors: A Response to 'Something Big Is Happening'

1•mrev2•3m ago•0 comments

Show HN: ARA-Engine – Modeling the Alberta power grid transition in Python

https://github.com/ada33934/ARA-Engine
1•ada33934•4m ago•0 comments

The AI hater's guide to code with LLMs

https://aredridel.dinhe.net/2026/02/12/the-ai-haters-guide-to-code-with-llms/
2•speckx•7m ago•0 comments

Show HN: An MCP server that gives AI assistants a live Mermaid diagram canvas

https://github.com/iishyfishyy/mermaid-live-mcp
1•ishyfishyy•7m ago•0 comments

Ask HN: Are there examples of 3D printing data onto physical surfaces?

1•catapart•8m ago•0 comments

Fair Weather

https://fair-weather.query-farm.services
2•rustyconover•9m ago•1 comment

OpenAI GPT-5.3-Codex-Spark Now Running at 1K Tokens per Second on Cerebras Chips

https://www.servethehome.com/openai-gpt-5-3-codex-spark-now-running-at-1k-tokens-per-secondon-big...
1•rbanffy•10m ago•0 comments

Mars and Life

https://twitter.com/nasamars/status/2022001374154854471
1•paulpauper•11m ago•0 comments

Oracle vs. PostgreSQL – Row level and Column level security

https://hexacluster.ai/blog/row-level-and-column-level-security-oracle-vs-postgresql
1•avivallssa•11m ago•0 comments

Apple faces new tensions with Trump administration

https://www.ft.com/content/0c25de53-4668-4ddf-9e28-f8c4fc34940e
2•ksec•11m ago•1 comment

Japan's National Chip Startup Races to 2nm Mass Production

https://www.ai-supremacy.com/p/japans-national-chip-startup-races-2nm-rapidus
1•rbanffy•11m ago•0 comments

Show HN: Moatifi – Free Buffett-style moat analysis for stocks

https://moatifi.com
1•lldougl•12m ago•1 comment

What Happened with Bio Anchors?

https://www.astralcodexten.com/p/what-happened-with-bio-anchors
1•paulpauper•12m ago•0 comments

Strawmen and Worldview Solipsism

https://cognitivewonderland.substack.com/p/strawmen-and-worldview-solipsism
1•paulpauper•12m ago•0 comments

Generalized On-Policy Distillation with Reward Extrapolation

https://arxiv.org/abs/2602.12125
1•fzliu•14m ago•0 comments

NIMBYs Complained About Me to the State Bar. The State Bar Told Them to Get Lost

https://inpractice.yimbyaction.org/p/nimbys-complained-about-me-to-the
1•luckyducky99•14m ago•0 comments

Commerzbank Joins European Payment App Wero

https://www.marketscreener.com/news/commerzbank-joins-european-payment-app-wero-ce7e5adddd8bff21
1•toomuchtodo•16m ago•1 comment

Show HN: Tide Commander – Visual Agents Orchestrator for Claude Code and Codex

https://github.com/deivid11/tide-commander
1•deivid11•16m ago•0 comments

Molly Guard in Reverse

https://unsung.aresluna.org/molly-guard-in-reverse/
1•CharlesW•17m ago•0 comments

Virginia court allows Democrats' redistricting vote in plan to counter Trump

https://www.npr.org/2026/02/13/nx-s1-5711630/virginia-court-allows-democrats-redistricting-vote-i...
1•rbanffy•17m ago•0 comments

I built Fluxer, a Discord-like chat app

https://blog.fluxer.app/how-i-built-fluxer-a-discord-like-chat-app/
1•birdculture•18m ago•0 comments

I don't know what a function is. But I built a production SaaS anyway

https://articlefoundry.com
1•adambuildstuff•19m ago•2 comments

What have you been working on, and is AI replacing you?

2•l0new0lf-G•19m ago•0 comments

More macOS 26.3 Finder column view silliness

https://lapcatsoftware.com/articles/2026/2/4.html
2•ksec•20m ago•0 comments

Inlay – Make your website discoverable by AI agents

https://www.inlay.dev/
1•jferdizzle•23m ago•1 comment

The Quiet World (1998)

https://www.poetryfoundation.org/poems/49238/the-quiet-world
1•jihadjihad•23m ago•0 comments

I Chose Discourse over Discord

https://blog.discourse.org/2025/10/on-building-communities-in-public-why-i-chose-discourse-over-d...
1•speckx•23m ago•0 comments

The Professor of the Lower Senses

https://www.vittlesmagazine.com/p/the-professor-of-the-lower-senses
1•benbreen•25m ago•0 comments

Why America Never Got a Labor Party

https://jacobin.com/2026/02/labor-parties-social-democracy-american-exceptionalism/
1•PaulHoule•26m ago•1 comment

GPT-5.2 derives a new result in theoretical physics

https://openai.com/index/new-result-theoretical-physics/
167•davidbarker•1h ago

Comments

Insanity•1h ago
They also claimed ChatGPT solved novel Erdős problems when that wasn't the case. I'll take this with a grain of salt until there's more external validation. But very cool if true!
vonneumannstan•1h ago
Wasn't that like some marketing bro? This is coming out the front door with serious physicists attached.
famouswaffles•1h ago
Well, they (OpenAI) never made such a claim. And yes, LLMs have produced unique solutions/contributions to a few Erdős problems.
smokel•1h ago
How was that not the case? As far as I understand it, ChatGPT was instrumental in solving a problem. Even if it did not entirely solve it by itself, the combination with other tools such as Lean is still very impressive, no?
emil-lp•1h ago
It didn't solve it; it simply found that it had been solved in a publication and that the list of open problems hadn't been updated.
Davidzheng•1h ago
My understanding is that around 10 Erdős problems have been solved by GPT by now. Most of the solutions have been found to be either already in the literature, or a very similar problem was solved in the literature. But one or two solutions are quite novel.

https://github.com/teorth/erdosproblems/wiki/AI-contribution... may be useful

emp17344•23m ago
Some of these were initially hyped as novel solutions, and then were quietly downgraded after it was discovered the solutions weren’t actually novel.
crorella•1h ago
The preprint: https://arxiv.org/abs/2602.12176
vonneumannstan•1h ago
Interesting considering the Twitter froth recently about AI being incapable in principle of discovering anything.
baq•40m ago
Anything but recent.
pruufsocial•1h ago
All I saw was gravitons, and I thought: we're finally here, the singularity has begun.
Davidzheng•1h ago
"An internal scaffolded version of GPT‑5.2 then spent roughly 12 hours reasoning through the problem, coming up with the same formula and producing a formal proof of its validity."

When I use GPT 5.2 Thinking Extended, it gives me the impression that it's consistent enough/has a low enough rate of errors (or enough error-correcting ability) to autonomously do math/physics for many hours if it were allowed to [but I guess the Extended time cuts off around the 30-minute mark, and Pro maybe 1-2 hours]. It's good to see some confirmation of that impression here. I hope scientists/mathematicians at large will be able to play with tools which think at this time-scale soon and see how much capability these machines really have.

mmaunder•1h ago
Yes, and 5.3 with the latest Codex CLI client is incredibly good across compactions. Anyone know the methodology they're using to maintain state and manage context for a 12-hour run? It could be as simple as a single dense document and its own internal compaction algorithm, I guess.
knicholes•41m ago
https://developers.openai.com/cookbook/articles/codex_exec_p... might be what you're looking for
slopusila•26m ago
after those 30 min you can manually ask it again to continue working on the problem
outlace•1h ago
The headline may make it seem like AI just discovered some new result in physics all on its own, but reading the post, humans started off trying to solve some problem, it got complex, and GPT simplified it and found a solution with the simpler representation. It took 12 hours for GPT Pro to do this. In my experience LLMs can make new things when they are some linear combination of existing things, but I haven't been able to get them to do something totally out of distribution yet from first principles.
emil-lp•1h ago
"GPT did this". Authored by Guevara (Institute for Advanced Study), Lupsasca (Vanderbilt University), Skinner (University of Cambridge), and Strominger (Harvard University).

Probably not something that the average GI Joe would be able to prompt their way to...

I am skeptical until they show the chat log leading up to the conjecture and proof.

famouswaffles•1h ago
The paper has all those prominent institutions acknowledging the contribution, so realistically, why would you be skeptical?
kristopolous•58m ago
they probably also acknowledge pytorch, numpy, R ... but we don't attribute those tools as the agent who did the work.

I know we've been primed by sci-fi movies and comic books, but like pytorch, gpt-5.2 is just a piece of software running on a computer instrumented by humans.

famouswaffles•55m ago
I don't see the authors of those libraries getting a credit on the paper, do you?

>I know we've been primed by sci-fi movies and comic books, but like pytorch, gpt-5.2 is just a piece of software running on a computer instrumented by humans.

Sure

name_taken_duh•52m ago
And we are just a system running on carbon-based biology in our physics computer run by whomever. What makes us special, to say that we are different than GPT-5.2?
palmotea•33m ago
> And we are just a system running on carbon-based biology in our physics computer run by whomever. What makes us special, to say that we are different than GPT-5.2?

Do you really want to be treated like an old PC (dismembered, stripped for parts, and discarded) when your boss is done with you (i.e. not treated specially compared to a computer system)?

But I think if you want a fuller answer, you've got a lot of reading to do. It's not like you're the first person in the world to ask that question.

kristopolous•18m ago
It's always a value decision. You can say shiny rocks are more important than people and worth murdering over.

Not an uncommon belief.

Here you are saying you personally value a computer program more than people

It exposes a value that you personally hold and that's it

That is separate from the material reality that all this AI stuff is ultimately just computer software... It's an epistemological tautology in the same way that, say, a plane, a car, and a refrigerator are all just machines - they can break, need maintenance, take expertise, can be dangerous...

LLMs haven't broken the categorical constraints - you've just been primed to think such a thing is supposed to be different through movies and entertainment.

I hate to tell you but most movie AIs are just allegories for institutional power. They're narrative devices about how callous and indifferent power structures are to our underlying shared humanity

Refreeze5224•57m ago
Their point is, would you be able to prompt your way to this result? No. Already trained physicists working at world-leading institutions could. So what progress have we really made here?
famouswaffles•52m ago
It's a stupid point then. Are you able to work with a world-leading physicist to any significant degree? No.
emil-lp•15m ago
It's like saying: calculator drives new result in theoretical physics

(In the hands of leading experts.)

famouswaffles•6m ago
No, it's not like saying that at all, which is why OpenAI has a credit on the paper.
Sharlin•45m ago
I'm a big LLM sceptic but that's… moving the goalposts a little too far. How could an average Joe even understand the conjecture enough to write the initial prompt? Or do you mean that experts would give him the prompt to copy-paste, and hope that the proverbial monkey can come up with a Henry V? At the very least posit someone like a grad student in particle physics.
slopusila•31m ago
hey, GPT, solve this tough conjecture I've read about on Quanta. make no mistakes
co_king_3•20m ago
make no mistakes *please*
terminalbraid•10m ago
"Hey GPT thanks for the result. But is it actually true?"
buttered_toast•29m ago
I would interpret it as implying that the result was due to a lot more hand-holding than what is let on.

Was the initial conjecture based on leading info from the other authors or was it simply the authors presenting all information and asking for a conjecture?

Did the authors know that there was a simpler means of expressing the conjecture and lead GPT to its conclusion, or did it spontaneously do so on its own after seeing the hand-written expressions?

These aren't my personal views, but there is some handwaving about the process in such a way that it reads as if this was all spontaneous involvement on GPT's end.

But regardless, a result is a result so I'm content with it.

lamontcg•9m ago
That's kinda the whole point.

SpaceX can use an optimization algorithm to hoverslam a rocket booster, but the optimization algorithm didn't really figure it out on its own.

The optimization algorithm was used by human experts to solve the problem.

hgfda•14m ago
Lupsasca is at OpenAI:

https://lupsasca.com/

Certainly the result looks very much desired by an OpenAI researcher.

bpodgursky•1h ago
I don't want to be rude but like, maybe you should pre-register some statement like "LLMs will not be able to do X" in some concrete domain, because I suspect your goalposts are shifting without you noticing.

We're talking about significant contributions to theoretical physics. You can nitpick but honestly go back to your expectations 4 years ago and think — would I be pretty surprised and impressed if an AI could do this? The answer is obviously yes, I don't really care whether you have a selective memory of that time.

nozzlegear•1h ago
> We're talking about significant contributions to theoretical physics.

Whoever wrote the prompts and guided ChatGPT made significant contributions to theoretical physics. ChatGPT is just a tool they used to get there. I'm sure AI-bloviators and pelican bike-enjoyers are all quite impressed, but the humans should be getting the research credit for using their tools correctly. Let's not pretend the calculator doing its job as a calculator at the behest of the researcher is actually a researcher as well.

famouswaffles•1h ago
If this worked for 12 hours to derive the simplified formula along with its proof, then it guided itself and made significant contributions by any useful definition of the word, hence OpenAI having an author credit.
nozzlegear•55m ago
> hence OpenAI having an author credit.

How much precedent is there for machines or tools getting an author credit in research? Genuine question, I don't actually know. Would we give an author credit to e.g. a chimpanzee if it happened to circle the right page of a text book while working with researchers, leading them to a eureka moment?

famouswaffles•44m ago
>How much precedent is there for machines or tools getting an author credit in research?

Well, what do you think? Do the authors (or a single symbolic one) of pytorch or numpy or insert <very useful software> typically get credits on papers that utilize them heavily? Clearly these prominent institutions thought GPT's contribution significant enough to warrant an OpenAI credit.

>Would we give an author credit to e.g. a chimpanzee if it happened to circle the right page of a text book while working with researchers, leading them to a eureka moment?

Cool story. Good thing that's not what happened, so maybe we can do away with all these pointless non sequiturs, yeah? If you want to have a good faith argument, you're welcome to it, but if you're going to go on these nonsensical tangents, it's best we end this here.

nozzlegear•25m ago
> Well, what do you think? Do the authors (or a single symbolic one) of pytorch or numpy or insert <very useful software> typically get credits on papers that utilize them heavily?

I don't know! That's why I asked.

> Clearly these prominent institutions thought GPT's contribution significant enough to warrant an OpenAI credit.

Contribution is a fitting word, I think, and well chosen. I'm sure OpenAI's contribution was quite large, quite green and quite full of Benjamins.

> Cool story. Good thing that's not what happened, so maybe we can do away with all these pointless non sequiturs, yeah? If you want to have a good faith argument, you're welcome to it, but if you're going to go on these nonsensical tangents, it's best we end this here.

It was a genuine question. What's the difference between a chimpanzee and a computer? Neither are humans and neither should be credited as authors on a research paper, unless the institution receives a fat stack of cash I guess. But alas Jane Goodall wasn't exactly flush with money and sycophants in the way OpenAI currently is.

famouswaffles•18m ago
>I don't know! That's why I asked.

If you don't read enough papers to immediately realize it is an extremely rare occurrence, then what are you even doing? Why are you making comments like you have the slightest clue of what you're talking about? Including insinuating the credit was, what, the result of bribery?

You clearly have no idea what you're talking about. You've decided to accuse prominent researchers of essentially academic fraud with no proof because you got butthurt about a credit. You think your opinion on what should and shouldn't get credited matters? Okay.

I've wasted enough time talking to you. Good Day.

nozzlegear•6m ago
Do I need to be credentialed to ask questions or point out the troubling trend of AI dystopia maxxers like yourself helping Sam Altman and his cronies further the myth of AGI by pretending a machine is a researcher deserving of a research credit? This is marketing, pure and simple.
kuboble•41m ago
I have seen stuff like "you can use my program if you will make me a co-author".

That usually comes with some support.

floxy•27m ago
>How much precedent is there for machines or tools getting an author credit in research?

For a datum of one, the mathematician Doron Zeilberger gives credit to his computer Shalosh B. Ekhad on select papers.

https://medium.com/@miodragpetkovic_24196/the-computer-a-mys...

https://sites.math.rutgers.edu/~zeilberg/akherim/EkhadCredit...

https://sites.math.rutgers.edu/~zeilberg/pj.html

nozzlegear•13m ago
Interesting (and an interesting name for the computer too), thanks!
slopusila•27m ago
it's called ethics and research integrity. not crediting GPT would be a form of misrepresentation
nozzlegear•23m ago
Would it? I think there's a difference between "the researchers used ChatGPT" and "one of the researchers literally is ChatGPT." The former is the truth, and the latter is the misrepresentation in my eyes.

I have no problem with the former and agree that authors/researchers must note when they use AI in their research.

slopusila•20m ago
now you are debating exactly how GPT should be credited. idk, I'm sure the field will make up some guidance

for this particular paper it seems the humans were stuck, and only AI thinking unblocked them

nozzlegear•10m ago
> now you are debating exactly how GPT should be credited. idk, I'm sure the field will make up some guidance

In your eyes maybe there's no difference. In my eyes, big difference. Tools are not people, let's not further the myth of AGI or the silly marketing trend of anthropomorphizing LLMs.

steveklabnik•22m ago
Not exactly the same thing, but I know of at least two professors that would try to list their cats as co-authors:

https://en.wikipedia.org/wiki/F._D._C._Willard

https://en.wikipedia.org/wiki/Yuri_Knorozov

nozzlegear•9m ago
That is great, thank you!
bpodgursky•54m ago
If a helicopter drops someone off on the top of Mount Everest, it's reasonable to say that the helicopter did the work and is not just a tool they used to hike up the mountain.
nozzlegear•51m ago
Who piloted the helicopter in this scenario, a human or chatgpt? You'd say the pilot dropped them off in a helicopter. The helicopter didn't fly itself there.
bpodgursky•42m ago
“They have chosen cunning instead of belief. Their prison is only in their minds, yet they are in that prison; and so afraid of being taken in that they cannot be taken out.”

― C.S. Lewis, The Last Battle

RandomLensman•55m ago
I don't know enough about theoretical physics: what makes it a significant contribution there?
epolanski•31m ago
Not every contribution has immediate impact.
terminalbraid•25m ago
That doesn't answer the question. That statement just admits "maybe" which isn't helpful or insightful to answering it.
terminalbraid•23m ago
It's a nontrivial calculation valid for a class of forces (e.g. QCD) and apparently a serious simplification of a specific calculation that hadn't been completed before. But for what it's worth, I spent a good part of my physics career working in nucleon structure and have not run across the term "single minus amplitudes" in my memory. That doesn't necessarily mean much, as there's a very broad space in which work like this takes place, and some of it gets extremely arcane and technical.

One way I gauge the significance of a theory paper is by the measured quantities and physical processes it would contribute to. I see none discussed here, which should tell you how deep into math it is. I personally would not have stopped to read it on my arxiv catch-up

https://arxiv.org/list/hep-th/new

Maybe to characterize it better, physicists were not holding their breath waiting for this to get done.

RandomLensman•18m ago
Thank you!
outlace•40m ago
I never said LLMs will not be able to do X. I gave my summary of the article and my anecdotal experiences with LLMs. I have no LLM ideology. We will see what tomorrow brings.
CGMthrowaway•1h ago
This is the critical bit (paraphrasing):

Humans have worked out the amplitudes for integer n up to n = 6 by hand, obtaining very complicated expressions, which correspond to a “Feynman diagram expansion” whose complexity grows superexponentially in n. But no one has been able to greatly reduce the complexity of these expressions, providing much simpler forms. And from these base cases, no one was then able to spot a pattern and posit a formula valid for all n. GPT did that.

Basically, they used GPT to refactor a formula and then generalize it for all n. Then verified it themselves.

I think this was all already figured out in 1986 though: https://journals.aps.org/prl/abstract/10.1103/PhysRevLett.56... see also https://en.wikipedia.org/wiki/MHV_amplitudes
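For reference, the Parke–Taylor formula from that 1986 paper compresses the n-gluon MHV tree amplitude (two negative-helicity gluons i and j) into a single line, schematically (stripping coupling and momentum-conservation factors):

    A_n(1^+, \dots, i^-, \dots, j^-, \dots, n^+) \;\propto\;
      \frac{\langle i\,j\rangle^{4}}{\langle 1\,2\rangle \langle 2\,3\rangle \cdots \langle n\,1\rangle}

That's what makes the comparison apt: there, too, a Feynman-diagram expansion that grows superexponentially in n collapsed to a closed form valid for all n.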

ericmay•37m ago
Still pretty awesome though, if you ask me.
fsloth•27m ago
I think even a “non-intelligent” solver like Mathematica is cool - so hell yes, this is cool.
_aavaa_•7m ago
Big difference between “drives new result” and “reproduces something likely in its training dataset”.
woeirua•26m ago
You should probably email the authors if you think that's true. I highly doubt they didn't do a literature search first though...
emp17344•19m ago
You should be more skeptical of marketing releases like this. This is an advertisement.
btown•5m ago
It bears repeating that modern LLMs are incredibly capable, and relentless, at solving problems that have a verification test suite. It seems like this problem did (at least for some finite subset of n)!

This result itself does not generalize to open-ended problems, whether in business or in research in general. Discovering the specification to build is often the majority of the battle. LLMs aren't bad at this, per se, but they're nowhere near as reliably groundbreaking as they are on verifiable problems.
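The loop that exploits a verifier is simple. A minimal sketch, with hypothetical names: `llm_conjecture` samples a candidate formula, `verify` runs the finite-n test suite:

    def search_with_verifier(prompt, llm_conjecture, verify, attempts=100):
        """Propose-and-check: sample candidates until one passes the suite,
        feeding each failure back into the next prompt as a hint."""
        feedback = ""
        for _ in range(attempts):
            candidate = llm_conjecture(prompt + feedback)
            ok, error = verify(candidate)  # e.g. check the formula for n <= 6
            if ok:
                return candidate
            feedback = f"\n\nA previous attempt failed: {error}. Try again."
        return None

The verifier is what lets the model be "relentless": wrong candidates cost nothing but compute.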

bottlepalm•1h ago
Is every new thing not just a combination of existing things? What does out of distribution even mean? What advancement has ever been made that there wasn't a lead-up of prior work to? Is there some fundamental thing that prevents AI from recombining ideas and testing theories?
outlace•42m ago
For example, ever since the first GPT-4 I've tried to get LLMs to build me a specific type of heart simulation that to my knowledge does not exist anywhere on the public internet (otherwise I wouldn't try to build it myself), and even up to GPT 5.3 it still cannot do it.

But I’ve successfully made it build me a great Poker training app, a specific form that also didn’t exist, but the ingredients are well represented on the internet.

And I’m not trying to imply AI is inherently incapable, it’s just an empirical (and anecdotal) observation for me. Maybe tomorrow it’ll figure it out. I have no dogmatic ideology on the matter.

fpgaminer•31m ago
> Is every new thing not just combinations of existing things?

If all ideas are recombinations of old ideas, where did the first ideas come from? And wouldn't the complexity of ideas be thus limited to the combined complexity of the "seed" ideas?

I think it's more fair to say that recombining ideas is an efficient way to quickly explore a very complex, hyperdimensional space. In some cases that's enough to land on new, useful ideas, but not always. A) the new, useful idea might be _near_ the area you land on, but not exactly at it. B) there are whole classes of new, useful ideas that cannot be reached by any combination of existing "idea vectors".

Therefore there is still the necessity to explore the space manually, even if you're using these idea vectors to give you starting points to explore from.

All this to say: Every new thing is a combination of existing things + sweat and tears.

The question everyone has is, are current LLMs capable of the latter component. Historically the answer is _no_, because they had no real capacity to iterate. Without iteration you cannot explore. But now that they can reliably iterate, and to some extent plan their iterations, we are starting to see their first meaningful, fledgling attempts at the "sweat and tears" part of building new ideas.

verdverm•55m ago
They want to be seen alongside the big news from their competitors, so it looks less like they are falling behind. Let me know when they have a passing pre-training run again (apparently they haven't had one since Ilya left).
buttered_toast•49m ago
Absolutely no way this is true, right? Ilya left around the time 4o was released. I can't imagine they haven't had a single successful run since then.
verdverm•30m ago
When's the last time they talked about it?

I heard this from people who know more than me

buttered_toast•27m ago
Can't say, just seems implausible, but I am a nobody anyways ¯\_(ツ)_/¯
ctoth•46m ago
In my experience humans can make new things when they are some linear combination of existing things but I haven’t been able to get them to do something totally out of distribution yet from first principles[0].

[0]: https://slatestarcodex.com/2019/02/19/gpt-2-as-step-toward-g...

randomtoast•41m ago
> but I haven’t been able to get them to do something totally out of distribution yet from first principles

Can humans actually do that? Sometimes it appears as if we have made a completely new discovery. However, if you look more closely, you will find that many events and developments led up to the breakthrough, and that it is actually an improvement on something that already existed. We are always standing on the shoulders of giants.

dotancohen•37m ago
Relativity comes to mind.

You could nitpick a rebuttal, but no matter how many people you give credit, general relativity was a completely novel idea when it was proposed. I'd argue for special relativity as well.

johnfn•32m ago
Even if I grant you that, surely we’ve moved the goalposts a bit if we’re saying the only thing we can think of that AI can’t do is the life’s work of a man whose last name is literally synonymous with genius.
poplarsol•16m ago
That's not exactly true. Lorentz contraction is a clear antecedent to special relativity.
lamontcg•13m ago
Not really. Pretty sure I read recently that Newton appreciated that his theory was non-local and didn't like what Einstein later called "spooky action at a distance". The Lorentz transform was also known from 1887. Time dilation was understood from 1900. Poincaré figured out in 1905 that it was a mathematical group. Einstein put a bow on it all by figuring out that you could derive it from the principle of relativity and keeping the speed of light constant in all inertial reference frames.

I'm not sure about GR, but I know that it is built on the foundations of differential geometry, which Einstein definitely didn't invent (I think that's the source of his "I assure you whatever your difficulties in mathematics are, that mine are much greater" quote because he was struggling to understand Hilbert's math).

And really Cauchy, Hilbert, and those kinds of mathematicians I'd put above Einstein in building entirely new worlds of mathematics...

CooCooCaCha•21m ago
Depends on what you think is valid.

The process you’re describing is humans extending our collective distribution through a series of smaller steps. That’s what the “shoulders of giants” means. The result is we are able to do things further and further outside the initial distribution.

So it depends on if you’re comparing individual steps or just the starting/ending distributions.

tjr•12m ago
Go enough shoulders down, and someone had to have been the first giant.
pram•7m ago
Pythagoras is the turtle.
epolanski•33m ago
Serious question: I often hear about this "let the LLM cook for hours", but how do you do that in practice, and how does it manage its own context? How does it not get lost after so many tokens?
javier123454321•30m ago
From what I've seen, it's a process of compacting the session once it reaches some limit, which basically means summarizing all the previous work and feeding it as the initial prompt for the next session.
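A minimal sketch of that compaction loop (hypothetical, not any vendor's actual implementation; `llm` stands in for a plain completion call):

    def compact(history, llm):
        """Summarize the transcript so far into one dense state document."""
        return llm("Summarize all progress, open questions, and failed "
                   "approaches so far:\n" + "\n".join(history))

    def long_run(task, llm, max_chars=400_000, steps=1_000):
        history = [task]
        for _ in range(steps):
            history.append(llm("\n".join(history)))
            if sum(len(m) for m in history) > max_chars:
                # Compaction: replace the transcript with the original task
                # plus a dense summary, then continue in a fresh context.
                history = [task, compact(history, llm)]
        return history[-1]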
lovecg•29m ago
I’m guessing, and would love someone who has first-hand knowledge to comment. But my guess is it's some combination of trying many different approaches in parallel (each in a fresh context), then picking the one that works, and splitting up the task into sequential steps, where the output of one step is condensed and used as an input to the next step (with possibly human steering between steps).
amelius•25m ago
Just wait until LLMs are fast and cheap enough to be run in a breadth first search kind of way, with "fuzzy" pruning.
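That's essentially beam search: breadth-first expansion with soft pruning. A toy sketch, where `llm_propose` is a hypothetical call returning scored continuations:

    import heapq

    def fuzzy_bfs(problem, llm_propose, beam_width=8, depth=5):
        """Expand LLM continuations level by level, keeping only the
        most promising branches ("fuzzy" pruning by score)."""
        frontier = [(0.0, problem)]  # (cost so far, partial solution)
        for _ in range(depth):
            candidates = []
            for cost, state in frontier:
                # llm_propose(state) -> [(next_state, score), ...]
                for child, score in llm_propose(state):
                    candidates.append((cost - score, child))
            # Prune: discard all but the beam_width best branches.
            frontier = heapq.nsmallest(beam_width, candidates)
        return min(frontier)[1]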
snarky123•1h ago
So wait, GPT found a formula that humans couldn't, then the humans proved it was right? That's either terrifying or the model just got lucky. Probably the latter.
JasonADrury•1h ago
> found a formula that humans couldn't

"Couldn't" is an immensely high bar in this context; "didn't" seems more appropriate, and renders this whole thing slightly less exciting.

vessenes•46m ago
I'd say "couldn't in 20 hours" might be more defensible. Depends on how many humans though. "couldn't in 20 GPT watt-hours" would give us like 2,000 humans or so.
brcmthrowaway•1h ago
End times approach...
elashri•1h ago
Of all particle physics concepts, I would be less interested in scattering amplitudes as a test case, because the scattering amplitude has one of the most concise definitions and its solution is straightforward (not easy, of course). So once you have a good grasp of QM and of scattering, it is a matter of applying your knowledge of math to solve the problem. Usually the real problem is to actually define your parameters from your model and define the tree-level calculations. For an LLM to solve these is impressive, but the researchers defined everything and came up with the workflow.

So I would read this (with more information available) with less emphasis on the LLM discovering a new result. The title is a little bit misleading, but "derives" is the operative word here, so it would be technically correct for people in the field.

longfacehorrace•51m ago
Car manufacturers need to step up their hype game...

New Honda Civic discovers Pacific Ocean!

New F150 discovers Utah Salt Flats!

Sure it took humans engineering and operating our machines, but the car is the real contributor here!

nilkn•49m ago
It would be more accurate to say that humans using GPT-5.2 derived a new result in theoretical physics (or, if you're being generous, humans and GPT-5.2 together derived a new result). The title makes it sound like GPT-5.2 produced a complete or near-complete paper on its own, but what it actually did was take human-derived datapoints, conjecture a generalization, then prove that generalization. Having scanned the paper, this seems to be a significant enough contribution to warrant a legitimate author credit, but I still think the title on its own is an exaggeration.
gaigalas•43m ago
I like the use of the word "derives". However, it gets outshined by "new result" in public eyes.

I expect lots of derivations (new discoveries whose pieces were already in place somewhere, but no one had put them together).

In this case, the human authors did the thinking and also used the LLM, but this could happen without the original human author too (some guy posts a partial result on the internet, no one realizes it is novel knowledge, and it gets reused by AI later). It would be tremendously nice if credit were kept in such scenarios.

baalimago•41m ago
Well, anyone can derive a new result in anything. The question is most often whether the result makes any sense.
square_usual•32m ago
It's interesting to me that whenever a new breakthrough in AI use comes up, there's always a flood of people who come in to handwave away why this isn't actually a win for LLMs. Like with the novel solutions GPT 5.2 has been able to find for Erdős problems - many users here (even in this very thread!) think they know more about this than Fields medalist Terence Tao, who maintains this list showing that, yes, LLMs have driven these proofs: https://github.com/teorth/erdosproblems/wiki/AI-contribution...
epolanski•22m ago
It's an obvious tension created by the title.

The reality is: "GPT 5.2, after crunching mathematical formulas for 12 hours, supervised and prompted by 4 experts in the field", which would be nice and interesting per se.

But the title creates a much bigger expectation.

I wouldn't be surprised if you gave an LLM some of the thousands of algorithms we use, and, with proper prompting from experts in the field guiding it through the crunching, it found a version that works better for bigger or smaller numbers.

lovecg•17m ago
Let’s have some compassion, a lot of people are freaking out about their careers now and defense mechanisms are kicking in. It’s hard for a lot of people to say “actually yeah, this thing can do most of my work now, and the barrier to entry has dropped to the ground”.
loire280•12m ago
It's easy to fall into a negative mindset when there are legions of pointy haired bosses and bandwagoning CEOs who (wrongly) point at breakthroughs like this as justification for AI mandates or layoffs.
hgfda•11m ago
It is not only the peanut gallery that is skeptical:

https://www.math.columbia.edu/~woit/wordpress/?p=15362

Let's wait a couple of days to see whether there has been a similar result in the literature.

ares623•31m ago
I guess the important question is, is this enough news to sustain OpenAI long enough for their IPO?
danny_codes•21m ago
Well it’ll be at least a whole month before some other company announces similar capability. The moat will hold!
emp17344•30m ago
Cynically, I wonder if this was released at this time to ward off any criticism from the failure of LLMs to solve the 1stproof problems.
vbarrielle•23m ago
I'm far from being an LLM enthusiast, but this is probably the right use case for this technology: conjectures which are hard to find, but whose proofs can then be checked with automated theorem provers. Isn't that what AlphaProof does, by the way?
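A toy illustration of that division of labor in Lean: the conjecture and proof script can come from anywhere (human or LLM), but the kernel checks them mechanically either way:

    -- Trivial example: a statement proven by induction and verified by
    -- Lean's kernel. (Core already has Nat.zero_add; the prime avoids a clash.)
    theorem zero_add' (n : Nat) : 0 + n = n := by
      induction n with
      | zero => rfl
      | succ k ih => rw [Nat.add_succ, ih]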
mrguyorama•11m ago
Don't lend much credence to a preprint. I'm not insinuating fraud, but plenty of preprints turn out to be "Actually you have a math error here", or are retracted entirely.

Theoretical physics is throwing a lot of stuff at the wall and theory crafting to find anything that might stick a little. Generation might actually be good there, even generation that is "just" recombining existing ideas.

I trust physicists and mathematicians to mostly use tools because they provide benefit, rather than because they are in vogue. I assume they were approached by OpenAI for this, but glad they found a way to benefit from it. Physicists have a lot of experience teasing useful results out of probabilistic and half broken math machines.

If LLMs end up being solely tools for exploring some symbolic math, that's a real benefit. Wish it didn't involve destroying all progress on climate change, platforming truly evil people, destroying our economy, exploiting already disadvantaged artists, destroying OSS communities, enabling yet another order of magnitude increase in spam profitability, destroying the personal computer market, stealing all our data, sucking the oxygen out of investing into real industry, and bold faced lies to all people about how these systems work.

Also, last I checked, MATLAB wasn't a trillion dollar business.

PlatoIsADisease•8m ago
I'll read the article in a second, but let me guess ahead of time: Induction.

Okay, read it: yep, induction. It already had the answer.

Don't get me wrong, I love Induction... but we aren't having any revolutions in understanding with Induction.