Well, this definitely won't make the front page despite upvotes. Too many people here are heavily invested and cannot stand for it.
ctrl-F llm
https://i.imgur.com/D5K6b9c.png
Not this story.
edit: Hi fragmede, I'd reply but Nazi @dang says I'm posting too fast again. I guess you won't be working on that fully free chromebook hardware either. I'll keep trying anyway. I'm not motivated by money like you guys. Keep chasing that carrot however.
Eh... I don't know. It's hard for me to believe such absolutist takes. Especially since other proeminent mathematicians (i.e. Tao) have spoken highly of the help and value they get from these systems.
I also see this kind of takes in my field. Especially from "senior" people, 20+ years of experience. The problem is that, when pressed, their "trying it out" is often the most basic, naive, technically atrocious type of testing. "I tried coding with it but it's useless" -> they tried once, 3 years ago, in a chatgpt web interface, and never touched it since.
I think there's a lot of disconnect between the overhypers and deniers. Both are wrong, IMO, and unfortunately both are extremely vocal while also being stubborn and unwilling to try different things. But my gut feeling is that if in 2025/26 someone says "basically zero" they are wrong. Or naive. Or make strawman arguments. And ignorable.
It has still been useful, because parts of the problem have been solved before, and in some cases it can accurately reproduce those parts, which has been massively helpful. And it can help me understand some aspects of the problem. But it can't solve it for me. It has shown that that's simply beyond its capabilities.
[0] https://terrytao.wordpress.com/2025/12/08/the-story-of-erdos...
https://terrytao.wordpress.com/2025/12/09/the-equational-the...
> This was one further addition to a recent sequence of examples where an Erdős problem had been automatically solved in one fashion or another by an AI tool.[Aristotle] Like the previous cases, the proof turned out to not be particularly novel
He concludes:
> One striking feature of this story for me is how important it was to have a diverse set of people, literature, and tools to attack this problem. To be able to state and prove the precise formula for {c(n)} required multiple observations
Which looks like he wants to focus the praise for the solution on the "diverse set of people" relegating any "praise" for LLMs to the obfuscated and ambiguous category of "tools" behind literature.
I am also fairly certain you have yet to actually watch Tao "demonstrate" his interaction with these technologies because if you had you would know in those demonstrations he formally verifies theorems that already have proofs, and even more importantly, that he had already formally verified twice before 'off camera'. Despite these two realities he still spends a lot of time correcting the LLMs or completely ignoring their suggestions while expressing a mostly polite appraisal of the tech's use and capability.
Of course the hype machine takes his politeness, which has more to do with his specific personality than anything to do with the tech, as universal endorsement for the tech and all of its externalities.
I called it when he started getting in bed with these malevolent tech companies that they were laundering their misdeeds through his reputation, and here we are.
Math seems like an ideal target for AI, as it provides a firm basis for identifying correctness (has something actually been proved.) I think the consequences of this are going to be enormous for mathematics, including the conversion of the entire historical math literature to formalized, checked proofs. Once that is done, once all that training data is available, the capabilities of math AIs are IMO likely to be extraordinary. Add to that the sort of Alpha Go-style self training as AI provers work on new problems.
I lack the personal relationship with Tao necessary to determine if his words are truthful or otherwise (namely for me, paid for), but you also failed to provide "his statements" that say either "he's getting objective value out of AI tools" (however one may quantify such a statement) or that "they are helping him accelerate actual math research" choosing instead to simply claim those words as Tao's own.
All of the future tense coded "I think", "going to be", "Once that is done", and "IMO likely" in your second paragraph seem to be at conflict, in the very least tensewise, with your gp comment's past tense of "Not only said, but demonstrated."
1. Don't want to use them where they suck.
Think normalisation of deviance: "the problems haven't affected me therefore they don't exist" is a way to get really badly burned.
2. Want to train up in things they will still suck at by the time I've leared whatever it is.
I find LLMs seem kinda bad at writing sheet music, and Suno is kinda bad at werid instructions (like Stable Diffusion for images), but I expect them to get good before I can.
I also find them inconsistent at non-Euclidian problems: sometimes they can, sometimes they can't. I have absolutely no idea how to monetise that, but even if I could, "inconsistent" is itself an improvement on "cannot ever" which is what SOTA was a few years ago.
The LLM is not done until the theorem-prover says it is done - at which point we can be sure of the correctness of the assertion in the context. AFAIK there is no debate about inaccuracy with theorem-provers. Their real limitation continues to be converging in realistic timescales and efficient search over all combinations of proof tactics.
Your "in the context" is doing a lot of heavy lifting here, see:
- The Math Is Haunted : https://news.ycombinator.com/item?id=44739315 : https://overreacted.io/the-math-is-haunted/
- Three ways formally verified code can go wrong in practice : https://news.ycombinator.com/item?id=45555727 : https://buttondown.com/hillelwayne/archive/three-ways-formal...
- Breaking “provably correct” Leftpad : https://news.ycombinator.com/item?id=45492274 : https://lukeplant.me.uk/blog/posts/breaking-provably-correct...
- what does a lean proof prove? : https://news.ycombinator.com/item?id=46286605 : https://supaiku.com/what-does-a-lean-proof-prove
> Their real limitation continues to be converging in realistic timescales and efficient search over all combinations of proof tactics.
I agree "efficient search over all combination of proof tactics" will be a defining moment in the field. I disagree these LLMs are it.
j16sdiz•1d ago
I would assume pairing LLMs with a formal proof system would help a lot. At the very least, the system can know what is incorrect, without lengthy debates, which frustrate him most.
This won't help the system discover or solve new math, but it make the experience far more endurable.
112233•1d ago