Low-level automated theorem proving is going to fall far quicker than most expect, like AlphaGo, precisely because an MCTS++ search over Lean proofs is scalable, amenable to self-play, and relevant to a significant chunk of professional math.
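To make the "MCTS over proofs" framing concrete, here's a minimal sketch of plain UCT search over a toy search space. This is not Lean or any real prover API: states are just integers, the "tactics" are arithmetic moves, and a "proof" closes when the goal value is reached. `GOAL`, `TACTICS`, and `Node` are all illustrative names.

```python
import math
import random

# Toy stand-ins: a proof state is an int, a tactic transforms it,
# and the search "closes the goal" when it reaches GOAL.
GOAL = 12
TACTICS = [lambda s: s + 1, lambda s: s * 2, lambda s: s + 3]

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

    def ucb(self, c=1.4):
        # Unvisited nodes are explored first; otherwise standard UCB1.
        if self.visits == 0:
            return float("inf")
        return self.value / self.visits + c * math.sqrt(
            math.log(self.parent.visits) / self.visits)

def rollout(state, depth=6):
    # Random playout: reward 1.0 if a random tactic sequence hits the goal.
    for _ in range(depth):
        if state == GOAL:
            return 1.0
        state = random.choice(TACTICS)(state)
    return 1.0 if state == GOAL else 0.0

def mcts(root_state, iters=2000):
    root = Node(root_state)
    for _ in range(iters):
        node = root
        # Selection: descend by UCB until a leaf.
        while node.children:
            node = max(node.children, key=Node.ucb)
        # Expansion: add one child per tactic (only below GOAL, to stay finite).
        if node.visits > 0 and node.state < GOAL:
            node.children = [Node(t(node.state), node) for t in TACTICS]
            node = random.choice(node.children)
        # Simulation + backpropagation.
        reward = rollout(node.state)
        while node:
            node.visits += 1
            node.value += reward
            node = node.parent
    # The "tactic" MCTS prefers from the root = most-visited child.
    return max(root.children, key=lambda n: n.visits).state

print(mcts(0))
```

In a real prover the state is a Lean goal, the actions come from a learned tactic policy, and the reward is whether the kernel accepts the finished proof; the self-play loop then retrains the policy on its own successful searches.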
Legit, I almost wish the US and China would sign a Formal Mathematics Proliferation Treaty as a sign of good will between very powerful parties who have much to gain from each other. When your theorem prover is sufficiently better than most living Fields medalists, you share your architecture/algorithms/process with the world, so mathematics stays in the shared realm of human culture and doesn't just happen to belong to DeepMind, OpenAI, or DeepSeek.
The workflow I'm envisioning here is that the plan document we're all writing nowadays isn't translated directly into code, but first into a TLA+/Alloy/... model that serves as executable documentation, and only then lowered into code while conformance is continuously monitored (the step whose toil usually makes it not worth doing without LLMs). AI literature search for similar problems and solutions is also obviously helpful during every phase of the software engineering process.
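A minimal sketch of the conformance-monitoring step, assuming a toy spec: the "model" here is a tiny state machine written as a Python dict (a stand-in for a real TLA+/Alloy spec), and we replay a trace emitted by the implementation to check that it only takes transitions the model allows. The states, actions, and traces are all hypothetical.

```python
# Toy spec: in each state, which actions are legal and where they lead.
# A stand-in for a TLA+/Alloy next-state relation; names are made up.
SPEC = {
    "Idle":    {"AcquireLock": "Holding"},
    "Holding": {"Write": "Holding", "ReleaseLock": "Idle"},
}

def conforms(trace, start="Idle"):
    """Replay an implementation trace against the spec.

    Returns (ok, detail): ok is False at the first illegal transition.
    """
    state = start
    for i, action in enumerate(trace):
        nxt = SPEC.get(state, {}).get(action)
        if nxt is None:
            return False, f"step {i}: action {action!r} illegal in state {state!r}"
        state = nxt
    return True, f"trace ok, final state {state!r}"

print(conforms(["AcquireLock", "Write", "ReleaseLock"]))  # conformant
print(conforms(["Write"]))  # violation: write without holding the lock
```

In practice the trace would come from the running system's logs, and the checker (or a real tool like TLC trace validation) runs continuously; this is exactly the toil an LLM can keep in sync with the evolving plan document.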
I'm curious whether there is a scenario in which a large automated proof is achieved but there is no practical means of understanding what it means.
I'm an engineer, so think of it like this: a large, complex program that compiles, but you don't understand what it does or how to use it. Is such a thing possible?
tzury•4h ago
It relies on a combination of Humans, LLMs ('General Tools'), Domain-Specific Tools, and Deep Research.
It is apparent that the static data encoded in an LLM is not enough; one must re-fetch sources and digest them fresh within the context of the conversation.
In this workflow, AlphaEvolve, Aristotle, and Lean are the 'PhDs' on the team, while the LLM is the full-stack developer that glues them all together.
[1] If one likes pompous terms, this is what 'AGI' will actually look like.
9u127v89yik2j3•1h ago
Literally not AGI.