Ongoing Lean formalisation of the proof of Fermat's Last Theorem

https://github.com/ImperialCollegeLondon/FLT

53•anonyonoor•2d ago

https://github.com/ImperialCollegeLondon/FLT/blob/main/GENER...

Comments

amelius•4h ago

Since the proof already exists in human-written form, I'm wondering, can't OpenAI's IOM gold winning algorithm not translate the blueprint to lean?

tripplyons•4h ago

You're either underestimating the length of the proof, or overestimating the length of tasks that models can currently accomplish.

amelius•4h ago

The blueprint is a step-by-step outline.

tripplyons•4h ago

If the goal is to formalize the proof, you would need more than an outline.

rcxdude•4h ago

This is a significantly harder problem than winning gold in IOM. A large part of it is figuring out how to represent some of the relevant ideas in Lean at all.

Davidzheng•50m ago

I disagree. I think it's a sequence of a huge number of modular moderately hard tasks each much easier than a hard IMO question.

seanwilson•4h ago

"The International Mathematical Olympiad (IMO) is the World Championship Mathematics Competition for High School students", so not to undermine it but it's below university or graduate level.

Research level mathematics like this is as hard as it gets, and this proof is famously difficult: uses many branches of advanced mathematics, required thousands of pages of proofs, years of work.

amelius•4h ago

Yes but the hard work (coming up with a human-readable proof) has already been done.

rcxdude•4h ago

No, some of the harder work has been done. Translating human-readable proofs into machine-readable ones is also very hard work and an area of active research.

seanwilson•3h ago

Human readable (informal) proofs are full of gaps that all have to be traced back to axioms e.g. gaps that rely on shared intuition, background knowledge and other informal proofs.

It's somewhat like taking rough pseudo code (the informal proof, a mixture of maths and English) and translating that into a bullet-proof production app (the formal proof, in Lean), where you're going to have to specify every step precisely traced back to axioms, handle all the edge causes, fix incorrect assumptions, and fill in the missing parts that were assumed to be straightforward but might not be.

A major part is you also have to formalise all the proofs your informal proof relied on so everything is traced back to the initial axioms e.g. you can't just cite Pythagorus theorem, you have to formalise that too.

So it's an order of magnitude more difficult to write a formal proof compared to an informal one, and even when you have the informal proof it can teams many years of effort.

bee_rider•3h ago

I’m almost certain this is ignorance on my part, but it seems like this would mean the proof is… possibly wrong? I mean if there are gaps and other informal proofs in there?

But I thought it was a widely celebrated result.

seanwilson•3h ago

> I’m almost certain this is ignorance on my part, but it seems like this would mean the proof is… possibly wrong?

That's part of the motivation to formalise it. When a proof gets really long and complex, relies on lots of other complex proofs, and there's barely any single expert who has enough knowledge to understand all the branches of maths it covers, there's more chance there's a mistake.

There's a few examples here of errors/gaps/flaws found while formalizing proofs that were patched up:

https://mathoverflow.net/questions/291158/proofs-shown-to-be...

My understanding is it's common to find a few problems that then need to be patched or worked around. It's a little like wanting to know if a huge codebase is bug-free or not: you might find some bugs if you formalized the code, but you can probably fix the bugs during the process because it's generally correct. There can be cases where it's not fixable though.

6gvONxR4sf7o•3h ago

The level of rigor used in math if sometimes characterized as "sufficient to convince other mathematicians of correctness." So, yeah possibly, but not in a willy nilly way. It's not a proof sketch, it's a proof. It just isn't written in human language designed for communication.

williamstein•3h ago

Kevin Buzzard told me that the worry that it might in fact be wrong is a huge motivator for him. I also once asked Serge Lang why so much mathematics is correct (which surprised me coming from programming where everything has bugs), and he said; “people do a large number of consistency checks beyond what is in the published proofs, which makes the chances the claimed results are correct much, much higher.” Another related quote Bryan Birch told me once: “it is always a good idea to prove true theorems.”

paulddraper•2h ago

Yes and when it was first published it was wrong (made leap of logic).

It takes thorough review by advanced mathematicians to verify correctness.

This is not unlike a code review.

Most people vastly underestimate how complex and esoteric modern research mathematics are.

voxl•18m ago

Proof assistant code is high reliability, there is no room to fudge it. This is perhaps the one place where you can really see how bad LLMs are when you care about reliability.

ethan_smith•6m ago

Formalizing Wiles' proof requires translating hundreds of pages of sophisticated mathematics with implicit reasoning steps into a precise logical framework, which is fundamentally different from the pattern-matching AI uses to solve competition problems.

seanwilson•4h ago

This is a nice overview of what this is, why they're doing it and why it's many years of work:

https://github.com/ImperialCollegeLondon/FLT/blob/main/GENER...

dang•2h ago

Link added to top text. Thanks!

mensetmanusman•4h ago

Are there any graphics that show the massive progress to date in some symbolic form?

ants_everywhere•2h ago

I love that they want to formalize this proof, and I understand why they're using Lean.

But part of me feels like if they are going to spend the massive effort to formalize Fermat's Last Theorem it would be better to use a language where quotient types aren't kind of a hack.

Lean introduces an extra axiom as a kind of cheat code to make quotients work. That makes it nicer from a softer dev perspective but IMO less nice from a mathematical perspective.

cmrx64•47m ago

What’s mathematically questionable about the quotient soundness axiom? It’s justifiable metamathematically. What’s the real difference baking it into the proof kernel? I’d rather such independent properties be modeled as an axiom. The quotient automation I’m familiar with in other theorem provers is typically way more (untrusted!) machinery than just stating quot.sound.

YossarianFrPrez•2h ago

Side note: The organization that maintains Lean is a "Focused Research Organization", which is a new model for running a science/discovery based nonprofit. This might be useful knowledge for founder types who are interested in research. For more information, see: https://www.convergentresearch.org

And if you want to read why we need additional types of science organizations, see "A Vision of Metascience" (https://scienceplusplus.org/metascience/)

pierrefermat1•1h ago

The concept trying new science orgs is noble, but this is the typical Schmidt BS of saying every previous academic consortia is totally incompetent and I'm the only one that can inject the magic sauce of focus and coordination.

mlyle•33m ago

To me, it seems like coming up with something more coordinated than a consortium and more flexible than a single lab or a research corporation funded by multiple universities makes sense.

It's probably a narrow set of problems with the right set of constraints and scale for this to be a win.

C++26 Reflections adventures and compile-time UML

Helsinki records zero traffic deaths for full year

Telo MT1

Micron rolls out 276-layer SSD trio for speed, scale, and stability

6 weeks of Claude Code

Writing a basic service for GNU Guix

HTML-in-Canvas

28th International Obfuscated C Code Contest

Lina Khan points to Figma IPO as vindication of M&A scrutiny

Remote hosting for your telescope

The Crisis of Professional Skepticism

Anandtech.com now redirects to its forums

Ongoing Lean formalisation of the proof of Fermat's Last Theorem

Super-resolution of Sentinel-2 images (10M –> 5M)

We may not like what we become if A.I. solves loneliness

PixiEditor 2.0 – A FOSS universal 2D graphics editor

Online Collection of Keygen Music

LangExtract: Python library for extracting structured data from language models

Mezzano, an operating system written in Common Lisp

The Art of Multiprocessor Programming 2nd Edition Book Club

At a Loss for Words: A flawed idea is teaching kids to be poor readers (2019)

Browser extension and local backend that automatically archives YouTube videos

Parsing without ASTs and Optimizing with Sea of Nodes [video]

A.I. researchers are negotiating $250M pay packages

Neural networks that learn non-linearity without activation functions [pdf]

The /o in Ruby regex stands for "oh the humanity "

Compressing Icelandic name declension patterns into a 3.27 kB trie

The Big Oops in type systems: This problem extends to FP as well

Great Question (YC W21) Is Hiring a VP of Engineering (Remote)

Show HN: NaturalCron – Human-Readable Scheduling for .NET (With Fluent Builder)

Ongoing Lean formalisation of the proof of Fermat's Last Theorem

Comments

C++26 Reflections adventures and compile-time UML

Helsinki records zero traffic deaths for full year

Telo MT1

Micron rolls out 276-layer SSD trio for speed, scale, and stability

6 weeks of Claude Code

Writing a basic service for GNU Guix

HTML-in-Canvas

28th International Obfuscated C Code Contest

Lina Khan points to Figma IPO as vindication of M&A scrutiny

Remote hosting for your telescope

The Crisis of Professional Skepticism

Anandtech.com now redirects to its forums

Ongoing Lean formalisation of the proof of Fermat's Last Theorem

Super-resolution of Sentinel-2 images (10M –> 5M)

We may not like what we become if A.I. solves loneliness

PixiEditor 2.0 – A FOSS universal 2D graphics editor

Online Collection of Keygen Music

LangExtract: Python library for extracting structured data from language models

Mezzano, an operating system written in Common Lisp

The Art of Multiprocessor Programming 2nd Edition Book Club

At a Loss for Words: A flawed idea is teaching kids to be poor readers (2019)

Browser extension and local backend that automatically archives YouTube videos

Parsing without ASTs and Optimizing with Sea of Nodes [video]

A.I. researchers are negotiating $250M pay packages

Neural networks that learn non-linearity without activation functions [pdf]

The /o in Ruby regex stands for "oh the humanity "

Compressing Icelandic name declension patterns into a 3.27 kB trie

The Big Oops in type systems: This problem extends to FP as well

Great Question (YC W21) Is Hiring a VP of Engineering (Remote)

Show HN: NaturalCron – Human-Readable Scheduling for .NET (With Fluent Builder)