The Math Is Haunted

https://overreacted.io/the-math-is-haunted/

358•danabramov•1d ago

Comments

7373737373•1d ago

One problem I have with learning Lean is that tactics - like rfl in the example - are overloaded, and their full semantics not completely explained/understandable from the tutorials. Unlike, say, C programming where one may understand what happens to the program state down to the bit, it feels too fuzzy. And the rewrite (rw) tactic syntax doesn't feel natural either.

danabramov•1d ago

Yeah the documentation is also quite fragmented because tactics are user-definable and some are from Lean vs from Mathlib etc. I've gotten quite decent at using the basic ones but I still sometimes ask on Zulip if something isn't working as I expected.

LegionMammal978•1d ago

Yeah, I've similarly found the tactics in Coq (now Rocq) difficult to internalize. E.g., I might have "A = B" and "P(A,A)" available, and I want to conclude "P(A,B)", but the rewrite will fail for some arcane reason. (Issues with the definition of some of the intermediate structures, I'd imagine.)

On the other end of the spectrum, I've recently been playing with Metamath and its set.mm database, which has no programmable tactics at all, only concrete inferences that can be used in a proof. (E.g., the modus ponens inference ax-mp says that "|- ph" and "|- ( ph -> ps )" prove "|- ps", where "ph" and "ps" are variables that can be substituted.) Alas, it's not much better, since now you have to memorize all the utility lemmas you might need!

7373737373•23h ago

Agreed - the Metamath base language (and its verifier) seem to be the most tractable of all I've seen, although it is probably still quite far away from the complexity of the high level language(s) that compile to it.

Derived from it, the currently best attempt to achieve an unambiguous and secure language seems to be Metamath Zero: https://github.com/digama0/mm0

solomonb•23h ago

This is why I prefer Agda, where everything comes down to pattern matching.

agnishom•20h ago

You can absolutely use pattern matching in Lean instead of tactics, if you prefer to write the proof that is closer to "what is going on under the hood"

solomonb•5h ago

yeah i didn't mean to imply you cannot do that, but tactics seem to be highly encouraged.

I'm actually a big fan of Lean, I just like it more as a programming language for writing programs with dependent types then as a proof checker.

gylterud•23h ago

This is one of the reasons I prefer Agda. It is usually written without tactics, you just write the proof term in a functional programming language via the Curry–Howard correspondence. The trade off is that you must be more disciplined with creating useful abstractions and functions, otherwise proving even routine stuff becomes really tedious.

Paracompact•16h ago

I'm not sure you prefer Agda so much as you prefer providing proof terms functionally rather than imperatively. Then again I've never used Agda; how does it differ from Coq/Rocq minus Ltac or Lean minus tactics?

logicchains•12h ago

Coq has much less support for working with dependent types; you need lots of annotations compared to Agda (which has superior dependent pattern matching support). Lean is somewhere between the two.

kmill•22h ago

At least you can 'go to definition' on the tactics and see what they're doing. It's a lot to take in at the beginning, but it can all be inspected and understood. (At least until you get to the fundamental type theory; the reduction rules are a lot harder to get into.)

> the rewrite (rw) tactic syntax doesn't feel natural either.

Do you have any thoughts on what a natural rewrite syntax would be?

7373737373•22h ago

> Do you have any thoughts on what a natural rewrite syntax would be?

Not yet, but I'd probably prefer something that more explicitly indicated (in English, or some sort of more visually "pointing" indicator) which specific parts of the previous step would be replaced

It feels weird, or I'd have to get used to that both what is being replaced and what it is replaced with depends on some distant context, it's very indirect as it requires switching attention between the tactics tree and the context or previous proofs

danabramov•20h ago

If you have specific ideas, I'm also curious! I wonder if Lean can be extended to support what you're thinking of — its ability to have custom syntax and tactics seems really powerful. That's part of what excites me about Lean as I'm also not always the biggest fan of existing tactics, but they seem to evolve similarly to other pieces of software.

kmill•5h ago

We're working on a new rewrite tactic this summer at the Lean FRO (I don't know if I ever directly mentioned that to you yet on Zulip).

One interface I'm planning on is `rw [(pos := 1,3,2) thm]` to be able to navigate to the place where the rewrite should occur, with a widget interface to add these position strings by clicking on them in the Infoview. The whole occurrences interface will also be revamped, and maybe someone from the community could help make a widget interface for that too.

Of course there's already `conv => enter [1,3,2]; rw thm`, but putting it directly into `rw` is more convenient while also being more powerful (`conv` has intrinsic limitations for which positions it can access).

The interface is what people will notice, but technical idea of the project is to separate the "what" you want to rewrite from "how" to get dependent type theory to accept it, and then make the "how" backend really good at getting rewrites to go through. No more "motive not type correct", but either success or an error message that gives precise explanations of what went wrong with the rewrite.

And, yeah, it's great that Lean lets you write your own tactics, since it lets people write domain-specific languages just for solving the sorts of things they run into themselves. There's no real difference between writing tactics as a user or as part of the Lean system itself. Anyone could make a new rewrite tactic as a user package.

Tainnor•3h ago

> No more "motive not type correct"

That's exciting to hear!

swagmoney1606•21h ago

You may like Agda. I prefer Lean even though you are right about this.

emmelaich•18h ago

It was interesting to me as it didn't fit my expectations. As a math theory ignoramus, I expected that reflection and rewrite are more fundamental than addition. But Lean seems to assume addition but require explicit rfl and rewrite.

Perhaps there's a Lean "prelude" that does it for you.

derdi•10h ago

Yes, there is a prelude that defines natural numbers and an addition function on them. As the post notes, the reflexivity tactic "unfolds" the addition, meaning that it applies the addition function to the constants it is given, to arrive at a constant. This is not specific to addition, it unfolds other function definitions too. So addition doesn't have to come first, you are right that reflexivity is more fundamental.

danabramov•8h ago

Author here — this is correct. I’ve added a paragraph on this in an edit btw so it’s possible the parent poster hasn’t seen it yet. Reproducing the paragraph:

(Here, rfl closes ⊢ 3 + 3 = 6, but for a different reason than one might think. It doesn’t really “know” that 3 + 3 is 6. Rather, rfl unfolds the definitions on both sides before comparing them. As 3, 6, and + get unfolded, both sides turn into something like Nat.zero.succ.succ.succ.succ.succ.succ. That’s why it actually is a something = something situation, and rfl is able to close it.)

Also, another resource on this is https://xenaproject.wordpress.com/2019/05/21/equality-part-1...

emmelaich•7h ago

Thanks! That makes sense.

armchairhacker•8h ago

It's because the addition can evaluate to only one form: `3 + 3` and `6` both evaluate to `succ (succ (succ (succ (succ (succ zero)))))`, similarly Lean can infer `4 * 3 = 1 + 3 + 8` because both evaluate to the same as `12`. But an expression can be rewritten to an infinite number of forms, e.g. `x * 2` can be rewritten to `x + x`, `(x + (x / 2)) * 4/3)`, etc. So `refl` can automatically evaluate both sides but not rewrite them.

gowld•7h ago

One thing Lean inherits from mathematics is the use of opaque notation. It would be nice to inherit some readability from programming.

Paperproof is an effort in one direction (visualization of the logic tree)

https://paperproof.brick.do/lean-coq-isabel-and-their-proof-...

daxfohl•7h ago

The surprising thing to me was that tactics are all written in "user-level" code, outside the proof kernel. This makes sense in the sense that you want a small, thoroughly tested kernel that doesn't change. But it also implies that if you use tactics in your proof, then your proof can go from correct to failing if one of the tactics you use gets modified between releases. Is that a problem in real world use?

Tainnor•7h ago

Yes. I've seen proofs fail after a mathlib update because the behaviour of simp changed. I've never seen it do less after an update (so far), but sometimes it'll simplify more and then the next steps may fail to work.

I've since adopted the suggestion (that is afaik used in mathlib) to never use bare simp unless it closes the current goal directly. Instead, if you write simp?, Lean will run the simplifier and tell you exactly which theorems it used (in the form of simp only [...]) which you can then insert into the proof instead.

daxfohl•1h ago

I guess it makes sense: the tactics are defined in the lean repo (but not in the kernel of that repo), while mathlib has its own repo. But any changes to tactic code triggers a test to verify that mathlib still works against it. Which implies there's a reasonable set of checks and balances that don't allow tactic changes to break anything too badly if you follow mathlib's idioms, though it can still happen.

Tainnor•7h ago

I understand what you mean, but I shudder to think at how complicated it would be to write certain proofs without tactics, e.g. simp, ring, omega, linarith and friends all really do a lot of work.

kevinqi•1d ago

as someone who hasn't seen Lean before but was curious from alphaproof, love the intro! curious if you can mention what you're working on in Lean?

danabramov•1d ago

For now I'm just learning math with it!

Currently I'm going through https://github.com/teorth/analysis (Tao's Lean companion to his textbook) and filling in the `sorry`s in the exercises (my solutions are in https://github.com/gaearon/analysis-solutions).

kevinqi•1d ago

very cool. btw, I also love that "sorry" is the "any" equivalent in Lean

7373737373•1d ago

Does Lean have some sort of verification mode for untrusted proofs that guarantees that a given proof certainly does not use any "sorry" (however indirectly), and does not add to the "proving power" of some separately given fixed set of axioms with further axioms or definitions?

jonny_eh•1d ago

Apparently this is possible with macros? I dunno: https://github.com/leanprover/lean3/issues/1355

kmill•22h ago

That's Lean 3, from eight years ago, and it's from before 'sorry' really existed in the way we know it now.

---

To answer the GP's question: Not only is there a verification mode, but Lean generates object files with the fully elaborated definitions and theorems. These can be rechecked by the kernel, or by external verifiers. There's no need to trust the Lean system itself, except to make sure that the theorem statements actually correspond to what we think they're supposed to be.

treyd•22h ago

What exactly do these object files look like?

kmill•22h ago

The "olean" files are a binary format that contain everything that was added to the Lean environment. Among other things, it includes all of the declarations and their Lean.Expr [1] expressions. People have written tools to dump the data for inspection [2], or to independently check that the expressions are type correct and the environment is well-formed (that is, check the correctness).

[1] https://github.com/leanprover/lean4/blob/3a3c816a27c0bd45471... [2] https://github.com/digama0/oleandump [3] https://github.com/ammkrn/nanoda_lib

masterjack•1d ago

Yes, you can `print axioms` to make sure no axioms were added, make sure it compiles with no warnings or errors. There’s also a SafeVerify utility that checks more thoroughly and catches some tricks that RL systems have found

danabramov•1d ago

Does `#print axioms some_theorem` mentioned at the end of the article qualify? This would show if it depends on `sorry`, even transitively, or on some axioms you haven't vetted.

7373737373•1d ago

Oh, I missed that, thanks! It would be cool to use this to visualize the current state and progress on, and depth of the "proof dependency graph"!

cubefox•23h ago

One thing I didn't know until recently was that Lean doesn't solve the problem of verifying that a formal theorem actually states what we think it states rather than something else.

In some cases that's not an issue, because the formal statement of the theorem is very simple. E.g. for Fermat's Last Theorem, which can be formally expressed with "(x y z : ℕ+) (n : ℕ) (hn : n > 2) : x^n + y^n ≠ z^n". But in other cases it might be much harder to verify that a formal statement actually matches the intuitive informal statement we want.

Of course, Lean or similar languages still offer the huge potential of verifying that a formal statement is provable, even if they don't help with verifying that the formal statement matches the intended informal statement in natural language.

danabramov•23h ago

Are there, in general, ways to do that? Sorry if this sounds like a troll question, but I struggle to think of how one would verify that a formal statement matches the accepted understanding other than painstakingly going through the definitions with experts from both sides.

cubefox•22h ago

I don't think there is any technical solution for that, apart perhaps from asking an LLM, or multiple different LLMs, whether it thinks the formal definition matches the informal statement.

When mathematicians check normal (informal) proofs they usually verify both things at once: that the proof doesn't contain mistakes or insufficiencies, and that it proves what the author says it proves. Formal proof checkers aren't able to do the latter.

danabramov•20h ago

I personally view proof checkers as mathematicians' tools so I assume mathematicians would be involved either way. With some percentage actually preferring to work closely with these tools. See also Terence Tao's comment in https://terrytao.wordpress.com/2025/05/31/a-lean-companion-t... which feels relevant to me

cubefox•19h ago

Tao misunderstands the question here: it was about reconciling a traditional (informal) proof and a formal proof which come to opposite conclusions, not about two different formal proofs.

danabramov•18h ago

I think he accounts for that in the answer, "in a way that would be faster than if one or both proofs were informal" (which assumes "one formal and one informal" is also a case he's talking about). The way I understand his point is that in either case you would have to go through the two mathematical structures with the same amount of rigour and attention until you find the divergence, and that's easier to do when at least one side is formal (but can be done in either case).

In other words, informal math doesn't make this problem easier because you can still make and miss mistakes in encoding intent into the structure. But at least with formal math, there's whole classes of mistakes that you can't make.

tunesmith•22h ago

Yeah this seems like the specification/implementation problem. One can perfectly implement a bad spec, but coming up with the perfect spec is a human problem.

adastra22•17h ago

Yeah, by adopting Lojban as our common language.

Joking of course, but only because Lojban doesn’t have formally specified semantics.

Computational semantics could, in principle, allow you to enumerate the possible logical meanings of a given informal English phrasing of the statement, and these can be either each proven or thrown out with explicit justification for why that was not what you meant.

Paracompact•15h ago

Nope! Not a troll question. At the end of the day, we're human beings interacting with a bunch of electrified rocks. There will always be a layer of subjectivity and expertise involved in translating informal (or perhaps even unknown) desires into formal specs. It gets even worse when you are designing software that needs to take into account the underlying behavior of hardware; you wouldn't program an autopilot for a jumbo jet whose certification only exists in the context of some simplified formal model for how computers work, do you?

But I'm catastrophizing. If you think going through the definitions of a spec are painstaking and require expert insight to get right, it is miniscule compared to e.g. auditing every line of a codebase written in an imperative language. And not only that, to even get close to the same correctness guarantees of e.g. a formally verified pipeline through CompCert, you would have to audit every line of the codebases of every compiler that touches your code (https://r6.ca/blog/20200929T023701Z.html "It Is Never a Compiler Bug Until It Is"), together with analysis of the underlying hardware models.

All this to say, technology is more complicated than traditional programmers exposed only to high-level languages might be led to believe. That complication can only be reduced, consolidated, or blackboxed so much.

danabramov•8h ago

Yeah Lean is actually pretty interesting in that sense because it’s designed to have a small kernel that actually does the type theory checking, and that kernel has a specification, tests, and independent reimplementations. The kernel really is very small compared to the entirety of Lean syntax and behavior — everything else runs and is elaborated before stuff feeds into the kernel. So the surface area for actual proof checking bugs is greatly reduced.

Paracompact•3h ago

Yeah, the goal for formal methods boils down to only two things: reduce surface area necessary for auditing (e.g. high-level Lean theorem and definitions file), and consolidate that surface area into preexisting, pre-audited technologies (e.g. the Lean kernel).

Almondsetat•14h ago

>Lean doesn't solve the problem of verifying that a formal theorem actually states what we think it states rather than something else.

With all honesty, if a tool existed that did that we would have solved maths. Or at least we would have solved human communication. In both cases it would be the most studied thing in the world, so I don't know where this belief would come from

brookst•22h ago

I’ve been kicking around the idea of something like Lean (maybe even just Lean?) to rewrite news and other non-fiction articles, treating statements as theorems that need to be proven. Proofs could include citations, and could be compound things like “this is a fact if three of my approved sources asserted it as a fact”

It should then be possible to get a marked-up version of any document with highlighting for “proven” claims.

Sure, it’s not perfect, but I think it would be an interesting way to apply the rigor that used to be the job of publications.

ants_everywhere•22h ago

I don't know why you're getting downvoted because I think this is an interesting idea.

Completely impossible but still interesting and fun to explore.

But you don't need the power of Lean to do something like this. I would recommend starting with something like Prolog or RDF triples

ramses0•21h ago

Basically [Citation Needed]

magicalhippo•17h ago

> I’ve been kicking around the idea of something like Lean (maybe even just Lean?) to rewrite news and other non-fiction articles, treating statements as theorems that need to be proven.

I found I got much better at writing non-fiction after having math at uni. I would help proof-read essays and other hand-ins by my SO and sister, and apply similar rigor as you mention. Stuff like "you show C follows from B here, but you haven't actually given an argument for why B follows A, so you can't then claim C follows from A".

It's tempting to say that with LLMs this seems like a plausible task to turn it into a program, but the hallucination issue puts a damper on that scheme.

refulgentis•16h ago

A mild damper at best, RAG-based pipelines are mature now.

Alas, things like this aren't logic proofs.

It bothers me to my core when I see this idea. Sometimes at FAANG. Blissfully management learned to...not promote...them.

petesergeant•16h ago

> A mild damper at best, RAG-based pipelines are mature now.

I work with RAG pipelines all day, and the idea that hallucination isn't an ongoing major issue doesn't match my experience _at all_. Possibly a skill issue on my part, but, also on the part of everyone I ever talk to in the same space too.

TheOtherHobbes•14h ago

There are rhetorical tricks which rely on this to be persuasive. You can say "Thing X is happening, so we should do Thing Y", and people will nod.

If you're sneaky about it it will read like a logical conclusion when in fact X and Y are only loosely related contextually, and there is no logical chain at all.

A standard political trick is to blame X on something emotive and irrelevant, and offer Y as a false solution which distracts from the real causes of the problem.

This is used so often it's become a core driver of policy across multiple domains.

Although it's very effective, it's a crude way to use this. There are more subtle ways - like using X to insinuate criticism of a target when Y is already self-evident.

Point being, a lot of persuasive non-fiction, especially in politics, law, religion, and marketing, uses tricks like these. And many others.

They work because they work in the domain of narrative logic - persuading through stories and parables with embedded emotional triggers and credible-sounding but fake explanations, where the bar of "That sounds plausible" is very low.

LLMs already know some of this. You can ask ChatGPT to make any text more persuasive, and it will give you some ideas. You can also ask it to read a text, pull out the rhetorical tricks, and find the logical flaws.

It won't do as good a job as someone who uses rhetoric for a living. But it will do a far better job than the average reader, who is completely unaware of rhetoric.

bjackman•17h ago

I think it's quite rare that the most important beliefs you should derive from the news can be justified by a collection of absolute statements.

I think you'd be better served by a tool for calculating chains of Bayesian reasoning. I've seen a tool that does this for numerical estimations.

lblume•11h ago

Correct. The only way to not subject oneself to (economic) irrationality is to model your beliefs about the world using probabilities and using Bayes to update in light of new evidence.

bubblyworld•16h ago

Formalising natural language statements is a minefield of difficulties, for (imo) essentially the same reasons that writing code which interacts with the real world is so difficult. Concepts you take for granted like identity, time, causality... all of that stuff needs to be fleshed out carefully and precisely in the formalism for facts to be relatable to each other (or even expressible in the first place).

Not to discourage you - it's a cool problem! OpenCog comes to mind as a project that tried to take this all the way, and there's a field devoted to this stuff in academia called KRR (knowledge representation and reasoning). The IJCAI journal is full of research on similar topics.

(also see all the different logics philosophers use to formalise different kinds of argument involving time/modality/probability - there are a lot, but they aren't very "modular" and can't easily be mixed unless someone has solved that since I last looked into it)

eru•15h ago

> Concepts you take for granted like identity, time, causality... all of that stuff needs to be fleshed out carefully and precisely in the formalism for facts to be relatable to each other (or even expressible in the first place).

Yes, and different formalisations of identity apply in different contexts.

Eg remember the famous line about not being able to step in the same river twice.

photonthug•14h ago

> logics philosophers use .. aren't very "modular" and can't easily be mixed

Not sure if the model-checking communities would agree with you there. For example CTL-star [0] mixes tree-logic and linear-temporal, then PCTL adds probability on top. Knowledge, belief, and strategy-logics are also mixed pretty freely in at least some model checkers. Using mixed combinations of different-flavored logic does seem to be going OK in practice, but I guess this works best when those diverse logics can all be reduced towards the same primitive data structures that you want to actually crunch (like binary decision diagrams, or whatever).

If no primitive/fast/generic structure can really be shared between logics, then you may be stuck with some irreconcilable continuous-vs-discrete or deterministic-vs-probabilistic disconnect, and then require multiple model-checkers for different pieces of one problem. So even if mixing different flavors of logics is already routine.. there's lots of improvements to hope for if practically everything can be directly represented in one place like lean. Just like mathematicians don't worry much about switching back and forth from geometry/algebra, less friction between representations would be great.

Speaking of CTL, shout out to Emerson[1], who won a Turing award. If he hadn't died recently, I think he'd be surprised to hear anyone suggest he was a philosopher instead of a computer scientist ;)

[0]: https://en.wikipedia.org/wiki/CTL* [1]: https://en.wikipedia.org/wiki/E._Allen_Emerson

bubblyworld•12h ago

Yeah, not suggesting philosophers are the only people using logics, but they've certainly been using them the longest!

Indeed, I've seen various attempts to tackle the problem including what you are suggesting - expressing the semantics of different logics in some base formalism like FOL in such a way that they can interplay with each other. In my experience the issue is that it's not always clear how two "sublogics" should interact, and in most cases people just pick some reasonable choice of semantics depending on the situation you are trying to model. So you end up with the same issue of having to construct a new logic for every novel situation you encounter, if that makes sense?

Logics for computing are a good example - generally you use them to formalise and prove properties of a program or spec, so they are heavily geared towards expressing stuff like liveness, consistency invariants and termination properties.

I haven't read about CTL though, thanks! I'll check it out. Hopefully I didn't write too much nonsense here =)

> Just like mathematicians don't worry much about switching back and forth from geometry/algebra [...]

As an ex-mathematician I think we worry a lot about transitioning between viewpoints like that. Some of the most interesting modern work on foundations is about finding the right language for unifying them - have a look at Schulze's work on condensed mathematics, for example, or basically all of Grothendieck's algebraic geometry work. It's super deep stuff.

> then you may be stuck with some irreconcilable continuous-vs-discrete or deterministic-vs-probabilistic disconnect

Agreed, I think this is one of the cruxes, and lately I'm starting to feel that maybe strict formal systems aren't the way to go for general-purpose modelling. Perhaps we need to take some inspiration from nature - completely non-deterministic, very messy, and nevertheless capable of reasoning about the universe around it!

zozbot234•12h ago

> (also see all the different logics philosophers use to formalise different kinds of argument involving time/modality/probability - there are a lot, but they aren't very "modular" and can't easily be mixed unless someone has solved that since I last looked into it)

From a logic-as-types perspective, modalities turn out to be monads. So the problem of "mixing" modalities is quite similar to the problem of composing monads in programming languages.

wredcoll•14h ago

I think a more practical method would be to trace the provenance of a claim, in a way that lets you evaluate how likely it is for the person making it to actually know the facts.

A simple scenario would be reading a news article about american politics. You see an unusual claim made, so you start tracing it and find that the journalist got it from person X who got it from person Y who got it from donald trump. Trump is famous for lying constantly, so unless there's a second source making the claim, you could disregard it with a high probability.

CobrastanJorji•14h ago

I think it'd be more interesting to map out entire trees of arguments about a topic. Start with something big, like "is there a God," then come up with all of the arguments for or against, then come up with all of the arguments against those arguments, then the counters to those arguments. Really explore it as a sort of debate space. Then bring in citations not as backing but for historical context: "Plato made this argument in such and such." The idea wouldn't be so much to decide a winner as to prevent us from going around in circles by making a map.

vntok•12h ago

Kialo does this for online debates: https://www.kialo.com/

An example tree for the statement "God exists": https://www.kialo.com/god-exists-3491?path=3491.0~3491.1

Scarblac•11h ago

> Start with something big, like "is there a God,"

To do that you first need a definition of the concept God.

And then you realize that all the people making arguments in the past were using their own unspoken and incompatible definitions.

atomicnature•13h ago

Proof != evidence. In evidence, we corroborate, collate, add more sources, weigh evidence, judge. Proof is a totally different process. Only in the mathematical do one prove something, everywhere else we build up evidence, corroborate, etc.

jeffhuys•13h ago

Look at Arguman, now defunct, but you can probably find some good videos/screenshots.

Basically crowd-sourced statements, rebuttals, agreements, etc, in a nice interface.

We need that nowadays.

vntok•12h ago

There's Kialo: https://www.kialo.com/

atoav•13h ago

As a programmer who has studied philosophy: This is the approach I take more or less in my head when reading articles, however the problem is the ambiguity of natural language.

E.g. lets say the statement is about a presidents promise to end a conflict within 24 hours after coming into office. One could get to a conclusion pretty quickly when the conflict hasn't been ended after 24 hours when they entered the office.

But what does "end the conflict" mean exactly? If the conflict ended how long does it need to remain ended to achieve the label "ended"? What if said president has a history that recontextualizes the meaning of that seemingly simple claim because he is known to define the word "ended" a little different than the rest of us, do you now judge by his or by our definition? What if the conflict is ended but there is a small nest of conflict remaining, after which size do we consider the conflict going on?

I know some of that has official definitions, but not everything has. In the end a lot of that will require interpretation and which definition to chose. But yeah I support your idea, just spelling it out and having a machine-readable chain of thought might help already.

fancy_pantser•13h ago

Claimify from MS research aims in this direction. There's a paper and video explainer from a few months ago.

https://www.microsoft.com/en-us/research/blog/claimify-extra...

Tainnor•11h ago

I think for that kind of thing, Prolog would be sufficient. Lean is more powerful, but it requires you to supply proofs - Prolog can just automatically run inferences.

(Although I wouldn't be surprised if somebody had already recreated something like Prolog in Lean with a tactic that does more or less what the Prolog interpreter does)

t_mann•11h ago

Careful, such an approach could easily end up giving an aura of logical objectivity to arbitrarily radical and nonsensical ideas. The political views of one of the fathers of modern logic may serve as a cautionary tale [0].

[0] https://en.m.wikipedia.org/wiki/Gottlob_Frege#Political_view...

poulpy123•10h ago

News and non-fiction articles are not math and cannot be treated as math. At best you could build a tool that check the most glaring contradictions (like a typo changing a number), and I'm not even sure it can be consistent without building a software that understand language. At worse you would build a tool that spit bullshit that millions of people would treat as gospel

guyomes•10h ago

For the form, you might be interested in Ethica, by Spinoza [1]. On the other hand, for fact checking, the key concept seems to be trust in sources rather than logical consistency.

[1]: https://en.wikipedia.org/wiki/Spinoza%27s_Ethics

rossant•9h ago

This might be interesting to you: https://en.m.wikipedia.org/wiki/Stephen_Toulmin#The_Toulmin_...

BlarfMcFlarf•6h ago

What about proven facts that get disproven? Is there room to rethink your priors?

Waterluvian•22h ago

Wait… so are we basically compiling a dictionary of proofs all stemming from a handful of self-evident truths? And every further proof is just some logical aggregation of previous proofs?

Can someone please turn this into a Zachtronics style game?! I badly, direly want this. There’s a game called Euclidea that’s kind of like this for trigonometry and the whole premise of building a tower of logic is deeply attractive to me.

Is this what pure math is about? Is this what people who become math profs are feeling in their soul? The sheer thrill of adding to that dictionary of proofs?

Aside: I recall some famous mathematician had made a list of base proofs that you just hold to be true. Can someone remind me who, and/or what that list is called? I’m guessing they’re considered axioms.

hobs•22h ago

You probably are vaguely referencing the Principa Mathematica.

Tainnor•10h ago

That's a very outdated text that is mostly irrelevant for modern maths except for its historical importance.

treyd•22h ago

You might be thinking of Euclid's axioms, which defines points, lines, planes, etc, and that lines can be parallel. This is an interesting one because the system is violated if your space is not flat, like if you're on a sphere.

You could also be thinking of Zermelo-Fraenknel set theory (ZF/ZFC), which most of modern mathematics is ultimately based upon.

cnity•12h ago

This highlights I think what is an ultimate pitfall of something like Lean. It is something like this: when an assumption is broken or violated in mathematics, the whole field of mathematics can _grow_. Non-euclidean geometry, imaginary numbers, etc are all examples of this. Trying to cram maths into a programming language sounds like it would constrain it, removing the creativity and reducing all maths to a search space.

RossBencina•12h ago

Lean does not reduce the mathematical search space as you suggest. Yes there is a fixed, compact low-level logical core that everything above it depends on. But this is equivalent to an encoding of the logical foundations that mathematics, that formal mathematics depends on in any case. On top that you have mathematical theories built on assumptions (axioms) and you can specify whatever axioms you like and change them at will. To use your analogy: the "search space" is parameterised by user-defined sets of axioms and assumptions.

Waterluvian•11h ago

Could one possibly use Lean to make up their own mathematical model that is wrong but mostly holds up for a while? Kind of like a Sudoku where it’s all working and then you realize one thing doesn’t work so the rest has to be torn down.

I have young kids, and in exploring why boats float I suggested the four elements model. They tested it with bottles full of air and water and earth and it all kind of held up until they found a bowl that floats. Making a wrong model helped the learning process more than telling them the right model. I loved every minute of the science.

Tainnor•10h ago

You can add any axiom you want in Lean and then see what follows (although what follows might be that you prove a contradiction). See the article for details.

I don't know if there's a way to remove axioms, nor do I think you can change the logical foundations (which are based in type theory).

daxfohl•1h ago

And even aside from defining your own axioms, you can have the same experience just by taking the wrong path in a proof. You'll frequently try something and the goal checker will say that the only thing left is to prove 1 < 0 or such. That's obviously impossible, but it doesn't mean the theorem you're trying to prove is wrong, it just means that you've painted yourself into a corner with your proof attempt, and need to back up and try something else.

Tainnor•1h ago

Dually, sometimes you'll have something like 1 < 0 as an assumption, which means you can close any goal.

WorldMaker•48m ago

One view of this article is that it is about doing exactly that, building an intentionally broken axiom and playing with it until it breaks down. (Shows an example "math_is_haunted" axiom that "proves" 2 + 2 = 6 and then also the contradictory state that 2 + 2 != 6.)

ants_everywhere•22h ago

> Can someone please turn this into a Zachtronics style game?!

That game is called math :) Partially joking, but I do think a game version would be fun.

> Is this what pure math is about?

More or less yes for an undergrad, but when you get to research it feels different.

> I badly, direly want this

Consider checking out an abstract algebra book. Maybe Dummit and Foote or something else popular. Proofs in algebra often have a satisfying game-like quality. The more popular books will have solutions you can find online if you get stuck.

danabramov•20h ago

>More or less yes for an undergrad, but when you get to research it feels different.

Would you mind telling more about how it feels different in research?

ants_everywhere•19h ago

You can get into a meditative zone when you're manipulating equations using techniques you've internalized.

In research you're doing a lot more of things like reading papers, putting your thoughts into words, trying to understand something the author of the paper barely understands, feeling lost and unsure where to look next. All of that can feel good too (or it cannot depending on the person) but it's a different feeling than playing a logic game.

As one example, Euclidia is a fun meditative game. But compare the difference in feeling between doing an exercise from Euclid and trying to prove the parallel postulate. It took centuries to realize you couldn't prove it, and then there was a lot of hard work trying to figure out what geometry was like if you get rid of it.

082349872349872•16h ago

> It took centuries to realize you couldn't

Millennia, even

RossBencina•12h ago

> Proofs in algebra often have a satisfying game-like quality.

Interesting. I find them banal and deeply unsatisfying.

ants_everywhere•10h ago

That's fine. Different people prefer different subjects. But IME the mode graduate student finds algebra to be an enjoyable class because of the proofs.

swagmoney1606•21h ago

In my mind this is literally what math is. We start with axioms, and derive conclusions. There's probably more to it than that, but that's the understanding I'm at now.

JonChesterfield•20h ago

Choosing the axioms is difficult.

petesergeant•16h ago

Presumably made easier by something like Lean where you can have a very minimal set of axioms, because things you might use as axioms already have proved versions, in Lean.

galaxyLogic•16h ago

But shouldn't it also be part of the axioms what are the rules that allow you to derive new theorems from them?

So then you could self-apply it and start ... deriving new rules of how you can derive new theorems and thus also new rules, from axioms?

I'm jusr confused a bit about "axioms" and "rules". What's the difference?

sroelants•9h ago

The rules that you use to compose axioms and propositions are a different set of axioms defined by the Logic system you're using. e.g., can a proof consist of infinitely many steps? Can I use the law of excluded middle? Some logic systems won't let you re-use the same proposition more than once, etc,...

They're usually considered separate, because they're orthogonal to the foundational axioms you're using to build up your mathematical systems. With the exact same system of axioms, you might be able to prove or disprove certain things using some logic systems, but not others.

danabramov•21h ago

Check out Terence Tao’s book called Analysis. It is sometimes challenging but it opened that world for me.

wging•20h ago

There is a game already, though it may not be exactly what you want (and the goal is definitely not "generate all known math"). I've played around with it, and I do think it's pretty fun.

The article mentions it, in fact: https://adam.math.hhu.de/#/g/leanprover-community/nng4

Waterluvian•19h ago

How did I miss this! Thanks for pointing it out. This is scratching the itch.

magicalhippo•17h ago

Had a blast with that game, really scratched that puzzle itch, and got to relearn some long-forgotten knowledge.

SilasX•6h ago

I played through a lot of it, and while it was fun, I wouldn't call it gamified in the sense that Zachtronics does it. As an FYI, here are some sticking points I ran into as well:

1) It uses the term "goal" for the the current stage of the proof you're working on, which is counterintuitive. In common speech, the goal is the theorem you're trying to prove, not the current step you're on in proving it. It would be like if I were helping you get to work in the morning, and I said, "Okay, you're current [sub]goal is that you're done brushing your teeth." No. That's not a goal. That's progress toward a goal.

2) It doesn't warn you early on about the left-associativeness of the operator binding, or what to do about it. So you can apply what seem like valid transformations but then parentheses pop up that you don't expect, and which you have no guidance on how to deal with. I had to ask ChatGPT what was going on to solve it.

derdi•3h ago

> It would be like if I were helping you get to work in the morning, and I said, "Okay, you're current [sub]goal is that you're done brushing your teeth." No. That's not a goal. That's progress toward a goal.

Interesting. Let's say our overall goal is to get to work showered, wearing clean clothes, and having brushed our teeth. After we showered and put on clean clothes, wouldn't you say that the remaining goal is to brush our teeth and get to work? And if we then do brush our teeth, wouldn't you say that the remaining goal is to get to work? In this scenario, if anything can be called progress, it is what's in the past (the showering etc.), not what is still to be done.

jheitmann•20h ago

Check out the game Bombe [1]. It's a minesweeper variant where instead of directly flagging or uncovering cells, you define rules for when cells can be flagged. As it gets more advanced you end up building lemmas that implicitly chain off each other. Then as _you_ get more advanced (and the game removes some arbitrary restrictions around your toolset) you can generalize your rules and golf down what you've already constructed.

[1] https://steamcommunity.com/app/2262930

hcs•18h ago

Some good mentions elsewhere in the thread, another to check out is The Incredible Proof Machine https://incredible.pm/

foooooobar•10h ago

Fun fact: the author of the The Incredible Proof Machine (Joachim Breitner) also works on Lean :-)

booleandilemma•16h ago

How are you asking questions like this and yet you're a roboticist and a principal software engineer in the autonomous mobile robotics industry.

From your questions I marked you as a high schooler.

Waterluvian•10h ago

Probably similar to how numerous maths PhDs I’ve worked with are terrible at shipping durable production code on time and on budget, or how many doctors are terrible at inserting IVs, or how an architect may likely be terrible at framing a house, or how some Grammy winning musicians can't read classical notation: your mental model for domains of expertise is probably wrong.

Tainnor•11h ago

> so are we basically compiling a dictionary of proofs all stemming from a handful of self-evident truths

I would say, "from a handful of axioms".

It's certainly true that when Euclid started this whole enterprise, it was thought thax axioms should be self-evident. But then, many centuries later, people discovered that there are other interesting geometries that don't satisfy the same axioms.

And when you get to reasoning about infinities, it's very unclear that anything about them can be considered self-evident (a few mathematicians even refuse to work with infinities, although it's definitely a very niche subcommunity).

Some of today's common axioms are indeed self-evident (such as "you should be able to substitute equal subterms"), but things like the axiom of choice have (at least historically) been much more controversial. I would probably say that such axioms can be considered "plausible" and that they generally allow us to be able to prove what we want to prove. But you'll definitely find mathematicians championing different axioms.

> Aside: I recall some famous mathematician had made a list of base proofs that you just hold to be true. Can someone remind me who, and/or what that list is called? I’m guessing they’re considered axioms.

That would be the ZFC axioms. It was originally the ZF axioms (named so after the mathematicians Zermelo and Fraenkel who worked in the early 20th century), and then later the Axiom of Choice (C) was added. It's generally considered to be the "standard" set of axioms for maths, although very few mathematicians actually work directly from these axioms. But in theory, you can take almost every mathematical proof (unless it's explicitly set in some other foundation) and recast it entirely in applications of the ZFC axioms.

Waterluvian•10h ago

Thanks for this. So I have half a thought I’m trying to flesh out from what you’ve shared. Bear with me, whoever reads this:

Are there essentially two flavours:

- the maths based on axioms that are fundamentally, cosmically true such as x = x. And in doing so we’re formalizing on paper the universal truths and all subsequent rules we know to be true given these

- the maths that incorporate those plus additional axioms that aren’t necessarily fundamentally true (or maybe just not provable), but work very well at laying a foundation for further rules that build a practically useful, but not necessarily “correct” toolbox

With the latter, is it kind of a “they won’t hold up under normal conditions but if you accept these axioms, there’s interesting things you can explore and achieve?”

Bjartr•9h ago

A thing to realize here is that there is no "fundamentally, cosmically true" in math. While math can be used to model reality, it is not bound by reality.

The only thing that matters is what you choose to take as granted for a particular question.

It's like how you can draw a map of a place that doesn't exist.

Or like coming up with rules for a game and then trying it to see how it plays.

Or it's like a material used for construction. You can build a house out of it, but there's no inherent "houseness" to it, despite how common such a use is.

Tainnor•7h ago

I would maybe rephrase it as: there are certain axioms that absolutely nobody reasonable takes issue with and then there are others that are more controversial, although it's still important to note that the vast majority of mathematicians accept the ZFC axioms + classical logic.

daxfohl•6h ago

The first set is the rules of logic, which applies in non math stuff too. From there you can add new axioms for boolean logic, for integers, for sets, for geometry, whatever.

Even given that separation (which itself is fuzzy), even the first group can't be construed as "universally true". For example there's a range of opinion around the law of the excluded middle (whether "P is not false" implies "P is true"). Most other propositional logic axioms like modus ponens are less controversial though.

As far as real world math, while ZF set theory axioms are generally viewed as the "foundation", that's due to convention more than any real primacy of ZF. Other set theories and types of "foundations" exist that seem to be just as suitable to be called a "foundation", and most math is "foundation" agnostic. Like, if all you're doing is something with prime numbers, then it doesn't matter what "foundation" you use; so long as it lets you define prime numbers, that's all that matters.

"Foundations" only come into play when you're doing really subtle things with different orders of infinities. And then, yes, the answer can be different depending on what foundation you choose. And that's fine. It doesn't mean that either foundation is wrong. They're just different. Which is why "foundation of mathematics" is a bit of a misnomer / fool's errand. Different foundations have different results on certain edge cases, and those are interesting to investigate, not something to be upset about. And like I said, most "ordinary" math is pretty foundation agnostic.

carodgers•5h ago

Take a look at zeroth-order logic.

andoando•22h ago

Is there a standard library/repository of all existing mathematical proofs one can add to?

ants_everywhere•22h ago

There are a few.

The lean one is at https://github.com/leanprover-community/mathlib4

DoctorOetker•20h ago

I possess a proof of FLT, and will publish the metamath formalization, in due time (first I need to build a secure display, for reasons that will become clear).

olddustytrail•9h ago

Fermat's Last Theorem is easily solved with css. Just do:

    margin-left: 200px;

and you're sorted.

DoctorOetker•8h ago

people jest, but they will jest less once they think a little more critically: how can cryptography experts expound the security of such and such cryptographic primitives involving discrete numbers, exponentiation, ... but be unable to come up with alternative proofs of FLT? ... but be unable to individually formalize existing supposed proof of FLT? ... but be unable to find a more succinct proof of FLT? who knows their way around the block, if the consensus is that Fermat did not possess a proof but Wiles did finally find a proof, who knows their way around the block, if a much more succinct proof than Wiles long-winded-anc-to-this-date-not-formally-verified-proof?

would you jest less?

olddustytrail•8h ago

I probably wouldn't. I'm in a jovial mood and I've managed to joke about many of the worst parts of my life.

There are many worse things happening in the world than issues in mathematics. You seem in a serious mood but these things too shall pass.

Tainnor•7h ago

> how can cryptography experts expound the security of such and such cryptographic primitives involving discrete numbers, exponentiation, ... but be unable to come up with alternative proofs of FLT? ... but be unable to individually formalize existing supposed proof of FLT? ... but be unable to find a more succinct proof of FLT?

I have no idea why you think there's a contradiction in here somewhere.

DoctorOetker•6h ago

The question is not if the described behavior is happening, but how credible their claims of cryptographic security are for the cryptographic primitives we depend on en masse.

Apart from unconditional security protocols, the safety of the cryptographic primitives is never proven, but insinuated by the lack of a public disproof.

How can consensus agreement be satisfied with the situation that 1) FLT may have been proven by Wiles 2) But has not been formally verified yet 3) We assume Fermat could not have found a proof, which insinuates that 4) a succinct proof is assumed to be impossible unless 5) we collectively underestimate Fermat / the power of individual human brains / sheer dedication 6) while pretending there is little to no connection between FLT and public key encryption schemes.

Tainnor•3h ago

I have no idea how these things are related.

In any case, to my knowledge, no cipher has ever unconditionally been proven secure except the one time pad. We just have a bunch of conditional security proof that are correct if the underlying assumptions (e.g. factoring primes is hard) are correct. Critically, I think all (?) such proofs only work if P != NP, which still remains unproven.

> 1) FLT may have been proven by Wiles

The "may" is misplaced here. Wiles's proof has been extensively reviewed, just because it hasn't been formalised doesn't mean it's wrong.

kevinbuzzard•33m ago

Indeed. I'm not formalising FLT because I think it might be wrong -- I'm formalising it because I know the proof is correct, and using the project as an excuse to get some modern number theory into Lean's mathematics library. My hope is this will increase the chances that systems like Lean will one day be able to help modern mathematicians.

nowittyusername•19h ago

I wonder suppose you are not relying on any tricks or funky shenanigans. is it possible to just throw random stuff at Lean and find interesting observations based if it approves? like use an automated system or an llm that tries all types of wild proofs/theories? and sees if it works? maybe im asking wrong questions though or not stating it well.. this is above my understanding, i could barely get my head around prolog.

Paracompact•15h ago

As someone who does certified programming professionally, I believe generative AI and formal methods are a match made in heaven. I might even go so far as to wager that the notion that human programmers will or will not be replaced by LLM-based AI is entirely dependent on whether these AI can get good at certified programming + compositional reasoning.

> is it possible to just throw random stuff at Lean and find interesting observations based if it approves?

Traditional AI has an easy time with checkers, because the search space is small. Chess is a little harder. Go still cannot be reasonably tackled by non-machine learning AI. Meanwhile, the search space (number of available moves together with the diversity of explorable states) for a sophisticated formal language becomes unfathomably large.

When the problem is known to be of a certain nature, often you can time-efficiently brute force it via SMT solvers. Traditionally SMT solvers and proof assistants have been separate branches of formal methods, but they're finally learning to play off each other's strengths (cf. Sledgehammer, Lean-SMT).

> like use an automated system or an llm that tries all types of wild proofs/theories? and sees if it works?

Research in this vein needs to be made more mainstream. I'm excited, though, that there have been big funders behind these ideas for years now, even before LLMs became big. Cf. "Learning to Find Proofs and Theorems by Learning to Refine Search Strategies" for earlier work, or DeepSeek-Prover for more recent attempts. I'm no machine learning expert, but it seems it's still a very open question how to best train these things and what their future potential is.

All in all mainstream LLMs are still rather mediocre at languages like Rocq and Lean. And when they're wrong, their proof scripts are extremely tedious to try to troubleshoot and correct. But I have hope AI tooling in formal methods will mature greatly over time.

benreesman•10h ago

I'm already doing almost all my LLM-assisted programming in Haskell, all the devops in check-heavy Nix, and starting to move everything into Dhall.

There's no way that Python written by LLMs will survive contact with Haskell written by LLMs in a standup competition.

One imagines the scope for this sort of thing goes very high in formality.

fragmede•3h ago

Serious question: Why not have it generate assembly?

Paracompact•3h ago

Yeah, it makes me sad that the skill of LLMs at a language is directly proportional to the popularity of a language, which is itself inversely proportional to the formal guardrails placed on the language by a type system.

danabramov•11h ago

This is an active area of research and experimentation!

Much of Lean community are on Zulip (which is kind of like a forum?) and you can see many relevant threads here: https://leanprover.zulipchat.com/#narrow/channel/219941-Mach...

thresholds•17h ago

Paging Terrence Howard

ljlolel•17h ago

Maybe before we have AGI we should get an AI that can translate Andrew’s proof into Lean for us. Easily tractable, checkable, useful, and build-upon-able

grumbelbart2•16h ago

Work is already underway in that direction:

https://github.com/deepseek-ai/DeepSeek-Prover-V2

but also

https://deepmind.google/discover/blog/alphaevolve-a-gemini-p...

Terence Tao is doing a lot of work in this direction.

zozbot234•12h ago

Kevin Buzzard is reportedly working on formalizing a modern proof of FLT. This effort has already managed to surface some unsound arguments in one of the prereqs for the proof (namely crystalline cohomology) https://xenaproject.wordpress.com/2024/12/11/fermats-last-th... though the issue has since been fixed.

Karliss•15h ago

Is there a way to read lean proofs noninteractively.

After playing with the natural number a game a bit, proofs quickly ended up being opaque sequences of "rw [x]" commands which felt unreadable. It's nice that the editor allows interactively viewing the state at different points, but having to click on each line ruins the flow of reading. Imagine if you had to read python code which has no indentation, braces or anything similar and only way to know where if statement ends or an else block starts is by clicking on each line. My impression might be influenced by the limited vocabulary that the early levels of natural number game provides. Does the richer toolset provided by full lean make it easier to make proofs readable without requiring you to click on each line to get the necessary context?

RossBencina•11h ago

> Is there a way to read lean proofs noninteractively.

I'd like to see an answer to this question. I was looking into it the other day.

I found this: https://xenaproject.wordpress.com/2019/02/11/lean-in-latex/ Which gives a way to do the clicking-through outside the editor. And maybe gives some insight into how Lean people see things.

danabramov•11h ago

This is a great question! I’m still not very experienced so take it with a grain of salt. But here’s my take on it.

I’ve been spending time with Lean quite a bit for the past few months. When I look at a proof, I don’t “read” it the same way as I do with programming. It feels a bit more like “scanning”. What stands out is the overall structure of the argument, what tactics are used, and what lemmas are used.

In real Lean code, the accepted style is to indent any new goals, work on one goal at a time (except a few special parallel tactics), and outdent when the goal is done. That’s what I mean by the “shape” of the argument. See some examples in this PR I’m working on: https://github.com/gaearon/analysis-solutions/pull/7/files

Once you’re familiar with what tactics do, you can infer a lot more. For example `intro` steps into quantifiers and “eats” assumptions, if I see `constructor` I know it’s breaking apart the goal into multiple, and so on.

Keep in mind that in reality all tactics do is aid you in producing a tree of terms. As in, actually the proofs are all tree-shaped. It’s even possible to write them that way directly. Tactics are more like a set of macros and DSL for writing that tree very concisely. So when I look at some tactics (constructor, use, refine, intro), I really see tree manipulation (“we’re splitting pieces, we’re filling in this part before that part, etc”).

However, it’s still different from reading code in the sense that I’d need to click into the middle to know for sure what assertion a line is dealing with. How much this is a problem I’m not sure.

In a well-written proof with a good idea, you can often retrace the shape of the argument by reading because the flow of thought is similar to a paper proof. So someone who wants to communicate what they’re doing is generally able to by choosing reasonable names, clear flow of the argument, appropriate tactics and — importantly! — extracting smaller lemmas for non-obvious results. Or even inline expressions that state hypotheses clearly before supplying a few lines of proof. On the other hand, there are cases where the proof is obvious to a human but the machine struggles and you have to write some boilerplate to grind through it. Those are sometimes solved by more powerful tactics, but sometimes you just have to write the whole thing. And in that case I don’t necessarily aim for it being “understandable” but for it being short. Lean users call that golfing. Golfed code has a specific flavor to it. Again in my experience golfing is used when a part of the proof would be obvious on paper (so a mathematician wouldn’t want to see 30 lines of code dedicated to that part), or when Lean itself is uniquely able to cut through some tedious argument.

So to summarize, I feel like a lot of it is implicit, there are techniques to make it more explicit when the author wants, it doesn’t matter as much as I expected it would, and generally as your mental model of tactics improves, you’ll be able to read Lean code more fluidly without clicking. Also, often all you need to understand the argument is to overview the names of the lemmas it depends on. Whereas the specific order doesn’t matter because there’s a dozen way to restructure it without changing the substance.

derdi•9h ago

Rocq used to have a "mathematical proof language". It's hard to find examples, but this shows the flavor (https://stackoverflow.com/a/40739190):

    Lemma foo:
      forall b: bool, b = true -> (if b then 0 else 1) = 0.
    proof.
      let b : bool.
      per cases on b.
        suppose it is true. thus thesis.
        suppose it is false. thus thesis.
      end cases.
    end proof.
    Qed.

So the idea was to make proofs read more like "manual" proofs as you would find them in math papers. Apparently nobody used this, so it was removed.

Isabelle's Isar proof language is similar, and AFAIK the standard way of proving in Isabelle (example from https://courses.grainger.illinois.edu/cs576/sp2015/doc/isar-...):

    lemma "map f xs = map f ys ==> length xs = length ys"
    proof (induct ys arbitrary: xs)
      case Nil thus ?case by simp
    next
      case (Cons y ys) note Asm = Cons
      show ?case
      proof (cases xs)
        case Nil
        hence False using Asm(2) by simp
        thus ?thesis ..
      next
        case (Cons x xs’)
        with Asm(2) have "map f xs’ = map f ys" by simp
        from Asm(1)[OF this] ‘xs = x#xs’‘ show ?thesis by simp
      qed
    qed

You spell out the structure and intermediate results you want, and the "by ..." blocks allow you to specify concrete tactics for parts where the detains don't matter and the tactic does what you want. In this proof, "simp" (a kind of simplify/reflexivity tactic) is enough for all the intermediate steps.

I don't know if there is anything like this for Lean, but maybe this provides some keywords for a web search. Or inspiration for a post in some Lean forum.

bravesoul2•11h ago

I love the article. Few can cross the bridge and describe these sorts of things in an easy to digest way. The secret is showing all the tiny steps experts might not see as it is too obvious. Thank you!

danabramov•10h ago

Thanks!

Western0•10h ago

Better https://en.wikipedia.org/wiki/Mizar_system Many books create, many proof

fjfaase•9h ago

I understand that the Mizar community is rather closed and primarily focused on extending the Mizar Mathematical Library. The Mizar proof checker is closed source.

Lean is gaining traction, which can be seen from the fact that at the moment 81 [1] of the 100 theorems of the 'Formalizing 100 Theorems' [1] have been proven in Lean, while Mizar stands at 71 [3]

[1] https://leanprover-community.github.io/100.html

[2] https://www.cs.ru.nl/%7Efreek/100/

[3] https://mizar.uwb.edu.pl/100/

zozbot234•8h ago

Mizar is actually available under a GPL 3 license (thus FLOSS) from https://github.com/MizarProject/system . There's an experimental Rust reimplementation that's at least 5x faster than the original: https://github.com/digama0/mizar-rs

derdi•9h ago

Nitpick, but it's a bit strange to say that the two_eq_two theorem looks like a function. It looks more like a constant, since it has no arguments. (Yes I know that constants are nullary functions.)

I would find the following a more convincing presentation:

    theorem x_eq_x (x:nat) : x = x := by
      rfl

    theorem 2_eq_2 : 2 = 2 := by
      exact (x_eq_x 2)

Here x_eq_x looks like a function, and in 2_eq_2's proof we apply it like a function.

danabramov•9h ago

Fair! I decided not to do that because the way arguments work (and dependent types in general — like being able to return a proof of `x = x` given x) is unusual for people like me, and deserves an article of its own. So I’m kicking that to one of the next articles.

barrenko•9h ago

As an abstract rule of thumb, how much would one have to beef up on logic before attempting to screw around analysis and something like Lean?

daxfohl•9h ago

None at all. Hit the natural numbers game that was referenced and you can start proving basic things like 2+2=4, n+0=n on up to associativity, commutativity, etc of basic operations, etc.

barrenko•6h ago

Appreciate it!

harperlee•9h ago

Perhaps this thread is a good place to ask, could anyone contribute their own opinion about the relative future of lean vs. idris/coq/agda? I want to dedicate some time to this from a knowledge representation point of view, but I'm unsure about which of them will have less risk of ending up as so many esoteric languages... I sank a lot of time on clojure core.logic for a related project and got burnt with the low interest / small community issue already, so I've been hesitant to start with any of them for some time.

armchairhacker•8h ago

IME Lean and Coq/Rocq are used more in practice, and have bigger libraries and communities, than Idris and Agda.

Rocq is the most common for program verification, but I suspect mainly due to it being older, and it has weird quirks due to its age, so Lean may catch up. Lean is the most common for proving mathematical theorems.

Big projects verified in Rocq include CompCert, CertiCoq, and sel4. Additionally, some large companies use Rocq to verify critical software like in airplanes (there's a list at https://github.com/ligurio/practical-fm although it may not be accurate). Big projects in Lean include mathlib (collection of various mathematical proofs), and the ongoing work to prove Fermat's Last Theorem (https://imperialcollegelondon.github.io/FLT/) and PFR (https://teorth.github.io/pfr/). I'm not aware of "real-world" projects in Idris and Agda but may be wrong.

That said, they're all small communities compared to something like C++ or JavaScript. Moreover, verifying programs is very slow and tedious (relative to writing them), so I wouldn't be surprised if we see a big breakthrough (perhaps with AI) that fundamentally changes the landscape. But remember that even with a breakthrough your skills may be transferable.

daxfohl•7h ago

I wouldn't put much money on any of them. IME most mathematicians aren't that interested in formalization, and the gulf between a hand written proof and and a computer verified syntax is pretty huge.

Theyre interesting to learn and play with for their own sake, but I'd be reluctant to make any bets on the future of any of them.

If I had to choose, Lean seems to have the most momentum, though the others have been around longer and each has some loyal users.

kevinbuzzard•29m ago

Most mathematicians aren't doing formalization themselves, but my impression is that a lot of them are watching with interest. I get asked "is my job secure?" quite a lot nowadays. Answer is "currently yes".

77pt77•8h ago

Archive:

https://web.archive.org/web/20250731115617/https://overreact...

ninetyninenine•8h ago

I wonder if there’s any physical phenomena that can be modeled with inconsistent math.

Like we assume everything is consistent but maybe it’s not.

Slow

Releasing open weights for FLUX.1 Krea

The Anti-Abundance Critique on Housing Is Dead Wrong

QUIC for the kernel

A Hitchhiker's Guide to the AI Bubble

Show HN: Mcp-use – Connect any LLM to any MCP

Gemini Embedding: Powering RAG and context engineering

MacBook Pro Insomnia

Ubiquiti launches UniFi OS Server for self-hosting

How was the Universal Pictures 1936 opening logo created?

Secure boot certificate rollover is real but probably won't hurt you

Show HN: Sourcebot – Self-hosted Perplexity for your codebase

Denver rent is back to 2022 prices after 20k new units hit the market

Many countries that said no to ChatControl in 2024 are now undecided

Introduction to Computer Music

I tried Servo

Kaizen (YC X25) is hiring engineers to build browser agents that work

Kaleidos – A portable nuclear microreactor that replaces diesel generators

Launch HN: Gecko Security (YC F24) – AI That Finds Vulnerabilities in Code

Show HN: AgentMail – Email infra for AI agents

We revamped our docs for AI-driven development

I made a website that makes you cry

AI is a floor raiser, not a ceiling raiser

Face it: you're a crazy person

Ferrari Status

Carbon Language: An experimental successor to C++

Astronomical Telescope “Hadley” – an easy assembly, high performance Newtonian

Benchmarking MicroPython

Zig Profiling on Apple Silicon

The Chrome Speculation Rules API allows the browser to preload and prerender

Slow

Releasing open weights for FLUX.1 Krea

The Anti-Abundance Critique on Housing Is Dead Wrong

QUIC for the kernel

A Hitchhiker's Guide to the AI Bubble

Show HN: Mcp-use – Connect any LLM to any MCP

Gemini Embedding: Powering RAG and context engineering

MacBook Pro Insomnia

Ubiquiti launches UniFi OS Server for self-hosting

How was the Universal Pictures 1936 opening logo created?

Secure boot certificate rollover is real but probably won't hurt you

Show HN: Sourcebot – Self-hosted Perplexity for your codebase

Denver rent is back to 2022 prices after 20k new units hit the market

Many countries that said no to ChatControl in 2024 are now undecided

Introduction to Computer Music

I tried Servo

Kaizen (YC X25) is hiring engineers to build browser agents that work

Kaleidos – A portable nuclear microreactor that replaces diesel generators

Launch HN: Gecko Security (YC F24) – AI That Finds Vulnerabilities in Code

Show HN: AgentMail – Email infra for AI agents

We revamped our docs for AI-driven development

I made a website that makes you cry

AI is a floor raiser, not a ceiling raiser

Face it: you're a crazy person

Ferrari Status

Carbon Language: An experimental successor to C++

Astronomical Telescope “Hadley” – an easy assembly, high performance Newtonian

Benchmarking MicroPython

Zig Profiling on Apple Silicon

The Chrome Speculation Rules API allows the browser to preload and prerender

The Math Is Haunted

Comments