I haven't gone looking, but verifier tools compatible with languages people already use (TypeScript/Rust/Go/whatever is the flavour of the month) feel like the way to go
I'm exploring it now as a way to ease colleagues into SPARK. A lot of the material appears to transfer over, and the book Program Proofs seems better to me than what I found for SPARK. I probably wouldn't have colleagues work through the book themselves so much as run a series of tutorials. We've done this often in the past when trying to bring everyone up to speed on some new skillset or tooling: if someone already knows it (or has the initiative to learn ahead of time), they run tutorial sessions every week or so for the team.
The only* practical way to approach this is exactly what Dafny does: start with a (probably) consistent core language with well-understood semantics, build a surface language on top of it with syntax features that make it more pleasant to use, prove that the surface language translates to the core language so its semantics stay clear, and then generate the desired final language once verification has succeeded.
Dafny's about the best of the bunch for this too, for the set of target languages it supports.
(It's fine and normal for pragmatic languages to be inconsistent: all that means here is that you can prove false is true, which lets you prove anything and everything trivially, but it also means you can tell the damn compiler to accept code you're sure is right even when it insists it isn't. In practice this looks like type casts, "unsafe", and the like.)
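In Lean terms (one of the systems mentioned elsewhere in the thread), "prove false, prove anything" is just ex falso, and the pragmatic escape hatch is `sorry` — a minimal sketch:

```lean
-- Ex falso: from a proof of False, any proposition follows.
example (h : False) : 1 = 2 :=
  h.elim

-- Pragmatic escape hatches (`sorry` here; casts or `unsafe`
-- elsewhere) do the same thing on demand: the checker accepts
-- the claim, and soundness is on you.
example : 1 = 2 := sorry
```

The checker flags `sorry` with a warning rather than an error, which is exactly the "accept my code, I'm sure it's right" trade being described.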
* one could alternatively put time and effort into making consistent and semantically clear languages compile efficiently, such that they're worth using directly.
You can get pretty far in Haskell (with various extensions) and Scala. But for Go/TypeScript/etc, forget about it.
Truth is that everything involves a tradeoff, and some systems are better than others at different things. Dafny explores a particular design space. Hoare-style invariants are easier to use than dependent types (as long as your SMT solver is happy, anyway); F* has those too, but it additionally lets you reach for dependent types when automatic refinement proofs become inadequate. And F* and ATS can target low-level code, more so than Dafny.
Probably I would not use ATS for anything, but between F* and Dafny the choice isn't so clear-cut (I'd most likely use F*).
And if I didn't need (relatively) low-level code, I wouldn't use either.
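The refinement-vs-dependent-types contrast can be sketched in Lean (the names `Pos` and `Vec` are mine, for illustration only):

```lean
-- Refinement style: a plain value packaged with a proof about it.
-- This is roughly what an SMT solver discharges automatically in
-- Dafny or F*.
def Pos := { n : Nat // n > 0 }

example : Pos := ⟨3, by decide⟩

-- Dependent-type style: the invariant lives in the type's index.
inductive Vec (α : Type) : Nat → Type
  | nil  : Vec α 0
  | cons : {n : Nat} → α → Vec α n → Vec α (n + 1)

-- `head` is total: the index rules out empty vectors, so no
-- runtime check (and no solver query) is needed.
def Vec.head {α : Type} {n : Nat} : Vec α (n + 1) → α
  | .cons a _ => a
```

The refinement style is more automatable; the indexed style carries the invariant structurally, which is the fallback F* offers when the solver gives up.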
I'm currently using Lean myself, but Agda's probably going to win out longer term for me simply because it compiles to GHC Haskell.
Let alone something more complex in formal verification.
This procedure is unsurprisingly called sequentialization and (somewhat less unsurprisingly) is also a pretty good approach when applied to other techniques, such as bounded model checking [2].
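A toy illustration of the idea: treat each thread as a list of atomic steps and check every bounded interleaving from a purely sequential harness. (Real sequentialization, e.g. for a bounded model checker, encodes the schedule symbolically rather than enumerating it; `interleavings` and `make_thread` are hypothetical names for this sketch.)

```python
# Toy model: each thread is a list of atomic steps (State -> State),
# and a sequential harness explores every bounded interleaving.

def interleavings(a, b):
    """Yield every merge of step lists a and b preserving each list's order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def make_thread(tid):
    # A non-atomic increment: read the shared counter, then write it back + 1.
    return [
        lambda s, t=tid: {**s, f"tmp{t}": s["c"]},
        lambda s, t=tid: {**s, "c": s[f"tmp{t}"] + 1},
    ]

violations = []
for schedule in interleavings(make_thread(0), make_thread(1)):
    state = {"c": 0}
    for step in schedule:
        state = step(state)
    if state["c"] != 2:
        violations.append(state["c"])

# 4 of the 6 interleavings lose an update, leaving c == 1.
print(violations)
```

The harness itself is ordinary sequential code, which is the point: once schedules are made explicit, any sequential analysis technique applies.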
Currently it has translation backends for C#, Java, JavaScript, Go and Python.
Lean/Agda are real programming languages, while Coq (Rocq), F*, ATS, and Isabelle/HOL all extract to various other programming languages.
Frankly, it's TLA+ that is the odd one here.
I'm more familiar with Alloy, which is a great tool for exploring a specification and looking for counter-examples that violate your specification.
AFAIK, none of the languages you listed above work well in the conceptualization phase. Are any of them capable of synthesizing counter-examples out of the box? (Aside: I feel like Lean's meta capabilities could be leveraged to do this.)
Never looked into Alloy; I guess I'll have to get an understanding of it.
How can you validate that the beautiful design phase actually maps to e.g. C code, writing data via ODBC to a SQL database, with stored procedures written in PL/SQL?
Neither the semantics of the toolchains nor the semantics of the commercial products are part of the TLA+ model as such.
Additionally, it requires someone to meticulously compare the mathematical model with the implementation code, to validate that what was written actually maps to what was designed.
Although it wouldn't work for my contrived example, at least tools like Dafny have more viability by going through "formal model => generate library for consumption", so that we get automated model-to-code validation without human intervention.
This is a deficiency in TLA+ (and many other systems), but it's not a good enough reason to discard or dismiss it. The industry alternative to TLA+ is not something that traces fully and easily from spec to code, but mostly informal, largely prose documents as the specification (if anything is specified at all). TLA+ is a massive improvement on that even if it's not a perfect tool. Same for Alloy and other systems. It's better if people model and specify at least portions of their system formally, even if it still takes effort to verify that the code correctly implements the specification; that's effort they have to expend anyway, and with greater difficulty when lacking anything approaching a formal specification.
I'd be ecstatic if LLMs could write those tactic functions. Unfortunately, they're not great at the reasoning and preciseness required (plus their training data isn't exactly overflowing with Dafny proofs).
Jtsummers•1d ago
This only had the one previous submission, but I found it interesting. The mentioned book, Program Proofs, is worth checking out if the topic and language interest you.