frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

The Coming Need for Formal Specification

https://benjamincongdon.me/blog/2025/12/12/The-Coming-Need-for-Formal-Specification/
22•todsacerdoti•5h ago

Comments

readthenotes1•3h ago
Some realized that building the tests was the more important part of writing the software long ago...

Test tables and scenarios may not cover all the things that can go wrong (a la Goëdel), but not doing them almost guarantees broken software.

jackblemming•3h ago
Whenever you look closely at what these proof nerds have actually built you typically find… nothing. No offense to them, it’s simply reality.
AlotOfReading•3h ago
SeL4, a number of mathematical theorems, a bunch of cryptography. You've likely trusted your life to compcert. It's not nothing, but it's admittedly a bit limited.

Formal methods are the hardest thing in programming, second only to naming things and off by one errors.

atomicnature•2h ago
Leslie Lamport built latex, most of distributed systems such as AWS services depend on formal verification. The job of Science here is to help Engineering with managing complexity and scale. The researchers are doing their jobs
ofrzeta•2h ago
What does LaTeX have to do with TLA+? Also I think "most of distributed systems such as AWS" might be an exaggeration. At least the public known examples of formal verification in AWS are scarce.
MangoToupe•2h ago
I think the implication is that Lamport is a proof nerd, not that LaTeX has a direct relationship to proof software.
scrubs•1h ago
I see no need to dunk like that. There are ample stories over the years on HN how software orgs without using FM have bugs, waste money, treat people poorly, all leading to canceling software projects before delivering anything to customers. And I'm only mentioning just a very few issues. Software development in corps has many challenges to ROI and customer satisfaction.

I might also point out FM had a nice history of value-add in HW. And we know HW is higher quality than software.

Animats•3h ago
The trouble with formal specification, from someone who used to do it, is that only for some problems is the specification simpler than the code.

Some problems are straightforward to specify. A file system is a good example. The details of blocks and allocation and optimization of I/O are hidden from the API. The formal spec for a file system can be written in terms of huge arrays of bytes. The file system is an implementation to store arrays on external devices. We can say concisely what "correct operation" means for a file system.

This gets harder as the external interface exposes more functionality. Now you have to somehow write down what all that does. If the interface is too big, a formal spec will not help.

Now, sometimes you just want a negative specification - X must never happen. That's somewhat easier. You start with subscript checking and arithmetic overflow, and go up from there.

That said, most of the approaches people are doing seem too hard for the wrong reasons. The proofs are separate from the code. The notations are often different. There's not enough automation. And, worst of all, the people who do this stuff are way into formalism.

If you do this right, you can get over 90% of proofs with a SAT solver, and the theorems you have to write for the hard cases are often reusable.

akoboldfrying•1h ago
> Now you have to somehow write down what all that does.

I think the core difficulty is that there's no way to know whether your spec is complete. The only automatic feedback you can hope to get is that, if you add too many constraints, the prover can find a contradiction between them. But that's all (that I'm aware of, at least).

Let's take an extremely simple example: Proving that a sort algorithm works correctly. You think, "Aha! The spec should require that every element of the resulting list is >= the previous element!", and you're right -- but you are not yet done, because a "sorting algorithm" that merely returns an empty list also satisfies this spec.

Suppose you realise this, and think: "Aha! The output list must also be the same size as the input list!" And again, you're right, but you're still not done, because a "sorting algorithm" that simply returns inputSize copies of the number 42 also satisfies this new spec.

Suppose you notice this too, and think: "Aha! Every element in the input should also appear the same number of times in the output!" You're right -- and now, finally, your spec is actually complete. But you have no way to know that, so you will likely continue to wonder if there is some additional constraint out there that you haven't thought of yet... And this is all for one of the tidiest, most well-specified problems you could hope to encounter.

scrubs•1h ago
And ... so what? Nobody ever said specs are error free or complete. Heck, you didn't even get into liveness, dead lock issues.

The salient question: is risk reduced for the time alotted to write a spec in say spin or tla+?

Formal specs are risk reduction not magic.

akoboldfrying•1h ago
> And ... so what?

If you don't think clearly elucidating the specific issue holding back wider adoption of an otherwise amazing technology is relevant to this discussion, I don't know what to tell you.

bassp•1h ago
> Some problems are straightforward to specify. A file system is a good example.

I’ve got to disagree with this - if only specifying a file system were easy!

From the horse’s mouth, the authors of the first “properly” verified FS (that I’m aware of), FSCQ, note that:

> we wrote specifications for a subset of the POSIX system calls using CHL, implemented those calls inside of Coq, and proved that the implementation of each call meets its specification. We devoted substantial effort to building reusable proof automation for CHL. However, writing specifications and proofs still took a significant amount of time, compared to the time spent writing the implementation

(Reference: https://dspace.mit.edu/bitstream/handle/1721.1/122622/cacm%2...)

And that’s for a file system that only implements a subset of posix system calls!

nanolith•56m ago
I have been formally verifying software written in C for a while now.

> is that only for some problems is the specification simpler than the code.

Indeed. I had to fall back to using a proof assistant to verify the code used to build container algorithms (e.g. balanced binary trees) because the problem space gets really difficult in SAT when needing to verify, for instance, memory safety for any arbitrary container operation. Specifying the problem and proving the supporting lemmas takes far more time than proving the code correct with respect to this specification.

> If you do this right, you can get over 90% of proofs with a SAT solver

So far, in my experience, 99% of code that I've written can be verified via the CBMC / CProver model checker, which uses a SAT solver under the covers. So, I agree.

I only need to reach for CiC when dealing with things that I can't reasonably verify by squinting my eyes with the model checker. For instance, proving containers correct with respect to the same kinds of function contracts I use in model checking gets dicey, since these involve arbitrary and complex recursion. But, verifying that code that uses these containers is actually quite easy to do via shadow methods. For instance, with containers, we only really care whether we can verify the contracts for how they are used, and whether client code properly manages ownership semantics. For instance, placing an item into the container or taking an item out of a container. Referencing items in the container. Not holding onto dangling references once a lock on a container is released, etc. In these cases, simpler models for these containers that can be trivially model checked can be substituted in.

> Now, sometimes you just want a negative specification - X must never happen. That's somewhat easier.

Agreed. The abstract machine model I built up for C is what I call a "glass machine". Anything that might be UB or that could involve unsafe memory access causes a crash. Hence, quantified over any acceptable initial state and input parameters that match the function contract, these negative specifications must only step over all instructions without hitting a crash condition. If a developer can single step, and learns how to perform basic case analysis or basic induction, the developer can easily walk proofs of these negative specifications.

seg_lol•18m ago
> is that only for some problems is the specification simpler than the code.

Regardless of the proof size, isn't the win that the implementation is proven to be sound, at least at the protocol level, if not the implementation level depending on the automatic theorem prover?

rramadass•2h ago
Nada Amin's research mentioned in another thread is relevant here - https://news.ycombinator.com/item?id=46252034
seg_lol•16m ago
Nada Amin isn't relevant to hacker news, she is 15 years ahead of grifting tech bros.

OpenAI are quietly adopting skills, now available in ChatGPT and Codex CLI

https://simonwillison.net/2025/Dec/12/openai-skills/
323•simonw•8h ago•183 comments

macOS 26.2 enables fast AI clusters with RDMA over Thunderbolt

https://developer.apple.com/documentation/macos-release-notes/macos-26_2-release-notes#RDMA-over-...
372•guiand•11h ago•201 comments

Beautiful Abelian Sandpiles

https://eavan.blog/posts/beautiful-sandpiles.html
14•eavan0•3d ago•4 comments

GNU Unifont

https://unifoundry.com/unifont/index.html
202•remywang•11h ago•53 comments

1300 Still Images from the Animated Films of Hayao Miyazaki's Studio Ghibli (2023)

https://www.ghibli.jp/info/013772/
85•vinhnx•5h ago•15 comments

Rats Play DOOM

https://ratsplaydoom.com/
263•ano-ther•12h ago•94 comments

Ferrari's Formula 1 Handovers: Handovers from Surgery to Intensive Care 2008;pdf

https://gwern.net/doc/technology/2008-sower.pdf
44•bookofjoe•6d ago•13 comments

Show HN: Tiny VM sandbox in C with apps in Rust, C and Zig

https://github.com/ringtailsoftware/uvm32
119•trj•10h ago•7 comments

Sick of smart TVs? Here are your best options

https://arstechnica.com/gadgets/2025/12/the-ars-technica-guide-to-dumb-tvs/
255•fleahunter•19h ago•253 comments

Gild Just One Lily

https://www.smashingmagazine.com/2025/04/gild-just-one-lily/
13•serialx•4d ago•1 comments

Easel Now Has Stencils

https://easel.games/blog/2025-dec-update
6•BSTRhino•4d ago•1 comments

So What Should We Call This – A Grue Jay?

https://cns.utexas.edu/news/research/so-what-should-we-call-grue-jay
47•surprisetalk•5d ago•19 comments

Ensuring a National Policy Framework for Artificial Intelligence

https://www.whitehouse.gov/presidential-actions/2025/12/eliminating-state-law-obstruction-of-nati...
108•andsoitis•1d ago•170 comments

Show HN: I made a spreadsheet where formulas also update backwards

https://victorpoughon.github.io/bidicalc/
116•fouronnes3•1d ago•64 comments

50 years of proof assistants

https://lawrencecpaulson.github.io//2025/12/05/History_of_Proof_Assistants.html
80•baruchel•8h ago•12 comments

Freeing a Xiaomi humidifier from the cloud

https://0l.de/blog/2025/11/xiaomi-humidifier/
71•stv0g•1d ago•36 comments

The Coming Need for Formal Specification

https://benjamincongdon.me/blog/2025/12/12/The-Coming-Need-for-Formal-Specification/
22•todsacerdoti•5h ago•16 comments

Go is portable, until it isn't

https://simpleobservability.com/blog/go-portable-until-isnt
63•khazit•5d ago•54 comments

Apple has locked my Apple ID, and I have no recourse. A plea for help

https://hey.paris/posts/appleid/
348•parisidau•3h ago•159 comments

The Checkerboard

https://99percentinvisible.org/episode/650-the-checkerboard/
43•thread_id•7h ago•9 comments

Koralm Railway

https://infrastruktur.oebb.at/en/projects-for-austria/railway-lines/southern-line-vienna-villach/...
298•fzeindl•21h ago•176 comments

Capsudo: Rethinking sudo with object capabilities

https://ariadne.space/2025/12/12/rethinking-sudo-with-object-capabilities.html
61•fanf2•10h ago•34 comments

Doxers Posing as Cops Are Tricking Big Tech Firms into Sharing People's Data

https://www.wired.com/story/doxers-posing-as-cops-are-tricking-big-tech-firms-into-sharing-people...
46•iamnothere•3h ago•12 comments

Google Removes Sci-Hub Domains from U.S. Search Results Due to Dated Court Order

https://torrentfreak.com/google-removes-sci-hub-domains-from-u-s-search-results-due-to-dated-cour...
90•t-3•5h ago•20 comments

Slax: Live Pocket Linux

https://www.slax.org/
21•Ulf950•4d ago•3 comments

Building small Docker images faster

https://sgt.hootr.club/blog/docker-protips/
50•steinuil•21h ago•13 comments

Motion (YC W20) Is Hiring Senior Staff Front End Engineers

https://jobs.ashbyhq.com/motion/715d9646-27d4-44f6-9229-61eb0380ae39
1•ethanyu94•11h ago

Poor Johnny still won't encrypt

https://bfswa.substack.com/p/poor-johnny-still-wont-encrypt
44•zdw•4h ago•42 comments

String theory inspires a brilliant, baffling new math proof

https://www.quantamagazine.org/string-theory-inspires-a-brilliant-baffling-new-math-proof-20251212/
136•ArmageddonIt•15h ago•139 comments

Pg_ClickHouse: A Postgres extension for querying ClickHouse

https://clickhouse.com/blog/introducing-pg_clickhouse
89•spathak•2d ago•32 comments