Formal methods only solve half my problems

https://brooker.co.za/blog/2022/06/02/formal.html
74•signa11•5d ago

Comments

HPsquared•1d ago
Maybe they solve the first 90%, but not the other 90%.
chrisaycock•1d ago
The article points out that tools like TLA+ can prove that a system is correct, but can't demonstrate that a system is performant. The author asks for ways to assess latency et al., which is currently handled by simulation. While this has worked for one-off cases, OP requests more generalized tooling.

It's like the quote attributed to Don Knuth: "Beware of bugs in the above code; I have only proved it correct, not tried it."

throw-qqqqq•1d ago
There are methods of determining Worst Case Execution Time/WCET. I’ve been involved in real time embedded systems development, where that was a thing.

But one tool (like TLA+) can’t realistically support all formalisms for all types of analyses ¯\_(ツ)_/¯

pjmlp•1d ago
From my point of view, they cannot even prove that, because in most cases nothing validates that the TLA+ model actually maps to the code that was written (e.g. in C).

I only believe in formal methods where we always have a machine validated way from model to implementation.

jgalt212•1d ago
preach
pdhborges•1d ago
Well Coq has program extraction built in.
Ericson2314•23h ago
Yeah and that's why it's way better than the likes of TLA+.
ted_dunning•22h ago
See Dafny
pjmlp•20h ago
I know it, :)
NooneAtAll3•1d ago
what is P?
aw1621107•1d ago
Looks like it's this [0]:

> Distributed systems are notoriously hard to get right (i.e., guaranteeing correctness) as the programmer needs to reason about numerous control paths resulting from the myriad interleaving of events (or messages or failures). Unsurprisingly, programmers can easily introduce subtle errors when designing these systems. Moreover, it is extremely difficult to test distributed systems, as most control paths remain untested, and serious bugs lie dormant for months or even years after deployment.

> The P programming framework takes several steps towards addressing these challenges by providing a unified framework for modeling, specifying, implementing, testing, and verifying complex distributed systems.

It was last posted on HN about 2 years ago [1].

[0]: https://p-org.github.io/P/whatisP/

[1]: https://news.ycombinator.com/item?id=34273979

whinvik•1d ago
Nice, I actually understood a lot of that post since I am trying to teach myself formal methods. Wrote up a bit here - https://vikramsg.github.io/introduction-to-formal-methods-pa...
jadbox•1d ago
Are there any good formal method tools that work well with Node.js/Bun/Deno projects?
NovemberWhiskey•1d ago
Outside of a very narrow range of safety- or otherwise ultra-critical systems, no-one is designing for actual guarantees of performance attributes like throughput or latency. The compromises involved in guarantees are just too high in terms of over-provisioning, cost to build and so on.

In large, distributed systems the best we're looking for is statistically acceptable. You can always tailor a workload that will break a guarantee in the real world.

So you engineer with techniques that increase the likelihood that workloads you have characterized as realistic can be handled with headroom, and you worry about graceful degradation under oversubscription (i.e. maintaining "good-put"). In my experience, that usually comes down to good load-balancing, auto-scaling and load-shedding.

Virtually all of the truly bad incidents I've seen in large-scale distributed systems are caused by an inability to recover back to steady-state after some kind of unexpected perturbation.

If I had to characterize problem number one, it's bad subscriber-service request patterns that don't provide back pressure appropriately. e.g. subscribers that don't know how to back-off properly and services that don't provide back-pressure. Classical example is a subscriber that retries requests on a static schedule and gives up on requests that have been in-flight "too long", coupled with services that continue to accept requests when oversubscribed.
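To make that pattern concrete, here is a minimal sketch of both halves: a subscriber that backs off with capped exponential delays and jitter instead of a static schedule, and a service that refuses work once it is oversubscribed. The names `backoff_delays` and `SheddingService` and all parameter values are hypothetical, chosen for illustration only, not taken from any particular system.

```python
import random


def backoff_delays(base=0.1, cap=10.0, attempts=5, rng=random.random):
    """Yield capped exponential backoff delays with full jitter.

    Each delay is drawn uniformly from [0, min(cap, base * 2**attempt)),
    so retries from many subscribers spread out instead of arriving in sync.
    """
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield rng() * ceiling


class SheddingService:
    """Hypothetical service that rejects work once in-flight requests hit a limit."""

    def __init__(self, max_in_flight):
        self.max_in_flight = max_in_flight
        self.in_flight = 0

    def try_accept(self):
        # Returning False is the back-pressure signal: the caller should back off.
        if self.in_flight >= self.max_in_flight:
            return False
        self.in_flight += 1
        return True

    def finish(self):
        # Call when a previously accepted request completes.
        self.in_flight -= 1
```

A subscriber would sleep for each value from `backoff_delays()` between attempts and stop retrying when the generator is exhausted; the service calls `try_accept()` before doing any work and sheds the request when it returns `False`.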

amw-zero•23h ago
I think this is less about guarantees and more about understanding behavioral characteristics in response to different loads.

I personally couldn't care less about proving that an endpoint always responds in less than 100ms, say, but I care very much about understanding where the various saturation points are in my systems, what values I should set for limits like database connections, what the effect of sporadic timeouts is, etc. I think that's more the point of this post (which you see him talk about in other posts on his blog).

NovemberWhiskey•22h ago
I am not sure that static analysis is ever going to give answers to those questions. I think the best you can hope to do is surface knowledge about the tacit assumptions about dependencies in order to explore their behaviors through simulation or testing.

I think it often boils down to "know when you're going to start queuing, and how you will design the system to bound those queues". If you're not using that principle at design stage then I think you're already cooked.
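As a rough sketch of that principle, assuming a pool of identical workers: utilization near or above 1.0 is where queuing starts in earnest, and a queue with a fixed maximum size is one way to bound it. The helper names `utilization` and `submit` and the numbers in the usage note are made up for illustration.

```python
import queue


def utilization(arrival_rate, service_time, workers):
    """Offered load per worker; sustained values near or above 1.0 mean queues grow."""
    return arrival_rate * service_time / workers


def submit(q, item):
    """Enqueue without blocking; report rejection instead of growing without bound."""
    try:
        q.put_nowait(item)
        return True
    except queue.Full:
        return False
```

For example, 80 requests/s at 10 ms each on one worker gives a utilization of about 0.8, and with `queue.Queue(maxsize=2)` a third `submit` fails immediately rather than silently adding latency.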

amw-zero•17h ago
Who brought up static analysis?

I think simulation is definitely a promising direction.
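A toy version of what such a simulation can look like, assuming a single server with exponentially distributed inter-arrival and service times (a textbook simplification, not a model of any real system). The function name `mean_wait` and its defaults are hypothetical.

```python
import random


def mean_wait(arrival_rate, service_rate, n_requests=20000, seed=0):
    """Estimate the mean time a request waits in queue at a single server.

    Uses the Lindley recursion: the next request's wait is the previous
    wait plus the previous service time minus the gap between arrivals,
    floored at zero.
    """
    rng = random.Random(seed)
    wait = 0.0
    total = 0.0
    for _ in range(n_requests):
        total += wait
        service = rng.expovariate(service_rate)
        gap = rng.expovariate(arrival_rate)
        wait = max(0.0, wait + service - gap)
    return total / n_requests
```

Sweeping `arrival_rate` toward `service_rate` shows the wait climbing sharply as load approaches saturation, which is the kind of behavioral characteristic that is hard to see from a correctness model alone.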

AlotOfReading•22h ago
It's just realtime programming. I wouldn't say that realtime techniques are limited to a very narrow range of ultra critical systems, given that they encompass everything from the code on your SIM card to games in your steam library.

> In large, distributed systems the best we're looking for is statistically acceptable. You can always tailor a workload that will break a guarantee in the real world.

This is called "soft" realtime.
NovemberWhiskey•21h ago
"Soft" realtime just means that you have a time-utility function that doesn't step-change to zero at an a priori deadline. Virtually everything in the real world is at least a soft realtime system.
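A small illustration of the difference, with the linear decay and the `grace` parameter being assumptions picked for the sketch (real time-utility functions can take any shape):

```python
def hard_utility(t, deadline):
    """Hard realtime: a result is worth nothing once the deadline passes."""
    return 1.0 if t <= deadline else 0.0


def soft_utility(t, deadline, grace):
    """Soft realtime: value decays linearly to zero over `grace` seconds of lateness."""
    if t <= deadline:
        return 1.0
    late = t - deadline
    return max(0.0, 1.0 - late / grace)
```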

I don't disagree with you that it's a realtime problem, I do however think that "just" is doing a lot of work there.

AlotOfReading•20h ago
There are multiple ways to deal with deadline misses for soft systems. Only some of them actually deliver the correct data, just late. A lot of systems will abort the execution and move on with zeros/last computed data instead, or drop the data entirely. A modern network AQM system like CAKE uses both delayed scheduling and intelligent dropping.

Agreed though, "just" is hiding quite a deep rabbit hole.

bluGill•22h ago
While you don't need performance guarantees for most things, you still need performance. You can safely let "a small number" of requests "take too long", but if you let "too many" your users will start to complain and go elsewhere. Of course everything in quotes is fuzzy (though sometimes we have very accurate measures for specific things), but you need to meet those requirements even if they are not formal.
amw-zero•23h ago
This is the single most impactful blog post I've read in the last 2-3 years. It's so obvious in retrospect, but it really drove the point home for me that functional correctness is only the beginning. I personally had been over-indexing on functional correctness, which is understandable since a reliable but incorrect system isn't valuable.

But, in practice, I've spent just as much time on issues introduced by perf / scalability limitations. And the post thesis is correct: we don't have great tools for reasoning about this. This has been pretty much all I've been thinking about recently.

adamddev1•23h ago
There could be more linear and "resource-aware" type systems coming down the pipes through research. These would allow the type checker to show performance / resource information. Check out Resource Aware ML.

https://www.raml.co/about/

https://arxiv.org/abs/2205.15211

amw-zero•17h ago
Super interesting, but I think this will be very difficult in practice due to the gigantic effect of nondeterminism at the hardware level (caches, branch prediction, out of order execution, etc.)
Ericson2314•23h ago
The author should try some more modern formal methods.

Tools like Lean and Rocq can do arbitrary math — the limit is your time and budget, not the tool.

These performance questions can be mathematically defined, so it is possible.

ted_dunning•22h ago
Indeed.

And the seL4 kernel has latency guarantees based on similar proofs (at considerable cost).

adamddev1•23h ago
There is a bunch of research happening around "Resource-Aware" type theory. This kind of type theory checks performance, not just correctness. Just like the compiler can show correctness errors, the compiler could show performance stats/requirements.

https://arxiv.org/abs/2205.15211

We already have Resource Aware ML, which

> automatically and statically computes resource-use bounds for OCaml programs

https://www.raml.co/about/
