My first verified imperative program

https://markushimmel.de/blog/my-first-verified-imperative-program/

181•TwoFx•7mo ago

Comments

Joker_vD•7mo ago

Naturally, this proof only works for arbitrary-precision integers: when you use fixed-precision integers, the algorithm will wrongfully report "false" for arrays like e.g. [INT_MIN, -1] or (if you insist on C semantics) [UINT_MAX, 1].

Hopefully the proof would break if one tried to transfer it over?

Jtsummers•7mo ago

> the algorithm will wrongfully report "false" for arrays like e.g. [INT_MIN, -1]

`INT_MIN + -1` is not 0 so it should report false in that case.

For UINT_MAX, the algorithm would need to be reconsidered, though, since it's written with signed integers in mind.

> Hopefully the proof would break if one tried to transfer it over?

Hopefully. The proof would have to be modified to account for the actual types. If you're using bounded integers you'd need to write a different proof.

derdi•7mo ago

> For UINT_MAX, the algorithm would need to be reconsidered, though, since it's written with signed integers in mind.

The algorithm is written assuming that unary - produces the additive inverse. That is also true for C's unsigned integers. -1U == UINT_MAX, -UINT_MAX == 1U. It Just Works.

junon•7mo ago

INT_MIN - 1 is undefined behavior in C.

zelphirkalt•7mo ago

Wait, so much effort and it doesn't even consider this widely known issue? That would mean, that even though all this effort has been spent, a decent programmer still has a better idea of whether something is correct than the proof system used here. And worse this might lull one into thinking, that it must be correct, while actually for a simple case it breaks.

Jtsummers•7mo ago

> a decent programmer still has a better idea of whether something is correct than the proof system used here.

The proof is correct in the language it's written for, Lean. If you change the context (axioms) of a proof then the proof may be invalidated. This is not a surprising thing to anyone who spends a second thinking about it.

Joker_vD•7mo ago

> anyone who spends a second thinking about it.

Except most programmers don't spend even a second to think about it, and we end up with "int mid = (low + high) / 2;" bugs in standard implementations of binary search in e.g. Java. And that implementation even had a written proof accompanying it!

Jtsummers•7mo ago

Try that in SPARK/Ada. It'll stop you there [if it can't prove that low + high won't overflow]. Don't take a proof written with one set of assumptions (in this case how integers are expected to behave) and translate it to another language where those assumptions don't hold.

zelphirkalt•7mo ago

Are writing our next program in Lean then? Where does that run?

There seems to be a fundamental difficulty here. Either we prove things in the language we want to use, which means modelling the behavior of the things we use in that language, or we prove things in Lean, but then cannot apply that to an actual implementation, because of issues like the one above.

I would be surprised, if there was no standard approach for modelling bounded integers and their specific properties in a language (which can differ) in a proof language like this. There must have been more people having thought about this and come up with solutions.

bollu•7mo ago

yes, Lean is executable, and the proof of natural numbers runs with arbitrary width integers. they're stored as tagged pointers, with upto 63bit numbers being normal numbers, and larger numbers become GMP encoded.

Jtsummers•7mo ago

> Are writing our next program in Lean then? Where does that run?

That first question is hard to parse. If you mean "Are you writing your next program in Lean then?" then: No, but in principle we could, it runs on each OS we use (Windows and Linux, standard x64 hardware). If you mean something else, I can't figure out what it would be.

> Either we prove things in the language we want to use

Sure, I mentioned SPARK/Ada. There are systems for C, Java, and others that also work and understand their types so you don't have to add extra modeling.

> which means modelling the behavior of the things we use in that language

It would already be done for you, you wouldn't have to model Ada's integers in SPARK, for instance.

> we prove things in Lean, but then cannot apply that to an actual implementation, because of issues like the one above.

https://lean-lang.org/doc/reference/latest/Basic-Types/Fixed...

If you knew your target system was using fixed-width integers, you'd use this.

MobiusHorizons•7mo ago

What hardware doesn’t use fixed width integers?

Jtsummers•7mo ago

What's the context of your question? Did I say otherwise?

Some languages (what's being discussed here) give you arbitrary precision integers, like Lean. So the proof in the blog applies to Lean and any issues with things like -INT_MIN not existing isn't a factor in the proof. That's what's being discussed. The proof for C or Ada or something else with fixed-width integers will be different (and the algorithm, likely) to account for that difference and any other relevant differences.

MobiusHorizons•7mo ago

> If you knew your target system was using fixed-width integers, you'd use this.

Usually `system` in this context refers to the machine you run the program on, not the language itself, or at least that is how I interpreted it. I guess there used to be systems with BCD math supported by hardware, and I guess those could have variable precision. I'm not sure if it was arbitrary (ie it could still overflow), but one could presumably detect overflow and increase the memory allocation. But once you have a software system doing your math rather than a machine instruction it seems like you need to prove the correctness of that implementation not just the general concept.

fc417fc802•7mo ago

A language can provide arbitrary width integers while nonetheless being implemented on top of fixed width integers. In fact I'd say that's pretty typical.

MobiusHorizons•7mo ago

I mean sure, but at that point it's a library, so don't you need to prove the correctness of that library as well?

fc417fc802•7mo ago

To the same extent that you need to prove the correctness of the compiler and the language runtime I guess. Arbitrary width integers are a core feature of quite a few languages.

Given Lean's raison d'etre it wouldn't surprise me if the relevant library code was verified. If using another language such as Python I think it would be fairly reasonable to take the underlying implementation on faith. If we're going to start questioning whether a core part of the runtime that implements basic arithmetic operations has bugs then why are we not also suspecting the operations provided by the CPU and other hardware?

We probably should suspect those things, but as a matter of practicality we're going to need a boundary at present.

MobiusHorizons•7mo ago

Fair enough. I must just not be understanding what the motivation for creating proofs of such software is then. I guess to me assuming the compiler or language runtime have no bugs seem to severely undermine the value of the proof. I feel like if the proof was to have value in an imperative language like C or C++ I would want the assembly proven, as the resulting computation can be drastically altered from the original source text between various compiler flags and optimization levels.

aseipp•7mo ago

It does not look like it due to its largest target audience (hardcore math nerds writing proofs), but Lean 4 is actually a meta-programming language, like Racket -- it is one single unified language for programming, meta programming, and theorem proving, on top of a small core. The implementation of Lean is almost entirely written in Lean, including components like LSP servers, the compiler itself, many efficient data structures, etc. It is a key use case to write real world programs, because the implementation itself is such a program.

In the long past, Lean 3 had this idea that you could use one language both for writing proofs, and writing proof automation -- programs that manipulate proofs and automatically do things. Like in the article itself, the 'grind' thing-a-mabob is proof automation. (Roqc has two separate languages for proofs and proof automation.) But there was a problem: Lean started as a theorem prover and the implementation tried to move towards "executing programs" and it didn't work very well and was slow. The prototype compiler from Lean 3 to C++ ran out of steam before Lean 3 got canned.

Lean 4 instead went and did things the other way around: it started as a programming language that was executable. The Lean 4 compiler was self-hosted during development for a long-time, way before anyone ported any big math proofs to it from Lean 3. Why did they do this? Because the key insight is that if you want to write programs that manipulate proofs (AKA programs), the best thing to have at hand is have a robust general programming language -- like Lisp or Racket. And so Lean is macro based, too, and like Racket it allows you to have families of languages and towers of macros that expand as deep as you want. Meta programs that write programs that write programs, etc...

So in Lean you write Lean programs that can manipulate Lean ASTs, and you write "reader macros" that allow you to use domain specific syntax right inside the source file for any kind of math or programming-language DSL you want. And all those macros and meta-programs are compiled and executed efficiently just like you'd expect. Finally, there is a total fragment of the language that you can actually write proofs over and reason about. And there is a big library called "mathlib" with lots of macros for writing math and math proofs in this language-framgent-we-can-reason-and-prove-things-with.

Lean 4 is very close in spirit to the Lisp idea, but modified for math proving and generalized programming. It's very unique and powerful.

ykonstant•7mo ago

Lean is actually pretty easy to learn and code in; also, if you are not interested in extreme performance, it is pretty much ready for basic applications (not sure about async and co.).

The problem is that you want to be doing all your coding monadically (so that it looks like the familiar imperative style), and I think there are still many problems about doing proofs on monadic code.

touisteur•7mo ago

The most interesting part of SPARK for me is how all the runtime checks are implicitly, automatically generated and need to be proved.

A whole bunch of assumptions about the language are hidden and the verification conditions are generated automatically for the underlying automatic or semi-automatic proof tools (why3-intermediated) - the second best part of SPARK.

You have to trust that the SPARK FrontEnd makers got it right - or you can inspect/review all the discharged VC if you want - and you still have to actually prove or help prove them all, but I'm not losing sleep over a forgotten 'normal' check.

lmm•7mo ago

> Except most programmers don't spend even a second to think about it, and we end up with "int mid = (low + high) / 2;" bugs in standard implementations of binary search in e.g. Java. And that implementation even had a written proof accompanying it!

I see this as far more of an indictment of Java than an indictment of programmers. If you make the symbol "+" part of your programming language's syntax (especially in a language where you don't allow custom implementations of "+") then it should do what the ordinary mathematical + operation does!

cjfd•7mo ago

Do you realize that making + do what the ordinary mathematical + does, is quite expensive? It requires arbitrary precision arithmetic and hence, potentially, memory allocations.

lmm•7mo ago

> Do you realize that making + do what the ordinary mathematical + does, is quite expensive? It requires arbitrary precision arithmetic and hence, potentially, memory allocations.

Yes I do. It's an acceptable cost in most cases (you will note much of the world runs on Python, a language in which + does behave sensibly, these days), and dangerous optimisations should be opt-in rather than opt-out.

In the worst case you should at least make your language fail-stop definite by making it error when the result of + is too large rather than silently continuing in an invalid state.

Tainnor•7mo ago

> In the worst case you should at least make your language fail-stop definite by making it error when the result of + is too large rather than silently continuing in an invalid state.

That's what Swift does fwiw. But there are downsides to that too (although they're also related to Swift not having a good fault recovery mechanism).

In practice, whenever you know you're going to use large enough numbers in Java, you probably want to use BigInteger or BigDecimal (e.g. in banking). Unfortunately, Java doesn't allow you to use the + operator with it.

lmm•7mo ago

> whenever you know you're going to use large enough numbers in Java, you probably want to use BigInteger or BigDecimal

When you've made a conscious decision you can pick the right thing, sure. The problem usually happens when you haven't thought about it at all, which is why the default should be safe and the unsafe optimisation should be opt-in.

Tainnor•7mo ago

But "safe" and "unsafe" are not a binary thing.

Maybe for LOB applications, it would be better if languages defaulted to arbitrary precision arithmetic. But scientific computing is also a huge field that often uses the same languages and there, arbitrary precision is often the completely wrong tool, e.g. it would make certain key algorithms (like Gaussian elimination) exponential.

I feel like this is just one of these things that developers should know about so they can make the correct choice, just like DB indices.

lmm•7mo ago

> arbitrary precision is often the completely wrong tool, e.g. it would make certain key algorithms (like Gaussian elimination) exponential.

Sure, but even in science taking a lot of time to deliver a result, or failing to deliver one at all, is much safer than silently delivering the wrong result. There's the concept of fail-stop if you want a rigorous approach to safety. There's no analogous safety model that says silent overflow is the safe option.

DavidVoid•7mo ago

> the algorithm will wrongfully report "false" for arrays like e.g. [INT_MIN, -1]

If you have INT_MIN along with any other negative number in the array then your program has undefined behavior in C. Signed integer overflow is UB (but unsigned overflow is not).

addaon•7mo ago

> If you have INT_MIN along with any other negative number in the array then your program has undefined behavior in C.

What? Why? There’s no addition needed to solve this problem. The example implementation does invert each element, which is undefined for INT_MIN, but it would be trivial to just skip INT_MIN elements (since their additive inverse is never in the set).

raphlinus•7mo ago

Yes. The problem here is the -x operation. If INT_MIN is in the array, then the negation operation itself is UB. As you say, the fix is to skip values equal to INT_MIN; it's not possible that its negation is in the array, as that number is not representable.

Rust is only a little better. With default settings, it will panic if isize::MIN is in the input slice in a debug build, and in a release build will incorrectly return true if there are two such values in the input. But in C you'll get unicorns and rainbows.

addaon•7mo ago

> But in C you'll get unicorns and rainbows.

But in formally verified C you’ll have the algorithm that’s correct for that type (skip INT_MIN), or you won’t have a proof.

ykonstant•7mo ago

But I like unicorns and rainbows! ...unless my name is Charlie 「(°ヘ°)

andrepd•7mo ago

But false is the correct result for those cases. Addition is addition and overflow is undefined (= can assume that doesn't happen), it's not addition modulo 2^n.

Joker_vD•7mo ago

We are not talking about C here. Imagine it was e.g. Java, or C#, or Rust in release mode, or heck, even Lean itself but with fixed-precision integers.

alternatex•7mo ago

Overflow/underflow does happen in C# though. You have to manually opt-out by using a `checked` block anywhere you do arithmetic.

necunpri•7mo ago

This is the strength of typing, right?

If I can specify the type of my input I can ensure the verification.

sureglymop•7mo ago

How so? If you specify the type int how would that save you from an overflow/underflow at runtime?

alternatex•7mo ago

Not sure if types are supposed to protect against overflow/underflow, it's more in the arithmetic territory. Interesting idea though, I wonder if some programming language allows overflow/underflow checks directly in the types.

Joker_vD•7mo ago

Yes, it's actually quite easy:

    +: (int<N>, int<N>) -> int<N+1>    // or (int<N>, bit) as the return type
    *: (int<N>, int<N>) -> int<2*N>

etc. Those are the actual types of those arithmetic operations.

ethan_smith•7mo ago

Good point about overflow - Lean can actually model machine integers with bounded arithmetic operations, allowing you to formally verify these edge cases by explicitly reasoning about overflow behavior.

rck•7mo ago

Do you know of any short examples of this? Yesterday I was trying to prove some "easy" theorems that involved machine number representations, and I couldn't find anything in Lean.

SUPERX_112•7mo ago

nice

revskill•7mo ago

U must have at least 5 characters to get an upvote.

munchler•7mo ago

Lean is awesome and this is an impressive new feature, but I can't help but notice that the proof is significantly longer and more complex than the program itself. I wonder how well this will scale to real-world programs.

grumbelbart•7mo ago

Long-term this would be done using LLMs. It would also solve LLMs' code quality issues - they could simply proof that the code works right.

codebje•7mo ago

Maybe very long term. I turn off code assistants when doing Lean proofs because the success rate for just suggestions is close to zero.

photonthug•7mo ago

> simply proof that the code works right

Combining LLMs + formal methods/model checkers is a good idea, but it's far from simple because rolling the dice on some subsymbolic stochastic transpiler from your target programming language towards a modeling/proving language is pretty suspect. So suspect in fact that you'd probably want to prove some stuff about that process itself to have any confidence. And this is a whole emerging discipline actually.. see for example https://sail.doc.ic.ac.uk/software/

amw-zero•7mo ago

Research points to there being a quadratic relationship between automated proof and code size: https://trustworthy.systems/publications/nictaabstracts/Mati....

Specifically, the relationship is between the _specification_ and the proof, and it was done for proofs written in Isabelle and not Lean.

The good news is that more and more automation is possible for proofs, so the effort to produce each proof line will likely go down over time. Still, the largest full program we've fully verified is much less than 100,000 LOC. seL4 (verified operating system) is around 10,000 lines IIRC.

tsimionescu•7mo ago

A small note, but seL4 is more of a kernel than a full OS - though in the embedded world for which it is targeted, the difference is pretty fuzzy.

armchairhacker•7mo ago

Real-world programs can be verified by formally proving properties on a small part of the code (called the kernel) in a way that transitively guarantees those for the remaining code.

For example, Rust's borrow checker guarantees* memory safety of any code written in Rust, even a 10M+ LOC project. Another example is sel4, a formally-verified micro-kernel (https://sel4.systems/About/seL4-whitepaper.pdf).

* Technically not; even if the code doesn't use `unsafe`, not only is Rust's borrow checker not formally verified, there are soundness holes (https://github.com/rust-lang/rust/issues?q=is%3Aopen%20is%3A...). However, in theory it's possible to formally prove that a subset of Rust can only encode memory-safe programs, and in practice Rust's borrow checker is so effective that a 10M+ LOC project without unsafe still probably won't have memory issues.

serbuvlad•7mo ago

What's a memory issue?

If I access beyond the end of an array in Rust, the panic handler runs and starts unwinding my stack. If I access beyond the end of an array in C++ with .at() the excwption handler runs and starts unwining my stack. If I access beyond the end of an array in C the SIGSEGV handler may (*) run and I could, if I wanted to, start unwinding my stack.

Ah, but in C, sometimes if I access the wrong memory, I get garbadge instead of a panic.

Sure, and if I store my data in a Rust array and store indexes into that array around the place as sort of weak references (something I've seen Rust programmers use and talk about all the time), I can easily fetch the wrong data too.

Rust provides a robust type system and a borrow checker which avoids a lot of common problems at the expence of adhering to a particular programming style. That's fine. That's worth advocating for.

But it's no pannacea. Not even close.

My favorite memory about this is a programmer lambasting Go's strings (which are basically immutable byte vectors) for not enforcing UTF-8, like Rust strings.

He then said that this means that in Go you can print filenames to the screen that can break your terminal session because of this if they contain invalid UTF-8, which Rust forces you to escape explicitly. The irony, of couse, is that the characters that can break your terminal session are perfectly valid UTF-8.

Rust's type safety convinced this guy that his Rust program was immune to a problem that it was simply not immune to.

caim•7mo ago

Funny thing is that you can get undefined behavior and segfaults using only "safe rust", and the rust compiler has subtle bugs that allow you to disable important checks (like type checking), which can leave your code completely broken.

But for some crazy propaganda, rust devs believes that any rust code is safe and sound no matter what.

https://github.com/Speykious/cve-rs/issues/49

lmm•7mo ago

> Sure, and if I store my data in a Rust array and store indexes into that array around the place as sort of weak references (something I've seen Rust programmers use and talk about all the time), I can easily fetch the wrong data too.

Maybe, but you'll do so in an understandable way that you can catch in testing. You won't suddenly start fetching different wrong data when someone builds your program with a newer version of the compiler, which is a very real risk with C. To say nothing of the risk of arbitrary code execution if your program is operating on attacker-supplied data.

> The irony, of couse, is that the characters that can break your terminal session are perfectly valid UTF-8.

Terminals that can't handle the full UTF-8 range are a problem with those terminals IMO. And terminals implemented in Rust probably don't have that problem :).

tsimionescu•7mo ago

> Terminals that can't handle the full UTF-8 range are a problem with those terminals IMO. And terminals implemented in Rust probably don't have that problem :).

No, it isn't, and yes, they would. The problem is that the terminal accepts certain valid UTF-8 characters (typically from the ASCII subset) as output control characters. This is how you get things like programs that can output colored text.

This is a part of the fundamental design of how a terminal device is supposed to work: its input is defined to be a single stream of characters, and certain characters (or sequences of characters) represent control sequences that change the way other characters are output. The problem here is with the design of POSIX in general and Linux in particular - the fact that, despite knowing most interaction will be done through a terminal device with no separate control and data channels, they chose to allow control characters as part of file names.

As a result of this, it is, by design, impossible to write a program that can print out any legal file name to a terminal without risking to put the terminal in a display state that the user doesn't expect. Best you could do is recognize terminal control sequences in file names, recognize if your output device is a terminal, and in those cases print out escaped versions of those character sequences.

fc417fc802•7mo ago

I don't think it's Linux so much as it is any given filesystem implementation. As I understand it validation is entirely up to the filesystem itself. I could be mistaken but I don't believe there's anything stopping you from implementing a filesystem that uses raw binary data for filenames.

There's also the question of what happens if the data structures on disk become corrupted. The filesystem driver might or might not validate the "string" it reads back before returning it to you.

tsimionescu•7mo ago

Linux itself exposes various syscalls that operate with filenames, userland programs can't interact directly with the FS driver. But Linux chose to implement only 2 restrictions at the syscall level (slash used to separate elements of the path and NULL used to mark the end of the input). The kernel will resolve the path to a particular file system and send a subset of the path to the corresponding FS driver exactly as it received them, and the FS can choose whether to accept or reject them. Most Linux FSs don't apply any extra restrictions either. The main exceptions are FSs written to interface with other systems, such as CIFS or SMB, which additionally apply DOS/Windows filename restrictions by necessity.

If Linux had chosen to standardize file names at the syscall level to a safe subset of UF-8 (or even ASCII), FS writers would never have even seen file names that contain invalid sequences, and we would have been spared from a whole set terminal issues. Of course, UTF-8 was nowhere close to as universally adopted as it is today at the time Linux was developed, so it's just as likely they might have standardized to some subset of UTF-16 or US-ASCII and we might have had other kinds of problems, so it's arguable they took the right decision for that time.

lmm•7mo ago

> the terminal accepts certain valid UTF-8 characters (typically from the ASCII subset) as output control characters. This is how you get things like programs that can output colored text.

The terminal should not allow such a sequence to break it. Yes, being able to output colour is desirable, but it shouldn't come at the cost of breaking, and doesn't need to. (Whereas it is much less unreasonable for a terminal to break when it's sent invalid UTF-8).

> This is a part of the fundamental design of how a terminal device is supposed to work: its input is defined to be a single stream of characters, and certain characters (or sequences of characters) represent control sequences that change the way other characters are output.

"Design" and "supposed to" are overstating things. It's a behaviour that some terminals and some programs have accreted.

> it is, by design, impossible to write a program that can print out any legal file name to a terminal without risking to put the terminal in a display state that the user doesn't expect

I would not say by design, and I maintain that the terminal should handle it.

tsimionescu•7mo ago

I believe you're misunderstanding the problem. The terminal doesn't "break" in the sense that it crashes or does something undefined for those cases. The terminal is doing something that is completely meaningful and well defined and probably has some realistic use cases, such as switching to a different character encoding.

The only problem is that it's not what the user wanted to happen. For a simple example, if a file name contains the control sequence for starting a block of red text, and you print that file name as is in a terminal, you'll, (1), see a truncated file name (that is, copying the text from terminal will not give you the actual file name, since the control characters will be entirely missing), and (2) all future text will be red.

The terminal has done nothing wrong in this case: it used its normal logic for turning text red. The file name is not in any way wrong - it's a completely valid Linux and ext4 file name. The program is not necessarily doing anything wrong - perhaps it was never designed to print to a terminal. But the overall interaction produces the wrong results.

lmm•7mo ago

> I believe you're misunderstanding the problem. The terminal doesn't "break" in the sense that it crashes or does something undefined for those cases. The terminal is doing something that is completely meaningful and well defined and probably has some realistic use cases, such as switching to a different character encoding.

I'm aware of the details, but I think sometimes that knowledge leads people to miss the forest for the trees. If the user perceives the terminal as having "broken", that's a case of poor UX design at a minimum. Given that users can readily distinguish between legitimate coloured output etc. and terminals getting into a poor state, it really shouldn't be too hard for the terminal itself to do so. (E.g. it's pretty normal for today's terminals to display some kind of visible warning (complete with resume button) when you press Ctrl-S, rather than simply silently stopping). And while this is a much fuzzier and more contentious claim, I think the Rust community's mentality (as seen in e.g. their approach to compiler errors) nudges people towards such approaches.

tsimionescu•7mo ago

This is an entirely new claim - that the Terminal should try to understand what its input means. You can go ahead and try to implement this - I think you'll easily see that it's extremely hard to do so.

A terminal is basically a function foo(char_stream) = formatted_char_stream. It has no idea whatsoever what the input means, or what the output is supposed to mean. Your Ctrl-S example is completely in line with this: it's one control code that the Terminal chooses to display/interpet in a certain way (older terminals would just stop, newer terminals display some warning text and wait for user input). Recognizing that the start-red-output control sequence should not be interpreted as a control sequence if it's coming from the output of `ls` is a fundamentally different type of change.

Would it be nice to have a different concept, a CmdDisplayer that takes as input (commands, command text, control text) and outputs formatted text while understanding its input? Maybe. But it wouldn't be a terminal, and it would require a fundamental redesign of every single program that wants to meaningfully interact with it - especially all shells and any TUI program that might make the most use of such a tool.

lmm•7mo ago

> This is an entirely new claim - that the Terminal should try to understand what its input means.

I don't see it as a different claim. My position is that the terminal should not move into a bad state for any valid input, and that a state that the user understands as being "broken" is probably a bad state. To my mind this should not require detailed understanding of what each program is doing, which I agree is difficult, but just some more basic things like making it clearer to the user why their terminal is in a funny state (and how to undo it), or perhaps ensuring that the terminal reverts to a good state whenever a program that had changed its state exits.

tsimionescu•7mo ago

Having a way to return to the default would be nice, but it wouldn't fix the base problem that a program can't simply print a file name to stdout if it thinks stdout might be connected to a terminal. Even if the terminal didn't "break", it would still not display the file name in a useful way (e.g. a user couldn't copy paste it from there).

lmm•6mo ago

> it wouldn't fix the base problem that a program can't simply print a file name to stdout if it thinks stdout might be connected to a terminal. Even if the terminal didn't "break", it would still not display the file name in a useful way (e.g. a user couldn't copy paste it from there).

Well, I'm all for having something like a "raw copy" function in the terminal, but if a filename contains characters that can't reasonably be copy-pasted then there's not really anything that can be done about that (other than "don't give your files silly names"). I think having the terminal get into a bad state is a far worse problem; weird/corrupted filenames are something that happens and people generally figure out a way to deal with them.

fc417fc802•7mo ago

It's even worse than that. If a newer version of the compiler is able to leverage knowledge of the array bounds it could "optimize" away an entire chunk of your program. It probably won't do that because compiler authors supposedly aren't openly hostile towards compiler users but it isn't so easy to write an algorithm that will flag such "obviously" wrong things.

The control characters are themselves valid (but unprintable) UTF-8. They are also, against all common sense and reason, permitted within filenames by many filesystems. Rust won't save you here. https://www.compart.com/en/unicode/category/Cc

serbuvlad•7mo ago

> Maybe, but you'll do so in an understandable way that you can catch in testing. You won't suddenly start fetching different wrong data when someone builds your program with a newer version of the compiler, which is a very real risk with C. To say nothing of the risk of arbitrary code execution if your program is operating on attacker-supplied data.

What guarantees it? Literally nothing. You can catch errors in testing in C as well. Yeah, in C you get "different" data on a different version of the compiler, but you get garbage data in all versions of the compiler and Valgrind flags that in testing.

And, of course, you can get arbitrary code execution if your program is operating on data coming from multiple users of different privilege levels in Rust if you use vectors like that.

Sure, Rust fixes a lot of easy to commit bugs in C and C++, absolutely. But there are no absolutes.

lmm•7mo ago

> What guarantees it? Literally nothing.

The behaviour of looking up index x in array y in rust is well-defined, and consistent between compiler versions. Maybe you use the wrong index and get the wrong customer's data or something, but you'll still get the data at that index (or a panic if the index is invalid).

> You can catch errors in testing in C as well. Yeah, in C you get "different" data on a different version of the compiler, but you get garbage data in all versions of the compiler and Valgrind flags that in testing.

Not always. It's very common for code to be broken according to the standard but do the right thing in some compilers, and then in a different compiler it does something completely bizarre. The code might not do the same thing in testing as it does in release. And while Valgrind does a lot of good for the minority of C programmers who use it, it's far from 100% reliable.

> you can get arbitrary code execution if your program is operating on data coming from multiple users of different privilege levels in Rust if you use vectors like that.

You might in some cases, but it's a lot tricker. You can get the program to operate on different parts of its data than it was intended to, but to go from there to running arbitrary code will still be a significant leap and require an exploitation technique specific to that particular program. Whereas the techniques for going from common classes of C vulnerabilities (e.g. buffer overflow or use after free) to arbitrary code execution are practically textbook at this point.

rtpg•7mo ago

My belief is that the core part of a lot of programs where you want formal proofs are tricky enough to where you need hand holding but the higher up the stack you go the simpler your proofs get since you did the hard stuff in the core.

Though it really does feel like we're still scratching the surface of proof writing practices. A lot of proofs I've seen seem to rely only on very low level building blocks, but stronger practitioners more immediately grab tools that make stuff simpler.

I would say, though, that it feels likely that your proofs are always going to be at least within an order of magnitude of your code, because in theory the longer your code is the more decision points there are to bring up in your proof as well. Though automatic proof searches might play out well for you on simpler proofs.

tsimionescu•7mo ago

> My belief is that the core part of a lot of programs where you want formal proofs are tricky enough to where you need hand holding but the higher up the stack you go the simpler your proofs get since you did the hard stuff in the core.

I don't think this is true at all. For many kinds of programs that it would be good to have formal verification for, all of the details are very important. For example, I'd love to know that the PET scan code was formally verified, and that it's impossible to, say, show a different radiation dose on the screen than the dose configured in the core. I very much doubt that it's easy to write a proof that the GUI controls are displaying the correct characters from a font, though.

Or, it would be good to know that the business flows in the company ERP are formally verified to actually implement the intended business processes, but even the formal specification for the business processes in a regular large company would likely take longer to produce than it takes for those same business processes or external laws to change.

norir•7mo ago

My brain has been slowly trained to reject imperative programming. This example could be rewritten in a tail recursive manner using an immutable set which would be simpler to verify for correctness even without a formal verifier.

I have found that while there is a learning curve to programming using only recursion for looping, code quality does go significantly up under this restriction.

Here is why I personally think tail recursion is better than looping: with tail recursion, you are forced to explicitly reenter the loop. Right off the bat, this makes it difficult to inadvertently write an infinite loop. The early exit problem is also eliminated because you just return instead of making a recursive call. Moreover, using recursion generally forces you to name the function that loops which gives more documentation than a generic for construct. A halfway decent compiler can also easily detect tail recursion and rewrite it as a loop (and inline if the recursive function is only used in one place) so there need not to be any runtime performance cost of tail recursion instead of looping.

Unfortunately many languages do not support tail call optimization or nested function definitions and also have excessively wordy function definition syntax which makes loops more convenient to write in those languages. This conditions one to think in loops rather than tail recursion. Personally I think Lean would be better if it didn't give in and support imperative code and instead helped users learn how to think recursively instead.

kevindamm•7mo ago

Which languages do support TCO at this point? From my recollection we have

* Scheme

* Haskell

* Elixir

* Erlang

* OCaml

* F#

* Scala

* (not Clojure)

* the JVM could remove tail-recursive calls, but IIRC this still hasn't been added for security reasons

* Racket

* Zig

* Lua

* Common Lisp, under certain compilers/interpreters

* Rust? (depends)

* Swift? (sometimes)

louthy•7mo ago

> * F#

The .NET CLR supports the ‘.tail’ opcode which means that any .NET based language could support it. I’m hoping one day the C# team will get around to it. It seems like such low hanging fruit.

nhubbard•7mo ago

Kotlin as well, through the ‘tailrec’ marker on a function.

kevindamm•7mo ago

ah, thanks, good to know.. but does that make it optional? I kind of like how ocaml requires a letrec annotation on any recursive definition and I don't know when you wouldn't want to add tailrec

alexisread•7mo ago

Freeforth (implicit) and Ableforth (deliberately explicit)

taeric•7mo ago

I don't understand the security reasons on not removing tail calls. Any chance you have a good place to read up on that?

kevindamm•7mo ago

It was raised in one of the initial proposals, back in 2002

https://bugs.java.com/bugdatabase/view_bug?bug_id=4726340

but that looks like a dead link and no wayback archive..

IIRC, basically it's because some parts of the JVM use stack unwinding to figure out what userland code is calling certain system code.. also the current stack frame has metadata about lock status used for allowing re-entrant locks that you lose if you elide the entire recursive call (which the initial proposal did by only removing the few bytecode instructions that set up the callstack frame and return from it).

A more informal proposal from ~2016 allows for soft tail calls and hard (annotated) tail calls, with some restrictions that evidently avoid issues with system calls and lock/reentry maintenance:

https://web.archive.org/web/20161112163441/https://blogs.ora...

And a video by one of the JVM architects at Oracle about adding TCO for Scala

https://www.youtube.com/watch?v=2y5Pv4yN0b0&t=1h02m18s

Also previously featured here on HN, a way to do it that avoids security concerns, by using goto instead of strictly deleting bytecode instructions:

https://news.ycombinator.com/item?id=22945725

zabzonk•7mo ago

C++, depending on compiler and other stuff.

geoffhill•7mo ago

Both Clang and GCC have musttail attributes than can force tail calls at specific return statements in C/C++.

salutis•7mo ago

Also Fennel, both implicitly and explicitly with `tail!`.

Source: https://fennel-lang.org/reference#tail

Quekid5•7mo ago

It's worth noting that some (many?) languages[0] only support TCO as long as you're calling the function itself in tail position. The usual cases were you'll notice this when implementing state machines in "direct style" or when doing continuation-passing style for control flow.

TCO is more general than that in some languages where any function call in tail position can be turned into a direct jump. This obviously requires either 1) runtime support in some form or 2) a non-trivial amount of program transformation during compilation.

[0] Scala's @tailrec is one I'm 100% certain of.

louthy•7mo ago

I agree, but also folds, traversals, list-comprehensions, and recursion-schemes work well and can be even more resistent to common bugs than regular recursion.

Although it’s hard to fault the simple elegance of recursion!

taeric•7mo ago

This feels overly strong? I've certainly messed up my fair share of recursive calls.

I don't know why, but I have actually gotten a bit stronger on the imperative divide in recent years. To the point that I found writing, basically, a GOTO based implementation of an idea in lisp to be easier than trying to do it using either loops or recursion. Which, really surprised me.

I /think/ a lot of the difference comes down to how localized the thinking is. If I'm able to shrink the impact of what I want to do down to a few arguments, then recursion helps a ton. If I'm describing a constrained set of repetitive actions, loops. If I'm trying to hold things somewhat static as I perform different reduction and such, GOTO works.

I think "functional" gets a bit of a massive boost by advocates that a lot of functional is presented as declarative. But that doesn't have to be the case. Nor can that help you, if someone else hasn't done the actual implementation.

We can get a long way with very mechanical transformations, in the form of compilation. But the thinking can still have some very imperative aspects.

tuveson•7mo ago

> I've certainly messed up my fair share of recursive calls.

It’s a common enough problem that the “why is my program crashing” website is basically named after it.

ngruhn•7mo ago

Another case for recursion is that you have to think of the base case. With loops people pathologically forget to handle 0, [], [[]], "", etc.

buzzin_•7mo ago

I have found it that if you invest some time in learning how to write quality for loops, the quality indeed goes up.

Also, when writing for loops, you have to explicitly think about your exit condition for the loop, and it is visible right there, at the top of the loop, making infinite loops almost impossible.

tsimionescu•7mo ago

I think you're in a tiny minority if you think it's easier to read and understand algorithms written using recursive tail calls than imperative loops. Even outside of programming, take a look at most work being done in algorithm research - you'll see that most often algorithms are described in an imperative pseudo-code, and using a mix of loops and regular recursion (not tail recursion), with maybe some mapping and filtering constructs in addition.

Tail recursion in particualr is the least human-friendly way to represent looping. It mixes input and output parameters of the function in any but the most trivial cases. It also forces the caller to figure out what is the base value for all of the output parameters (often requiring a separate function just to provide those to callers). And it basically takes an implementation detail and makes it a part of your function interface.

Even in math, you typically define recursive functions like f(x) = 1 + f(x-1), not f(x, y) = f(x-1, y+1); g(x) = f(x, 0).

immibis•7mo ago

Computers are imperative. With imperative programming you can get better performance than with pure functional. (Using pure functional to wrap an imperative core still counts as imperative)

Maybe you don't care about performance; IMO squeezing performance is one of the important applications for formal verification, as you can prove your fast insane algorithm is correct, whereas the slow obvious one is obviously correct.

If your main concern is clarity, some things are clearer when written imperatively and some when written functionally.

jeremyscanvic•7mo ago

That's really neat! I'm very excited for the future of Lean.

ryjo•7mo ago

Very cool. Neat how you managed to get logical symbols in to the language itself! When might someone use preconditions in Lean theorems?

This article caught my eye because it's focused on imperative programming, and I've been very focused on declarative vs imperative programming over the last few years. I implemented a version of your function in CLIPS, a Rules-based language that takes a declarative approach to code:

(defrule sum-is-0 (list $? ?first $? ?second $?) (test (= 0 (+ ?first ?second))) => (println TRUE))

(defrule sum-is-not-0 (not (and (list $? ?first $? ?second $?) (test (= 0 (+ ?first ?second))))) => (println FALSE))

(assert (list 1 0 2 -1)) (run) (exit)

The theorem you write in Lean to prove the function kind-of exists in CLIPS Rules; you define the conditions that must occur in order to execute the Right Hand Side of the Rule. Note that the above simply prints `TRUE` or `FALSE`; it is possible to write imperative `deffunction`s that return values in CLIPS, but I wanted to see if I could draw parallels for myself between Lean code and theorems. Here's a gist with the simple version and a slightly more robust version that describes the index at which the matching numbers appear: https://gist.github.com/mrryanjohnston/680deaee87533dfedc74b...

Thank you for writing this and for your work on Lean! This is a concept that's been circling in my head for a minute now, and I feel like this article has unlocked some level of understanding I was missing before.

drdeca•7mo ago

Very nice. However, I wonder whether it might be good to have a way to tell LEAN to spit out a more explicit form of the proof steps it obtained during `grind`? Like, to produce text for what one would put in there if one was doing it manually, that would work in place of grind, in case the grind step is slow to verify?

yuppiemephisto•7mo ago

`show_term grind` and `by grind?` should do what you want

bux93•7mo ago

It looks suspiciously like verifying imperative code by writing declarative code that does the same thing.

The Super Sharp Blade

Smart Homes Are Terrible

What I haven't figured out

KPMG pressed its auditor to pass on AI cost savings

Open-source Claude skill that optimizes Hinge profiles. Pretty well.

First Proof

I squeezed a BERT sentiment analyzer into 1GB RAM on a $5 VPS

Kagi Translate

Building Interactive C/C++ workflows in Jupyter through Clang-REPL [video]

Tactical tornado is the new default

Full-Circle Test-Driven Firmware Development with OpenClaw

Automating Myself Out of My Job – Part 2

Google staff call for firm to cut ties with ICE

Dependency Resolution Methods

Crypto firm apologises for sending Bitcoin users $40B by mistake

Show HN: iPlotCSV: CSV Data, Visualized Beautifully for Free

There's no such thing as "tech" (Ten years later)

List of unproven and disproven cancer treatments

Me/CFS: The blind spot in proactive medicine (Open Letter)

Ask HN: What are the word games do you play everyday?

Show HN: Paper Arena – A social trading feed where only AI agents can post

TOSTracker – The AI Training Asymmetry

The Devil Inside GitHub

Show HN: Distill – Migrate LLM agents from expensive to cheap models

Show HN: Sigma Runtime – Maintaining 100% Fact Integrity over 120 LLM Cycles

Make a local open-source AI chatbot with access to Fedora documentation

Introduce the Vouch/Denouncement Contribution Model by Mitchellh

Software Factories and the Agentic Moment

The Neuroscience Behind Nutrition for Developers and Founders

Bang bang he murdered math {the musical } (2024)

The Super Sharp Blade

Smart Homes Are Terrible

What I haven't figured out

KPMG pressed its auditor to pass on AI cost savings

Open-source Claude skill that optimizes Hinge profiles. Pretty well.

First Proof

I squeezed a BERT sentiment analyzer into 1GB RAM on a $5 VPS

Kagi Translate

Building Interactive C/C++ workflows in Jupyter through Clang-REPL [video]

Tactical tornado is the new default

Full-Circle Test-Driven Firmware Development with OpenClaw

Automating Myself Out of My Job – Part 2

Google staff call for firm to cut ties with ICE

Dependency Resolution Methods

Crypto firm apologises for sending Bitcoin users $40B by mistake

Show HN: iPlotCSV: CSV Data, Visualized Beautifully for Free

There's no such thing as "tech" (Ten years later)

List of unproven and disproven cancer treatments

Me/CFS: The blind spot in proactive medicine (Open Letter)

Ask HN: What are the word games do you play everyday?

Show HN: Paper Arena – A social trading feed where only AI agents can post

TOSTracker – The AI Training Asymmetry

The Devil Inside GitHub

Show HN: Distill – Migrate LLM agents from expensive to cheap models

Show HN: Sigma Runtime – Maintaining 100% Fact Integrity over 120 LLM Cycles

Make a local open-source AI chatbot with access to Fedora documentation

Introduce the Vouch/Denouncement Contribution Model by Mitchellh

Software Factories and the Agentic Moment

The Neuroscience Behind Nutrition for Developers and Founders

Bang bang he murdered math {the musical } (2024)

My first verified imperative program

Comments