Retrobootstrapping Rust for some reason

https://graydon2.dreamwidth.org/317484.html

105•romac•7h ago

Comments

fcoury•7h ago

Not sure why, but I am getting 403 Forbidden. If you are getting the same, here's an archive.is link: https://archive.is/UH5fg

superkuh•7h ago

You're not the only one getting blocked. I emailed Dreamwidth about this in the past, and they said it's something their upstream network host does and that they couldn't fix it even if their site users wanted them to. They're a somewhat limited and broken host, partially repackaging some other company's services.

    >Dreamwidth Studios Support: I'm sorry about the frustrations you're having. The "semi-randomly selected to solve a CAPTCHA" interstitial with a visual CAPTCHA is coming from our hosting provider, not from us: ... and we don't have any control over whether or not someone from a particular network is shown a CAPTCHA or not because we aren't the ones who control the restriction.

This also applies to the 403s.

neilv•5h ago

This needs to be a catchy name, but I don't have a good one. CloudFlaritis? CloudFlareup? (CloudFlareDown?)

Regardless of whether Cloudflare is the particular infra company, the company who uses them responds to blocked people: "We don't know why some users can't access our Web site, and we don't even know the percentage of users who get blocked, but we're just cargo-culting our jobs here, so sux2bu."

The outsourced infra company's response is: "We're running a business here, and our current solution works well enough for that purpose, so sux2bu."

o11c•4h ago

Hmm, "cloudfail" is already in use, and "cloudfuckyou" while descriptive is profane enough that it will cause unnecessary friction with certain people, and "clownflare" is too vague/silly (and is less applicable to other service providers).

So I propose "cloudfart" - just rude enough it can't be casually dismissed, but still tolerable in polite company. "I can't access your website (through the cloudfart |, it's just cloudfarting at me)."

Other names (not all applicable for this exact use): cloudfable, cloudunfair, cloudfalse, cloudfarce, cloudfault, cloudfear, cloudfeeble, cloudfeudalism, cloudflake, cloudfluke, cloudfreeze, cloudfuneral.

neilv•4h ago

Would be nice if the name punished a perpetrator's brand.

Not just sound like we're taking in stride an unavoidable fact of nature.

Want people to stop saying "CloudFlareup" (like a social disease)? Stop causing it.

tmtvl•37m ago

I'd say Clownflare, but that sits too close to Clown Care, who do really great work.

tmtvl•6h ago

I believe Guix faced some issues in bootstrapping Rust (which ties in to the reproducible builds they want to do), there's an article about it from 2018: <https://guix.gnu.org/en/blog/2018/bootstrapping-rust/>.

jasonthorsness•6h ago

The difficulty in reproducing builds and steps even from a time as recent as 2011 is somewhat disturbing; will technology stabilize or is this going to get even worse? At what point do we end up with something in-use that we can’t make anymore?

Sharlin•6h ago

Enter Vinge's programmer-archaeologists!

bee_rider•6h ago

I think we must have some software in use for which the compiler or the source code just isn’t around anymore. It probably isn’t a massive problem. There’s just a slow trickle of tech we can’t economically reproduce, but we replace it with better stuff. Or, if it was really crucial, it would become worth paying for, right?

0cf8612b2e1e•6h ago

There was a story where Microsoft patched a program for which they likely lost the source: https://www.bleepingcomputer.com/news/microsoft/microsoft-ap...

Lammy•2h ago

Complete speculation: They might not have had it in the first place or might not have had legal license to modify it themselves. The About Box shown in the article implies Microsoft just licensed MathType from Design Sciences, Inc. DSI got acquired by WIRIS just a few months before that in 2017 which may also have had something to do with it: https://en.wikipedia.org/wiki/MathType

skissane•5h ago

I think with advances in AI-assisted decompilation, we may soon end up in the situation where given a binary you can produce realistic-looking source (sane variable and function names, comments even) which compiles to the same binary, even though non-identical to the original source code

bee_rider•4h ago

Could be, although I don’t think that’ll give them any more HDL to train on (unless they also get access to a whole lot of high end microscopes!)

jcranmer•5h ago

I'd imagine that it's going to end up both getting somewhat better and somewhat worse.

2011 is around the time that programmers start taking undefined behavior seriously as an actual bug in their code and not in the compiler, especially as we start to see the birth of tools to better diagnose undefined behavior issues the compilers didn't (yet) take advantage of. There's also a set of major, language-breaking changes to the C and C++ standards that took effect around the time (e.g., C99 introduced inline with different semantics from gcc's extension, which broke a lot of software until gcc finally switched the default from C89 to C11 around 2014). And newer language versions tend to make obsolete hacky workarounds that end up being more brittle because they're taking advantage of unintentional complexity (e.g., constexpr-if removes the need for a decent chunk of template metaprogramming that relied on SFINAE, a concept which is difficult to explain even to knowledgeable C++ programmers). So in general, newer code is likelier to be substantially more compatible with future compilers and future language changes.

But on the other hand, we've also seen a greater trend towards libraries with less-well-defined and less stable APIs, which means future software is probably going to have a rougher time getting all the libraries to play nice with each other if you're trying to work with old versions. Even worse, modern software tends to be a lot more aggressive about dropping compatibility with obsolete systems. For example, accessing the modern web with decade-old software (as mentioned in the blog post) is going to be incredibly difficult.

lmm•2h ago

The telephone network was famously thought to be impossible to bootstrap even 50 years ago. We won't ever be able to "black start" our computers unless someone cares enough to put money and effort into it. (Also all technological civilisation is somewhat self-dependent e.g. do you think it would be possible to make microprocessors without running computers?). Possibly reproducible build efforts and things like Guix will make it happen.

kimixa•6h ago

> Modern clang and gcc won't compile the LLVM used back then (C++ has changed too much)

Is this due to changing default values for the standard used, and would it be "fixed" by adding "-std=xxx" to the CXXFLAGS?

I've successfully built ~2011-era LLVM using gcc last year, with no issues from the compiler itself (after that option change). There were a couple of bugs in the LLVM code that I had to work around, though (mainly reliance on transitive includes from the standard library, or incorrect LLVM code that newer compilers detect).

One of the big pain points I have with C++ is the dogmatic support of "old" code, I'd argue to the current version's detriment. But because of that I've never had an issue with code version backwards compatibility.

LegionMammal978•5h ago

Even -fpermissive is no longer sufficient for some of the things that appear in the old LLVM codebase. It's mostly related to syntax issues that older compilers accepted even though the standard never permitted them.

o11c•5h ago

Well, one thing I've noticed about LLVM is that it blatantly and intentionally relies on UB. The particular example I encountered probably isn't what causes the version breakage, but it's certainly a bad indicator.

That said, failures in building old software are very often due to one of:

* transitive headers (as you mentioned)

* typedef changes (`siginfo_t` vs `struct siginfo` comes to mind)

* macros with bad names (I was involved in the zlib `ON` drama)

* changes in library arrangement (the ncurses/tinfo split comes to mind, libcurl3/4 conditional ABI change, abuse of `dlopen`)

Most of these are one-line fixes if you're willing to patch the old code, which significantly increases the range of versions supported and thus reduces the number of artifacts you need to build for bootstrapping all the way to a modern version.
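
To make the "one-line fix" idea concrete for the transitive-header case, here is a minimal sketch of a patch helper. It is an illustration only: `add_missing_include` and the sample source are hypothetical, not taken from any LLVM or Rust patchset.

```python
import re


def add_missing_include(source: str, header: str) -> str:
    """Insert `#include <header>` after the last existing #include line.

    If the header is already included explicitly, return the source unchanged.
    """
    directive = f"#include <{header}>"
    lines = source.splitlines()
    if any(line.strip() == directive for line in lines):
        return source  # nothing to patch

    # Find the last existing #include so the new one sits with the others.
    last_include = -1
    for i, line in enumerate(lines):
        if re.match(r"\s*#\s*include\b", line):
            last_include = i

    lines.insert(last_include + 1, directive)
    return "\n".join(lines) + "\n"


# Hypothetical old source that only compiled because <string> used to
# pull in the definition of std::size_t indirectly.
old_code = "#include <string>\n\nstd::size_t n = 0;\n"
patched = add_missing_include(old_code, "cstddef")
print(patched)
```

A real bootstrap would more likely keep such fixes as `.patch` files applied per version, but the change itself is usually this small.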

ummonk•1h ago

Rather ironic it relies on UB given the extent to which Clang + LLVM insists on interpreting UB in the most creative way possible to optimize code…

15155•5h ago

I imagine you just need to update CA certs and the known_hosts file to get GitHub communication working again.

oasisbob•4h ago

A few more hurdles might involve expectations of SHA-1 cert signing, and TLS1.0 deprecation

eptcyka•5h ago

Can’t say I’m a fan of Nix evangelists pointing their finger at any problem and yelling how it would be solved better by using Nix, but in this case, one could pin a nixpkgs version and all the sources for llvm, gcc and ocaml, and thus have a reproducible bootstrap. Ultimately, it wouldn’t do anything different to what was done manually here, but pinning commits will save the archaeological burden for the next bootstrapper.
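
The pinning idea itself is tool-independent. As a rough sketch (plain Python, not Nix, with made-up source names and stand-in contents), a pin is just a recorded content hash that every later fetch must match:

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of some fetched bytes."""
    return hashlib.sha256(data).hexdigest()


def verify_pins(pins: dict, fetched: dict) -> list:
    """Return the names of sources that are missing or do not match their pin."""
    bad = []
    for name, expected in pins.items():
        data = fetched.get(name)
        if data is None or sha256_hex(data) != expected:
            bad.append(name)
    return bad


# Stand-in "tarballs"; a real setup would hash the downloaded archives.
fetched = {
    "llvm-src": b"pretend llvm tarball",
    "ocaml-src": b"pretend ocaml tarball",
}
# Pins are recorded once, at the time the bootstrap is known to work.
pins = {name: sha256_hex(data) for name, data in fetched.items()}

# Later, an upstream that silently changed an archive is caught by the pin.
tampered = {**fetched, "llvm-src": b"something else"}
print(verify_pins(pins, fetched))
print(verify_pins(pins, tampered))
```

Nix and Guix do considerably more than this (they also pin the build environment), so treat the above as the smallest piece of the idea rather than a substitute.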

chubot•5h ago

Does re-bootstrapping Rust like this actually work? How much work is it?

LegionMammal978•3h ago

Lots of work, you need hundreds of steps across the snapshots, and patches for each one to get them to work. (E.g., the makefile had hardcoded -Werror for ages.) Not to mention that if you want to make it portable, you must always start with the i686 version and cross-compile from there. (Preferably leaving x86 as late as possible: the old LLVM versions are full of architecture-specific quirks.)
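
The shape of that chain can be sketched as a simple plan in which each snapshot is built by the output of the previous step. All names here (`seed-compiler`, `snap-A`, the patch filename) are placeholders, not actual snapshot identifiers or patches:

```python
def plan_bootstrap(seed: str, snapshots: list, patches: dict) -> list:
    """Return ordered build steps for a chain of compiler snapshots.

    Each snapshot is built with the compiler produced by the previous step;
    the first one is built with the seed compiler.
    """
    steps = []
    compiler = seed
    for snap in snapshots:
        steps.append({
            "build": snap,
            "with": compiler,
            "patches": patches.get(snap, []),
        })
        compiler = snap  # the freshly built snapshot compiles the next one
    return steps


# Placeholder data; a real chain would have hundreds of entries.
snapshots = ["snap-A", "snap-B", "snap-C"]
patches = {"snap-B": ["drop-werror.patch"]}

plan = plan_bootstrap("seed-compiler", snapshots, patches)
for step in plan:
    print(f"{step['build']}: built with {step['with']}, patches={step['patches']}")
```

The code is trivial; the work described above is in filling in the real snapshot list and the per-snapshot patches.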

LegionMammal978•5h ago

I've done this project myself, based on Ubuntu 20.04 and a whole lot of patchsets [0]. I got up to the 2014-01-20 snapshot before running into weird LLVM stack issues that I couldn't figure out how to resolve. One big annoyance is that the snapshot file refers to some commit hashes that do not appear to point to any surviving public repo, so it takes a fair bit of effort to reconstruct which commits must have been included in the missing commits.

[0] https://github.com/LegionMammal978/rust-from-ocaml

neilv•5h ago

> Debian has maintained both EOL'ed docker images and still-functioning fetchable package archives at the same URLs as 14 years ago.

Debian FTW.

neilv•4h ago

Is there, or could there be, a simple implementation of a compiler for the latest full Rust language (in C, Python, Scheme/Racket, or anything except Rust) that is greatly simplified because, although it accepts the latest full Rust language as input, it assumes the input is correct?

Could this simple non-checking Rust implementation transliterate the real Rust compiler's code, to unchecked C, that is good enough for that minimal-steps, sustainable bootstrapping?

This simple non-checking compiler only has to be able to compile one program, and only under controlled conditions, possibly only on hardware with a ton of memory.

Smaug123•4h ago

Is mrustc "simple" enough? Its purpose is as you describe, and it can bootstrap rustc to version 1.74.0. https://github.com/thepowersgang/mrustc

neilv•4h ago

`mrustc` might be exactly what I wanted. Thank you.

stevefolta•4h ago

No it can't. Not for RISC-V/musl, so I'm sure that must be true for other platforms too.

yjftsjthsd-h•3h ago

So.... It can, just not for a particular target platform? Or am I missing your point?

JoshTriplett•2h ago

Once you've compiled it for one platform, you've re-bootstrapped it, at which point you can use the real compiler to cross-compile for another platform.

charcircuit•4h ago

Rust can self-bootstrap by compiling the Rust code for the compiler.

gregorvand•1h ago

Why do I have to use a VPN and pick a US server to access this article?
