If all the world were a monorepo

https://jtibs.substack.com/p/if-all-the-world-were-a-monorepo

73•sebg•3d ago

Comments

esafak•3h ago

> In what other ecosystem would a top package introduce itself using an eight-variable equation?

That's the objective function of Hastie et al.'s GLM. I had a good chuckle when I realized the author's last name is Tibshirani. If you know, you know.

bryanrasmussen•3h ago

Robert Tibshirani has a daughter named Julie.

david_draco•1h ago

And if I don't know, can I know?

esafak•1h ago

Hastie and Tibshirani wrote a famous book on ML (https://hastie.su.domains/ElemStatLearn/), and extended GLMs into GAMs: https://en.wikipedia.org/wiki/Generalized_additive_model

kazinator•3h ago

> But… CRAN had also rerun the tests for all packages that depend on mine, even if they don’t belong to me!

When you propose a change to something that other things depend on, it makes sense to test those dependents for a regression; this is not earth shattering.

If you want to change something which breaks them, you have to then do it in a different way. First provide a new way of doing something. Then get all the dependencies that use the old way to migrate to the new way. Then when the dependents are no longer relying on the old way, you can push out a change which removes it.
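The add-migrate-remove sequence described above could look roughly like this in code. This is a minimal sketch, not anything from CRAN or the article: `fit` and `fit_v2` are hypothetical names, and the body is a stand-in computation. The old entry point stays as a thin shim that warns and forwards to the new one, so dependents keep passing their tests while they migrate.

```python
import warnings


def fit_v2(data, *, penalty=1.0):
    """New entry point (hypothetical): sum of the data scaled by the penalty."""
    return penalty * sum(data)


def fit(data, lam=1.0):
    """Old entry point, kept as a shim until no dependent calls it."""
    warnings.warn(
        "fit() is deprecated; call fit_v2(data, penalty=...) instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this shim
    )
    return fit_v2(data, penalty=lam)
```

Once the reverse-dependency checks show no remaining callers of `fit`, the shim can be deleted in a later release.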

maxbond•3h ago

> When declaring dependencies, most packages don’t specify any version requirements, and if they do, it’s usually just a lower bound like ‘grf >= 1.0’.

I like the perspective presented in this article, I think CRAN is taking an interesting approach. But this is nuts and bolts. Explicitly saying you're compatible with any future breaking changes!? You can't possibly know that!

I get that a lot of R programmers might be data scientists first and programmers second, so many of them probably don't know semver, but I feel like the language should guide them to a safe choice here. If CRAN is going to email you about reverse dependencies, maybe publishing a package with a crazy semver expression should also trigger an email.
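A check like the one suggested here might be sketched as follows. The spec syntax is an assumption modeled on the 'grf >= 1.0' form quoted above, extended with comma-separated constraints; it is not CRAN's actual DESCRIPTION parser, and `has_upper_bound` and `unbounded_dependencies` are hypothetical helpers.

```python
import re

# Matches one comparison such as ">= 1.0" or "< 2.0".
# Two-character operators come first so ">=" is not read as ">".
_CONSTRAINT = re.compile(r"(>=|<=|==|>|<)\s*([0-9]+(?:\.[0-9]+)*)")

# Operators that cap the version from above (an exact pin counts as a cap).
_UPPER_OPS = {"<", "<=", "=="}


def has_upper_bound(spec: str) -> bool:
    """Return True if the spec contains any constraint that caps the version."""
    ops = [op for op, _version in _CONSTRAINT.findall(spec)]
    return any(op in _UPPER_OPS for op in ops)


def unbounded_dependencies(specs):
    """Return the specs that would accept any future release."""
    return [spec for spec in specs if not has_upper_bound(spec)]
```

For example, `unbounded_dependencies(["grf", "grf >= 1.0", "grf >= 1.0, < 2.0"])` keeps the first two entries, which are the ones a registry could warn about.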

derefr•3h ago

CRAN’s approach here sounds like it has all the disadvantages of a monorepo without any of the advantages.

In a true monorepo (the one for the FreeBSD base system, say), if you make a PR that updates some low-level code, then the expectation is that you 1. compile the tree and run all the tests (so far so good), 2. update the high-level code so the tests pass (hmm), and 3. include those updates in your PR. In a true centralized monorepo, a single atomic commit can effect a vertical-slice change through a dependency and all of its transitive dependents.

I don’t know what the equivalent would be in distributed “meta-monorepo” development à la CRAN, but it’s not what they’re currently doing.

(One hypothetical approach I could imagine, is that a dependency major-version release of a package can ship with AST-rewriting-algorithm code migrations, which automatically push both “dependency-computed” PRs to the dependents’ repos, while also pushing those same patches as temporary forced overlays onto releases of dependent packages until such time as the related PRs get merged. So your dependents’ tests still have to pass before you can release your package — but you can iteratively update things on your end until those tests do pass, and then trigger a simultaneous release of your package and your dependent packages. It’s then in your dependents’ court to modify + merge your PR to undo the forced overlay, asynchronously, as they wish.)
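To make the "AST-rewriting code migration" idea concrete, here is a toy sketch using Python's standard `ast` module (Python 3.9+ for `ast.unparse`). It only renames direct calls to one function; `RenameCall` and `migrate` are hypothetical names, and a real migration tool would also need to preserve comments and formatting, which `ast.unparse` discards.

```python
import ast


class RenameCall(ast.NodeTransformer):
    """Rewrite calls to a bare function name `old` into calls to `new`."""

    def __init__(self, old, new):
        self.old = old
        self.new = new

    def visit_Call(self, node):
        # Visit children first so nested calls are rewritten too.
        self.generic_visit(node)
        if isinstance(node.func, ast.Name) and node.func.id == self.old:
            replacement = ast.Name(id=self.new, ctx=ast.Load())
            node.func = ast.copy_location(replacement, node.func)
        return node


def migrate(source, old, new):
    """Return `source` with every direct call to `old` renamed to `new`."""
    tree = RenameCall(old, new).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)
```

Because the rewrite matches call nodes rather than text, a variable such as `fit_all` is left alone when migrating `fit` to another name.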

joek1301•2h ago

> One hypothetical approach I could imagine, is that a dependency major-version release of a package can ship with AST-rewriting-algorithm code migrations

Jane Street has something similar called a "tree smash" [1]. When someone makes a breaking change to their internal dialect of OCaml, they also push a commit updating the entire company monorepo.

It's not explicitly stated whether such migrations happen via AST rewrites, but one can imagine leveraging the existing compiler infrastructure to do that.

[1]: https://signalsandthreads.com/future-of-programming/#3535

chii•56m ago

> In a true monorepo ...

ideally yes. However, such a monorepo can become increasingly complex as the software being maintained becomes larger and larger (and/or more and more people work on it).

You end up with massive changes - which might eventually become something that a single person cannot realistically contain within their brain. Not to mention clashes - you will have people making contradictory/conflicting changes, and there will have to be some sort of resolution mechanism outside (or the "default" one, which is first come first served).

Of course, you could "manage" this complexity by defining API boundaries/layers and deeming those APIs important enough not to change too often. But that simply means you're a monorepo in name only, not too different from having separate repos with versioned artefacts and a defined API boundary.

skybrian•26m ago

Yes, it's nice when you can update arbitrarily distant files in a single commit. But when an API is popular enough to be used by dozens of independent projects, this is no longer practical. Even in a monorepo, you'll still need to break it up, adding the new API, gradually migrating the usages, and then deleting the old API.

jiggawatts•2h ago

This (with some tweaks) is what I envision the future of NPM, Cargo, and NuGet should look like.

Automated tests, compilation by the package publisher, and enforcement of portability flags and SemVer semantics.

haberman•2h ago

This was an interesting article, but it made me even more interested in the author's larger take on R as a language:

> In the years since, my discomfort has given away to fascination. I’ve come to respect R’s bold choices, its clarity of focus, and the R community’s continued confidence to ‘do their own thing’.

I would love to see a follow-up article about the key insights that the author took away from diving more deeply into R.

ants_everywhere•1h ago

I genuinely enjoy R. I use it for calculations daily. In comparison using Python feels tedious and clunky even though I know it better.

> CRAN had also rerun the tests for all packages that depend on mine, even if they don’t belong to me!

Another way to frame this is these are the customers of your package's API. If you broke them you are required to ship a fix.

I see why this isn't the default (e.g. on GitHub you have no idea how many people depend on you). But the developer experience is much nicer like this. Google, for example, makes this promise with some of their public tools.

Outside the world of professional software developers, R is used by many academics in statistics, economics, social sciences etc. This rule makes it less likely that their research breaks because of some obscure dependency they don't understand.

0xbadcafebee•53m ago

The author is a little confused. A system that blocks releases on defects and doesn't pin versions is continuous integration, not a monorepo. The two are not synonymous. Monorepos often use continuous integration to ensure their integrity, but you can use continuous integration without a monorepo, and monorepos can be used without continuous integration.

> But the migration had a steep cost: over 6 years later, there are thousands of projects still stuck on an older version.

This is a feature, not a bug. The pinning of versions allows systems to independently maintain their own dependency trees. This is how your Linux distribution actually remains stable (or used to, before the onslaught of "rolling release" distributions, and the infection of the "automatically updating application" into product development culture, which constantly leaves me with non-functional Mobile applications whereupon I am forced to update them once a week). You set the versions, and nothing changes, so you can keep using the same software, and it doesn't break. Until you choose to upgrade it and deal with all the breaking shit.

Every decision in life is a tradeoff. Do you go with no version numbers at all, always updating, always fixing things? Or do you always require version numbers, keeping things stable, but having difficulty updating because of a lack of compatible versions? Or do you find some middle ground? There are pros and cons to all these decisions. There is no one best way, only different ways.

summis•3m ago

For me the comparison to a monorepo made a lot of sense. One of the main features of a monorepo is maintaining a DAG of dependencies and using that to decide which tests to run given a code change. CRAN package publishing seems to follow the same idea.
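The "use the DAG to decide which tests to run" idea can be sketched as a breadth-first walk over reverse dependencies. This is an illustration under assumed inputs, not how CRAN or any particular build system implements it: `affected_packages` is a hypothetical helper and the package names in the usage note are made up.

```python
from collections import defaultdict, deque


def affected_packages(depends_on, changed):
    """Return the changed package plus all of its transitive dependents.

    `depends_on` maps each package name to the packages it depends on.
    """
    # Invert the edges: for each package, who depends on it directly?
    dependents = defaultdict(set)
    for pkg, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(pkg)

    # Breadth-first search outward from the changed package.
    seen = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for pkg in sorted(dependents[current]):
            if pkg not in seen:
                seen.add(pkg)
                queue.append(pkg)
    return sorted(seen)
```

With `{"core": [], "model": ["core"], "report": ["model"], "plots": []}`, a change to `core` selects `core`, `model`, and `report` for testing and skips `plots`.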

cortesoft•14m ago

I feel like if more package repositories did this, you would end up just finding more and more workarounds and alternative distribution methods.

I mean, just look at how many projects use “curl and bash” as their distribution method even though the package repositories they could use instead don’t require anything nearly as onerous as the reverse dependency checks described in this article. If the minimal requirements the current repos have are enough to push projects to alternate distribution, I can’t imagine what would happen if checks like these were added.

pabs3•9m ago

Debian is kind of like that, except packages broken by upgrades are mostly just removed.
