If you read between the lines, the underlying problem in most of the discussion is GitHub's dominance of the code hosting space coupled with its less-than-ideal CI integration - which, while getting better, is stuck with baggage from past missteps and general API frailty.
That's why the OpenStack community built Zuul on top of Gerrit: it added a real gating system that could speculatively test multiple commits in a queue and only merge them if CI passed together. In other words, Zuul was Gerrit's version of a merge queue.
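To make the idea concrete, here's a rough Python sketch of speculative gating - not Zuul's actual code; `merge` and `run_ci` are hypothetical stand-ins for a git merge and a CI trigger:

```python
# Rough sketch of speculative gating in the spirit of Zuul - not its real implementation.
# merge() and run_ci() are hypothetical stand-ins for a git merge and a CI trigger.

def gate(queue, main_tip, merge, run_ci):
    """Test each queued change on top of every change ahead of it, so CI sees
    the exact tree that main would contain once all the earlier merges land."""
    base = main_tip
    passed = []
    for change in list(queue):
        candidate = merge(base, change)   # main + all changes ahead in the queue + this one
        if run_ci(candidate):
            passed.append(change)
            base = candidate              # later changes build on this speculative result
        else:
            queue.remove(change)          # evict the failing change...
            return gate(queue, main_tip, merge, run_ci)  # ...and retest the rest without it
    return passed                         # these can land together and main stays green
```

The key property is that a change is never tested against a stale main: it is tested against the state main will actually be in when it merges.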
> February 2 2014, 22:25 […] Thirteen years ago I worked at Cygnus/RedHat […] Ben, Frank, and possibly a few other folks on the team cooked up a system [with a] simple job: automatically maintain a repository of code that always passes all the tests.
One trickier problem is when you don't know until later that a past change was bad: perhaps slow-running performance tests show a regression, or flaky tests turn out to have been showing a real problem, or you just want the additional velocity of pipelined landings that don't wait for all the tests to finish. Or perhaps you don't want to test every change, but then when things break you need to go back and figure out which change(s) caused the issue. (My experience is at Mozilla where all these things are true, and often.) Then you have to deal with backouts: do you keep an always-green chain of commits by backing up and re-landing good commits to splice out the bad, and only fast-forwarding when everything is green? Or do you keep the backouts in the tree, which is more "accurate" in a way but unfortunate for bisecting and code archaeology?
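When a regression only surfaces after several untested or batched changes have landed, the "figure out which change caused it" step is essentially a bisect over that range. A minimal sketch, assuming a deterministic `is_good` check (flaky tests, as mentioned above, make a single pass/fail signal unreliable in practice):

```python
# Minimal bisection over a range of landed commits to find the first bad one.
# is_good(commit) is a hypothetical helper that checks out the commit and runs
# the slow test or performance check that exposed the regression.

def first_bad(commits, is_good):
    """commits is ordered oldest to newest, with commits[0] known good and
    commits[-1] known bad; returns the first commit where the test fails."""
    lo, hi = 0, len(commits) - 1          # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]
```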
The angle we took in the blog post focused on what was widely documented and accessible to the community (open-source tools like Bors, Homu, Bulldozer, Zuul, etc.), because those left a public footprint that other teams could adopt or build on.
It's a great reminder that many companies were solving the "keep main green" problem in parallel (some with pretty sophisticated tooling), even if it didn't make it into OSS or blog posts at the time.
You'd join the queue, and then you'd have to wait for something like 12 other people in front of you, each of whom would spend up to a couple of hours trying to get their merge branch to go green so it could go out. You couldn't really look away because it could be your turn out of nowhere - and you had to react to it like being on call, because the whole deployment process was frozen until your turn ended. Often that just meant clicking "retry" on parts of the CI process, but it was complicated; there were dependencies between sections of tests.
My opinion is that merge skew happens rarely enough not to be a major problem. And personally, I think that instead of the merge queues described in the article, it would be more beneficial overall to invest in tooling that automatically reverts broken commits on your main branch. Merging the PR into a temporary branch and running tests is a good thing, but it is overly strict to require your main branch to be fast-forwarded. You can generally set a time limit of one day or so: as long as tests pass when merging the PR onto a main branch less than one day old, you can just merge it.
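A minimal sketch of that freshness rule, approximating "main less than one day old" by the age of the PR's merge base with main (assumes git is on the PATH; the branch names and 24-hour cutoff are just illustrative):

```python
# Sketch of the "merge base no older than a day" rule suggested above.
# Assumes git is available; branch names and the 24h cutoff are illustrative.
import subprocess
import time

MAX_AGE_SECONDS = 24 * 60 * 60

def merge_base_is_fresh(pr_branch, main_branch="main"):
    """True if the PR's merge base with main was committed less than a day ago,
    i.e. the PR has been tested against a reasonably recent main."""
    base = subprocess.run(
        ["git", "merge-base", main_branch, pr_branch],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    committed_at = int(subprocess.run(
        ["git", "show", "-s", "--format=%ct", base],
        capture_output=True, text=True, check=True,
    ).stdout.strip())
    return (time.time() - committed_at) < MAX_AGE_SECONDS
```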
oftenwrong•2h ago
This blog post about choosing which commit to test is also relevant and may be of interest: https://sluongng.hashnode.dev/bazel-in-ci-part-1-commit-unde...