Isn't it more like a BS counter that keeps incrementing? It's indicative of churn, but of nothing else reliably.
It's one of the lowest-effort, easiest-to-game metrics, one that can be skewed to show anything the user wants it to show.
This is the important metric. It means there is very little divergence between what’s being worked on and what’s in production. The smaller the difference, the quicker you deliver value to users and the less risky it is to deploy.
How many of those PRs were small fixes? How many went to little-used services, or services only used internally?
They clearly have a strong set of DevOps tooling, and I do believe that Stripe has strong technical chops, but this number does not show that they are delivering value. It just shows they are delivering something... somewhere.
I've gotten some handle on it with some Gmail filters, but holy hell, if you want to make sure I don't see something, smuggle it in a From:github.com email.
Now that Copilot makes PRs, I could easily imagine getting to 1,145 PRs per day, of which 45 will actually make it across the line and not be filled with horseshit.
I do agree there isn't a lot of delta between a company doing 2 or 10 deploys a day and a company doing 1,200, but there's a huge engineering gap between companies doing approximately 0.03 deploys per day and those doing multiple.
https://web.archive.org/web/20250523000025/https://saile.it/...
Basically, they're bragging about how busy their engineers are making themselves look.
And no, not every mistake is critical, ... some are. You can have a million UI mistakes and regressions (where is the stat for how many regressions there are?) and whatnot in the UI of your Stripe Dashboard app without any of them being critical. The "finance" aura doesn't permeate literally everything a finance company does, raising it all to the critical level.
Well the actual ending is, "Subscribe to my newsletter!".
The lack of efficiency in finance (or pharma for that matter) is not driven by a wish for quality, but purely from a fear of stepping outside regulatory compliance with no individual wanting to be responsible for any process optimization that could later be seen as non-compliant.
Younger companies on the other hand might realize that compliance controls are, in fact, something the company defines themselves and can structure and optimize however they'd like, allowing for both high throughput and compliance. It's just hard to implement in older companies where half the staff would fight back as their role exists purely due to poor processes and their overhead.
I'm sure most people are working on some settings UI, fraud or risk tools, etc.
It makes sense for Stripe; they would lose a lot of money when not operating. But in smaller companies you can choose to take more downtime to reduce engineering time.
Instead of skirting around with gradual data transformations on live data, you could take the service offline and do it all at once.
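To illustrate the trade-off (a hypothetical sketch; the function and column names are made up, not anyone's actual schema): with planned downtime a migration is one offline pass, while doing it live forces a dual-write period on every write path.

```python
# Offline approach: run once while the service is down, then flip readers over.
def migrate_offline(rows):
    """One-shot transform of every stored row (hypothetical schema)."""
    return [{**row, "amount_cents": int(row["amount"] * 100)} for row in rows]

# Live approach: every write must populate both old and new fields until the
# backfill completes and all readers are switched, often over days or weeks.
def write_live(row):
    row["amount_cents"] = int(row["amount"] * 100)  # dual-write during migration
    return row
```

The offline version is a single deploy; the live version typically means several deploys (add dual-write, backfill, switch readers, remove old field), which is part of why deploy counts inflate.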
Like other commenters here have said, it doesn't mean that I can say "(scoff --) we're doing the same" if I'm doing the same relative number of releases with my tiny team. But it is validating for a small team like mine to see that this approach works at large scale, as it does for us.
Downside: your code is always a Frankenstein's monster of feature flags that need to be cleaned up! But hey, that's more PRs to boast about.
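The "Frankenstein" shape looks something like this (a minimal illustrative sketch, not Stripe's actual setup; all names are invented): every merged-but-unfinished feature hides behind a flag, and every stale flag leaves a dead branch someone must eventually delete.

```python
# Hypothetical flag registry; in practice this would live in a config service.
FLAGS = {"new_checkout_ui": False, "faster_payouts": True}

def render_old_checkout(user):
    return f"old checkout for {user}"

def render_new_checkout(user):
    return f"new checkout for {user}"

def checkout(user):
    if FLAGS["new_checkout_ui"]:        # flag from an in-flight project
        return render_new_checkout(user)
    return render_old_checkout(user)    # dead code once the flag is retired
```

Each flag removal is itself another small PR, which is exactly the churn being counted.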
If they're doing some sort of strict continuous integration, then a "change" could be a 25-line function with 100 lines of unit tests, within a large project where the function will be used later in a UI component that will only be merged in two weeks.
The fact that it's "deployed" does not mean it's "used" in production as a final thing; it might very much be "a step in the development".
And even if they're shipping a "feature" (that is, deploying the last commit in a project), it does not mean that all millions of users are seeing the change (they could use feature toggles, A/B testing, etc...).
Seen this way, about 2 PRs per day per eng is not unreasonable, and with enough devs, you can reach it.
Finally, they might very well have some automated PRs (i18n, etc..)
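The "deployed but not seen by users" pattern above is commonly done with percentage rollouts. A minimal sketch, assuming a hash-based bucketing scheme (the function and parameter names are illustrative, not any real flag system's API):

```python
import hashlib

def in_rollout(user_id: str, feature: str, percent: int) -> bool:
    """Deterministically assign a user to a stable bucket 0..99 per feature."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent

# A deploy at percent=0 ships the code without exposing it to anyone; later
# bumps to 5, 50, 100 are config changes, not new deploys.
```

So a high deploy count says little about how many users ever saw each change.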
#3 is "What I Miss About Working at Stripe" (https://every.to/p/what-i-miss-about-working-at-stripe) reminiscing about 15-hour days, missing vacations, and crying at work.
discussed here: https://news.ycombinator.com/item?id=32159752 (131 comments)
I had a day last week where I put up ten small PRs. It'd be a bit silly to say I was especially elite that day. I was just making small semi-related PRs for a routine task.
But you haven't identified any value, just mindlessly cited some throughput stat. Merge your code changes in 1-letter increments and you can juke the stats even higher!
The article states that this comes out to be one deployed PR per 3 days per developer. That is clearly doable and does not require any dumb moves like parent suggests.
In my opinion the goal isn't hitting some arbitrary number, it's making this possible: that there are no process/technical/cultural impediments preventing you from shipping changes every single day.
And you're doing it in this comment again. No, it's not impressive without actual knowledge of what the changes are and what value they deliver. It may very well be that people are heavily pushed to make changes for the sake of making changes, and most of them are not that useful. Or it may be that, as is common with overworked people, they make a lot of mistakes that require further changes to correct (okay, not the types of mistakes that affect the reliability stat much, but that's not the only category). Or they experiment way too much, with those experiments leading nowhere, so it's just wasted effort. Or...
The daily goal is just as artificial. There should be few impediments to implementing valuable changes, not a mandate to ship something daily.
Yes, regressions will be more painful if you manage trillions of dollars, but it also means shipping a fix for such regressions needs to be easy and fast, which you can only do if you have a good and frequent CI/CD pipeline.
See also "The Infallible five-tier system for measuring the maturity of your Continuous Delivery pipeline". You should live on "Friday".
The exception is security issues. But these usually require actual thinking to be fixed, so no, you're not getting volume in the first place.
Preferably, not breaking things in the first place while making your mostly cosmetic changes, rather than shipping patches afterward, limits the scope of this kind of fix churn.
There’ll still be plenty of changes made by humans. But some of those 1145 per day are so low-risk that they’re almost better off making more of them.
No, unit tests do not count.
danpalmer•8h ago
Often the problem is that we put too much into a single change. Smaller changes means easier reviews, better reviews, less risky deploys, forces better tooling for deployments and change management, often lower latency, and even often leads to higher throughput because of less WIP taking up head space and requiring context switching. The benefits are so compounding that I think it's very undervalued in some orgs.
smadge•2h ago
Googler identified.