I don't know what they're doing where you can do code reviews in 5-10 minutes, but in my decades doing this that only works for absolutely trivial changes.
You can?
https://google.github.io/eng-practices/review/developer/smal...
Isn't that exactly how Google's latest big cloud outage happened?
EDIT: referring to https://news.ycombinator.com/item?id=44274563
I should add that the idea that complexity relates to LoC is also nonsense. Everyone who's been doing this for a while knows that what kills people is those one-line errors which make assumptions about the values they're handling because of the surrounding code.
Say you're upgrading to a new library that has a breaking change to an API. First, add `#if`s or similar around each existing use of the API that check for the existing library version vs the new one, and error out if the new version is found. No behavior change, one line of code, trivial PR. One PR per change. Bug your coworker for each.
Next, add calls to the new API, but don't actually change to use them (the `#if`s won't hit that condition). Another stack of trivial PRs. Your coworker now hates you.
Finally, swap the version over. Hopefully you tested this. Then make a final PR to do the actual upgrade.
For something less trivial than a single breaking upgrade, you can do the same shit. Conditionally compile so that your code doesn't actually get used in the version you do the PR in, you can split out to one PR per character changed! It'll be horrible, everyone will hate you, but you can split changes down to ridiculously small sizes.
Splitting the change does not prevent you from looking at diffs of any combination of those commits (including the whole PR). You're not losing anything.
> at least 10x longer because an average approval time is often more than a day.
Why do you think it would take longer to review? Got any evidence?
No one on the team is just sitting there refreshing the list of PRs, ready to pick one up immediately. There's a delay between when the PR is marked as ready and when someone can actually get to it. Everyone is trying to get work done and have some time daily in flow state.
Imagine you have a change; you could do it as one PR that takes 1 hour to review, or 3 small PRs that each take 15 mins to review. The time spent in review may even be shorter for the small PRs, but if each PR has a delay of 1 hour before a reviewer can get to it, then the 3 PRs will take almost 4 hours before they're done, as opposed to just 2 hours for one big PR.
1. If I can't submit pieces until I have the final version, the PRs all go up at the same time and can be reviewed one after another immediately.
2. Or there's a very specific split that makes the feature two features in reality, like adding a plugin system and then the first plugin. The first part gets submitted while I'm still working on the second, so there's no delay on my side, because I'm still developing anyway.
Basically, I've never seen the "but if each PR has a delay of 1 hour before a reviewer can get to it" delay getting serialised in practice. It's still either paid one time or happening in the background.
But I trust my colleagues to do good reviews when I ask them to, and to ignore my PRs when I don't. That's kind of the way we all want it.
I regularly ask for a review of specific changes by tagging them in a comment on the lines in question, with a description of the implications and a direct question that they can answer.
This "throw the code at the wall for interpretation" style of PR just seems doomed to become lower priority for busy folks. You can add a fancy narrative to the comments if you like, but the root problem is that presenting a change block as-is and asking for a boolean blanket approval of the whole thing is an expensive request.
There's no substitute for shared understanding of the context and intent, and that should come out-of-band of the code in question.
PRs should be optional, IMHO. Not all changes require peer review, and if we trust our colleagues then we should allow them to merge their branch without wasting time with performative PRs.
Part of the difference is that the idea you can catch all problems with piecemeal code review is nonsense, so you should have at least some sweeping QA somewhere.
At $DAY_JOB we need approvals from peers due to industry regulation.
If you find that outcomes are the same when approvals are made optional at that stage, then do so, with accompanying justification.
No single person being able to make changes to a system is a tenet of that.
Ideally, yes. After a decade and something under ZIRP, it seems a lot of workers never had incentive to remain conscious of their intents in context long enough to conduct productive discourse about them. Half of the people I've worked with would feel slighted not by the bitterness of the previous sentence, but by its length.
There's an impedance mismatch between what actually works and what we're expected to expect to work. That's another immediate observation that causes people to painfully syntax-error much more frequently than it causes them to actually clarify context and intent.
In that case the codebase remains the canonical representation of the context and intent of all contributors, even when they're not at their best, and honestly what's so bad about that? Maybe communicating them in-band instead of out-of-band might be a degraded experience. But when out-of-band can't be established, what else is there to do?
I'd be happy to see tools that facilitate this sort of communication through code. GitHub for example is in perfect position to do something like that and they don't. Git + PRs + Projects implement the exact opposite information flow, the one whose failure modes people these days do whole conference talks about.
Minimally, I would like context for the change, why it required a change to this part of the codebase, and the thought process behind the change. Sometimes but not often enough I send the review back and ask them for this info.
IMO many software developers are just not fast enough at writing or language, so providing an explanation for their changes is a lot of work for them. Or they are checked out and just followed the AI or IDE until things worked, so they don't have explanations to provide. People not taking the time is what makes reviews performative.
… the why is important
> IMO many software developers … don't have explanations to provide. People not taking the time is what makes reviews performative.
… a lot of developers only consider the how.
i’ve had a lot of experiences of “once my PR is submitted that’s my work/ticket finished” kind of attitude.
i spent a year mentoring some people in a gaming community to become dev team members. one of the first things i said about PRs was — a new PR is just your first draft, there is still more work to do.
it helped that these folks were brand spanking new to development and weren't sullied by some lazy teacher somewhere.
The why is someone else's job, so the developers should just ask them for a blurb to put in the PR for context, along with a note to the reviewer to ask that person for even more context if necessary.
I think this is the overwhelming factor, software engineering doesn't select for communication skills (and plenty of SEs will mock that as a field of study), or at least most SEs don't start out with them.
Who are these people? I've never encountered that. In my experience engineers who aren't great at communication freely own up to it.
# Goal (why is this change needed at all)
# What I changed and why I did it this way
# What I'm not doing here and how I'll follow up
# How I know it works (optional section, I include this only for teams with lots of very junior engs: "added a test" is often sufficient)
Instead I request that it is self-reviewed, with context added, prior to requesting re-review.
I also tend to ask the question, “are any of these insights worth adding as comments directly to the code?”
9/10 times, the context they wrote down should have been well-thought-out comments in the code itself. People are incredibly lazy sometimes, even unintentionally so. We need better lint tools to help the monkey brain perform better. I wish git platforms offered more UX-friendly ways to force this kind of compliance. You can kind of fake it with CI/CD, but that's not good enough imo.
A nice side effect is that going through a self review and adding comments to the PR has helped me catch innumerable things that my coworkers never had to call me on.
It's rubber ducking.
- 90% of the time when you self-review your own PR, you're going to spot a bug or some incorrect assumption you made along the way. Or you'll see an opportunity to clean things up / make it better.
- Self-reviewing and annotating your reasons/thought process gives much more context to the actual reviewer, who likely only has a surface level understanding of what you're trying to do.
- It also signals to your team that you've taken the time to check your assumptions and verify you're solving the problem you say you are in the PR description.
Self-review is very, very helpful.
You could argue this is what commits are for, but given how people use GitHub and PRs, it gives some extra visibility.
And if you're going to use AI to assist you when writing the code I would argue this self-review step is 100% mandatory.
Adding context to both your commits and your code review tool's pull requests / merge requests makes everyone's lives better. Including future you, who will inevitably be looking at the PR or commit due to an incident caused by said change.
I have been following this personal rule for well over a decade, and have never regretted it.
If you don't know them, please realize your code isn't automatically a gift everybody was waiting for. You may see it that way, but from the other side it isn't clear until someone puts in the work to figure out what you did.
In short: added code produces work. So the least you should do is try reducing that work by making it easy to figure out what your code is and isn't.
Sum up what changes you made (functionally), why you made them, which choices you made (if any) and why and what the state of the PR code is in your own opinion. Maybe a reasoning why it is needed, what future maintenance may look like (ideally low). In essence, ask yourself what you'd like to know if someone appeared at the door and gave you a thumb drive with a patch for your project and add that knowledge.
Also consider adding a draft PR for bigger features early on. This way you avoid programming things that nobody wanted, or that someone else was already working on. You also give maintainers a way to steer the direction and/or decline before you put in the work.
You can split a big feature into N MRs, and that doesn't necessarily mean the N MRs are easier to understand (and so to review) than a single big MR. Taking into account the skills of the average software engineer, I prefer to review a single big MR over N different, not very well connected MRs (the classic example: MR number N looks good and innocent, but then MR number N+1 looks awful... and since MR number N was already approved and merged, the incentives are not there to redo the work).
OK.
There is little you can review properly in 10 minutes unless you were already pairing on it. You might have time to look for really bad production-breaking red flags maybe.
Remember the underlying reasons for PRs: the balance between getting shit done and operational, quality, and tech-debt concerns. Depending on what your team needs, you can choose anything from no review at all to spending 3x more time reviewing than coding. What is right depends on your situation. The PR is a tool.
Depends on the specific changes of course, but generally speaking.
300 lines is nothing in some boilerplate-heavy codebases I've worked at.
After seeing the same patterns for the hundredth time, it's easy to detect deviations. Not to mention linters and typing helps a lot too.
Not a fan of those but oh well.
From my experience most of the issues I find are actually from this type of observation rather than actually reading the code and imagining what it does in my head.
> A good rule of thumb is 300 lines of code changes - once you get above 500 lines, you're entering unreviewable territory.
I've found LoC doesn't matter when you split up commits like they suggest. What does matter is how controversial a change is. A PR should ideally have at most one part that generates a lot of discussion. The PR that does should also have the minimal number of commits (just what doesn't make sense standalone). Granted, this takes experience generally, and experience with your reviewers, which is where metrics like LoC counts can come in handy.
It’s not really something you can easily enforce with automation, so basically unachievable unless you are like Netflix and only hiring top performers. And you aren’t like Netflix.
(granted, I know VCS like it are still good for assets)
Many people learn to game this to make their "numbers" appear good i.e. high number of CRs and low revisions per CR.
I do see the value in breaking down larger chunks of work into logically smaller units of work and then produce multiple pull requests where needed.
But some people are really clever and influential and manage to game these numbers into "apparent success".
- Keep PR messages short and to the point.
- Use as many commits as you need; it's all the same branch. Squash if you want, though I think it hides valuable meta.
- Put the ticket in the branch name. Non-negotiable.
- Update your ticket with progress. Put in as much detail as you can, as if you were writing to someone who's picking up the task after you.
- Add links to your ticket: docs, readme, o11y graphs, etc.
- Link the ticket in the PR for easy access to additional info.
- Admit it if you don't understand what you're looking at. Better to pair and keep moving forward.
- If you request changes, stay available for conversation and approval for the next few hours.
- Punt the review if you don't feel like you can legitimately engage with the content right now. Make sure you communicate it, though.
- Avoid nits. They cause a loss in momentum and are very rarely worth changing.
- Are you trying to make sure that more than one human has seen the code? Then simply reading through a PR in 10 minutes and replying with either a LGTM or a polite version of WTF can be fine. This works if you have a team with good taste and a lot of cleanly isolated modules implementing clear APIs. The worst damage is that one module might occasionally be a bit marginal, but that can be an acceptable tradeoff in large projects.
- Does every single change need to be thoroughly discussed? Then you may want up-front design discussions, pairing, illustrated design docs, and extremely close reviews (not just of the diffs, but also re-reviewing the entire module with the changes in context). You may even want the PR author to present their code and walk through it with one or more people. This can be appropriate for the key system "core" that shapes everything else in the system.
- Is half your code written by an AI that doesn't understand the big picture, that doesn't really understand large-scale maintainability, and that cuts corners and _knowingly_ violates your written policy and best practices? Then honestly you're probably headed for tech debt hell on the express train unless your team is willing to watch the AI like hawks. Even one clueless person allowing the AI to spew subtly broken code could create a mess that no number of reviewers could easily undo. In which case, uh, maybe keep everything under 5,000 lines and burn it all down regularly, or something?
- the baseline "can I assume this person knows what they're doing?" level is higher
- making the "create PR" process take longer in order to make the review process faster is only a tradeoff of the time within the team
- if something is wrong with the committed code, the person who wrote it is going to be around to fix it
But in open source projects, there are much more often contributions by people outside the "core" long-term development team, where:
- you can't make the same assumptions that the contributor is familiar with the codebase, so you need to give things extra scrutiny
- there are often many fewer people doing the code review than there are submitting changes, so a process that requires more effort from the submitter in order to make the reviewer's job easier makes sense
- if there's a problem with the code, there's no guarantee that the submitter will be available or interested in fixing it once it's got upstream, so it's more important to catch subtle problems up front
and these tend to mean that the code-review process is tilted more towards "make it easy for reviewers, even if that requires more work from the submitter".
It's also more important to have good tools to analyze subtle problems down the line, thus increasing the importance of bisection and good commit messages.
An underrated benefit of "make it easy for reviewers" is that when a bug is found, everybody becomes a potential reviewer. Thus the benefit does not finish when the PR is merged.
I trust my colleagues to do the same (and they often do).
I can't imagine working in an environment where this is a theater.
Though I do appreciate the shoutout to adding tests in CR. But returning a PR solely because it doesn't have tests is effective, but a little performative too. It's kind of like publicly executing someone; there's gotta be some performance for it to be a deterrent. If something doesn't have tests, my review is going to be a very short performance where I pretend to read the rest of the code, then immediately send it back.
And reviews are not that. Systems are complex, and having a mental model of complex systems is difficult. Everyone has blind spots. A fresh pair of eyes can often spot what whoever was coding would not.
> But returning a PR solely because it doesn't have tests is effective, but a little performative too.
And this is not what I said. I spoke of suggesting extra tests. A scenario that wasn't covered, for example.
And that basically describes all of programming: we are building metaphors that will run electricity at a higher or lower voltage, and be translated again into something meaningful to a different human.
In many ways, all we are doing is performing. And that is some of what makes this job challenging: the practices that build software well are all just ways of checking how humans will interact with the ones and zeros we've encoded.
Returning a PR because it doesn't have tests means that code will have automated validation, which is a real change. It also means the code will be written in a testable way: too often we don't realize we wrote code in a way that is hard to test unless we try to write the tests. And on a larger level, it means that this team of engineers will learn and use the practices and tools that lead to testable code and effective tests and more easily-changeable code.
It makes total sense to not keep reading if there aren't tests, because adding the tests can be expected to change the code. But just because that is a performance doesn't mean it doesn't profoundly change the world.
One very valuable skill for those of us who have experienced productive collaboration is learning how to introduce it to new places.
Sometimes that means telling the executives that their promotion process is making them less successful. Sometimes it means wandering around PRs leaving useful positive comments to seed the idea that PRs can be useful. Sometimes it means pointing out tests only when there is a bug, so people can experience how great it is to follow the practices that keep us from introducing bugs in the first place.
I wish that more CS programs would explicitly teach their students critique skills, the way art and music and english and math programs do. But until then, we're counting on engineers getting lucky and landing in a functional workplace like yours.
Completely park other tasks, spend time on the review and record that time appropriately.
There's nothing wrong with saying you spent the previous day doing a large review. It's a part of the job, it is "what you're working on".
No thank you. Speaking as future me: I don't need to know how I got to what I'm looking at.
A squashed ticket-by-ticket set of merges is enough for me.
However, in more than a decade of software development, I don't think I've ever got much use out of commit messages. The only reason I'd even look at them is if git blame showed a specific commit introduced a bug, or made some confusing code. And normally the commit message has no relevant information - even if it's informative, it doesn't discuss the one line that I care about. Perhaps the only exception would be one line changes - perhaps a change which changes a single configuration value alongside a comment saying "Change X to n for y reason".
Comments can be a bit better - but they have the nasty habit of becoming stale.
> normally the commit message has no relevant information
Maybe that's why you've never got much use of them?
If your colleagues or yourself wrote better commits, they could have been of use.
An easily readable history is most important in open source projects, when ideally random people should be able to verify what changed easily.
But it can also be very useful in proprietary projects, for example to quickly find the reason of some code, or to facilitate future external security reviews (in the very rare case where they're performed).
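To make that concrete, here's a throwaway sketch (repo, file name, and commit messages all invented) of how a commit message that carries the why surfaces right next to the one line you care about, via `git log -L`:

```shell
# Build a tiny repo whose history answers "why is this line the way it is?"
set -e
cd "$(mktemp -d)"
git init -q -b main demo && cd demo
git config user.email demo@example.com
git config user.name demo
printf 'batch_limit = 10\n' > settings.py
git add settings.py
git commit -qm 'Cap batch_limit at 10: the upstream API rejects larger batches'
printf 'batch_limit = 50\n' > settings.py
git commit -qam 'Raise batch_limit to 50: upstream quota was increased'
# Follow the history of line 1 only: each commit message is printed right
# next to the diff that touched that line, which is where "why" pays off.
git log -L 1,1:settings.py
```

Unlike `git blame`, which only names the last commit to touch each line, `-L` walks the whole history of a line range with the messages attached.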
If it's a pristine, linear commit story, sure.
If it includes merge commits, "fix" commits, commits that do more than one thing, detours, side-quests, or unrelated refactors, then squashing is 100x better.
I can't win with HN critics. If I talk about someone else looking, then I'm assuming. If I talk about myself, then I'm being too self-centered (in the oblique sense you reference). I am very aware of how this works across teams of people, not just myself, since I'm in industry.
But the thing is: this code is terrible, and huge chunks of it are an unholy mix-and-match of code written for a very specific purpose for this or that client with this weird "falsely generalized" code. I don't know what to call that: you have some very specific code, but you insert useless and probably buggy indirections everywhere so that it can be considered "general". The worst kind of code.
Anyways, I was asked by my boss to do such a review. I look at it and I realize that building a database setup to be able to properly run that code is going to take me weeks because I'm going to have to familiarize myself with tons and tons of modules I don't know about.
So I write down my estimate in our tracker: 1 month.
He comes back to me, alarmed. A whole month? Well yeah, otherwise I can't even run the code.
"All you have to do is look at the code!" What? No way, that ain't a review. "Well, I ask you to do it as such." I'm not writing LGTM there.
So I was never asked to do reviews there again (in fact, I stopped working on OpenERP at all), but I could see "LGTM" popping up from my colleagues. By the way, on OpenERP tracker, all you ever saw in review logs was "LGTM" and minor style suggestions. Nothing else. What a farce.
So yeah, as the article says, there are some "LGTM-inducing" type of PRs, but the core of the problem is developers accepting to do this "LGTM-stamping" in the first place. Without them, there would only be reviewable PRs.
If you still need to move fast, then don't.
This is the "don't run in the hallways" version of software culture, but I would contend that you should choose your pace based on your situation. It's just like gradient descent, really: to be efficient, sometimes you need to make big hops, sometimes small ones.
I contend that learning the art of story telling through a stack of patches is just as important and, once learned, comes just as naturally as utilizing vocabulary, grammar, syntax and style with the written word.
If you need to move fast for the next two weeks, sure. If you need to move fast for the next year, you are better off collaborating.
I'm not advocating either way as superior, but cowboy coding shouldn't mean that you don't pay your tech debt. It just means that it's much more common to roll a bug fix or small refactoring improvement in with a feature, probably because you were already touching that code and testing it.
If prod bugs are so critical that there will be a rollback and a forensic retrospective on each one, then yeah you should bite the bullet and use all the most defensive PR tactics. If prod bugs have small costs and you can quickly "roll forwards" (ship fixes) then it's better to get some free QA from your users, who probably won't mind the occasional rough edge if they're confident that overall quality is OK and bugs they do find won't stay unfixed for years.
You can only get basic tweaks accepted. The sunk-cost fallacy is a huge force.
Maybe I've only worked at crappy places
If you aren't making your teammates better, and they aren't making you better, you will never be able to be as good as a team that is greater than the sum of its parts. Individual genius is consistently beaten by professional collaboration.
A decent programmer can write code they can effectively work with. The really great programmers write code that even interns can effectively work with. The only way to get to that level is to get good at eliciting and taking in feedback from the people we want to be effective with our code.
It doesn't necessarily mean doing exactly what we are told: it means understanding the why of a comment, what underlying flaws or confusions a comment is pointing towards. It means encouraging people to ask questions in code reviews, rather than just leave commands: often a "how does this manage to do X?" comment points to a place where bugs were hiding anyway, even if also it is a chance to share a language feature.
Many engineers work in companies where being a really great programmer doesn't get you any points. Often the only reward for writing code that is easily modified later is the gratitude of a future developer asked to make a change to it years down the line.
But I am that developer often enough that that's still made the journey worth it for me.
Confused developers are just unable to create such reasoning.
Which is far more valuable than ten minutes of extra typing.
"How do you create a PR that can be reviewed in 5-10 minutes? By reducing the scope. A full feature should often be multiple PRs. A good rule of thumb is 300 lines of code changes - once you get above 500 lines, you're entering unreviewable territory."
The problem with doing this is if you're building something a lot bigger and more complex than 500 lines of code, splitting that up across multiple PR's will result in:
- A big queue of PR's for reviewers to review
- The totality of the feature is split across multiple change sets, increasing cognitive load (coherence is lost)
- You end up doing work on branches of branches, and end up either having to become a rebase ninja or having tons of conflicts as each PR gets merged underneath you
The right answer for the size of a PR is NOT in lines of code. Exercise judgement as to what is logically easier to review. Sometimes bigger is actually better, it depends. Learn from experience, communicate with each other, try to be kind when reviewing and don't block things up unnecessarily.
AI agents like frequent checkpoints because the git diff is like a form of working memory for a task, and it makes it easy to revert bad approaches. Agents can do this automatically so there isn't much of an excuse not to do it, but it does require some organization of work before prompting.
+100 to this. My job should be thoughtfully building the solution, not playing around with git rebase for hours.
Suddenly rebasing a stack of branches becomes 1 command.
A trivial example would be adding the core logic and associated tests in the first commit, and all the remaining scaffolding and ceremony in subsequent commits. I find this technique especially useful when an otherwise additive change requires refactoring of existing code, since the things I expect will be reviewed in each and the expertise it takes are often very different.
I don't mind squashing the branch before merging after the PR has been approved. The individual commits are only meaningful in the context of the review, but the PR is the unit that I care about preserving in git history.
Maybe it's just a me problem, maybe I need to be more disciplined. Not sure but it catches me quite often.
One technique I use when I find that happening is to check out a clean branch, and first make whatever structural change I need to avoid that rabbit hole. That PR is easy to review, because it doesn't change any behavior and there are tests that verify none of my shuffling things around changed how the software behaves (if those tests don't exist, I add them first as their own PR).
Once I've made the change I need to make easy, then the PR for the actual change is easy to review and understand. Which also means the code will be easy to understand when someone reads it down the line. And the test changes in that PR capture exactly how the behavior of the system is changed by the code change.
This skill of how to take big projects and turn them into a series of smaller logical steps is hard. It's not one that gets taught in college. But it lets us grow even large, complex code bases that do complex tasks without getting overwhelmed or lost or tangled up.
Maybe there's a partial solution if I can keep those commits clean and separate in the tree. And then when I'm done reorder things such that those all happen as a block of contiguous commits.
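For what it's worth, that reordering can be scripted rather than done by hand. A disposable sketch (file and commit names invented) that drives `git rebase -i` through `GIT_SEQUENCE_EDITOR`, so no editor ever opens:

```shell
# Three commits where two refactor commits are separated by a feature
# commit; reorder so the refactors sit together as a contiguous block.
set -e
cd "$(mktemp -d)"
git init -q -b main demo && cd demo
git config user.email demo@example.com
git config user.name demo
echo base > base.txt; git add base.txt; git commit -qm 'base'
echo r1 > r1.txt; git add r1.txt; git commit -qm 'refactor: part 1'
echo f > f.txt; git add f.txt; git commit -qm 'feature'
echo r2 > r2.txt; git add r2.txt; git commit -qm 'refactor: part 2'
# The rebase todo lists the oldest commit first: part 1, feature, part 2.
# This script rewrites it to lines 1, 3, 2: part 1, part 2, feature.
cat > reorder.sh <<'EOF'
#!/bin/sh
{ sed -n 1p "$1"; sed -n 3p "$1"; sed -n 2p "$1"; } > "$1.new"
mv "$1.new" "$1"
EOF
chmod +x reorder.sh
GIT_SEQUENCE_EDITOR="$PWD/reorder.sh" git rebase -q -i HEAD~3
```

Since the commits touch different files, the reorder applies cleanly; when they overlap, you'd resolve conflicts as in any rebase.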
This is a feature. I would infinitely prefer 12 PRs that each take 5 minutes to review than 1 PR that takes an hour. Finding a few 5-15 minute chunks of time to make progress on the queue is much easier than finding an uninterrupted hour where it can be my primary focus.
> - The totality of the feature is split across multiple change sets, increasing cognitive load (coherence is lost)
It increases it a little bit, sure, but it also helps keep things focused. Reviewing, for example, a refactor plus a new feature enabled by that refactor in a single PR typically results in worse reviews of either part. And good tooling also helps. This style of code review needs PRs tied together in some way to keep track of the series. If I'm reading a PR and think "why are they doing it like this" I can always peek a couple PRs ahead and get an answer.
> - You end up doing work on branches of branches, and end up either having to become a rebase ninja or having tons of conflicts as each PR gets merged underneath you
This is a tooling problem. Git and Github are especially bad in this regard. Something like Graphite, Jujutsu, Sapling, git-branchless, or any VCS that supports stacks makes this essentially a non-issue.
the point is not queue progression, it is about dissemination of knowledge
one holistic change to a project = one PR
simple stuff really
> A big queue of PR's for reviewers to review
Yes, yes please. When each one is small and understandable, reviewers better understand the changes, so quality goes up. Also, when priorities change and the team has to work on something else, they can stop in the middle, and at least some of the benefits from the changes have been merged.
The PR train doesn't need to be dumped out in one go. It can come one at a time, each one with context around why it's there and where it fits into the grander plan.
> The [totality] of the feature is split across multiple change sets, increasing cognitive load (coherence is lost)
A primary goal of code review is to build up the mental map of the feature in the reviewers' brains. I argue it's better for that to be constructed over time, piece by piece. The immediate cognitive load for each pull request is lower, and over time, the brain makes the connections to understand the bigger picture.
They'll rarely achieve the same understanding of the feature that you have, you who created and built it. This is whether they get the whole shebang at once or piecemeal. That's OK, though. Review is about reducing risk, not eliminating it.
> You end up doing work on branches of branches, and end up either having to become a rebase ninja or having tons of conflicts as each PR gets merged underneath you
I've learned not to charge too far ahead with feature work, because it does get harder to manage the farther you venture from the trunk. You will get conflicts. Saving up all the changes into one big hunk doesn't fix that.
A big benefit of trunk-based development, though, is that you're frequently merging back into the mainline, so all these problems shrink down. The way to do that is with lots of small changes.
One last thing: It is definitely more work, for you as the author, to split up a large set of changes into reviewable pieces. It is absolutely worth it, though. You get better quality reviews; you buy the ability to deprioritize at any time and come back later; most importantly for me, you grasp more about what you made during the effort. If you struggle to break up a big set of changes into pieces that others can understand, there's a good chance it has deeper problems, and you'll want to work those out before presenting them to your coworkers.
Use jujutsu and then stacking branches is a breeze
Of course that won't work for all projects/teams/organizations. But I've found that it works pretty well in the kinds of projects/teams/organizations I've personally been a part of and contributed to.
This shouldn't matter unless you are squashing commits further back in the tree before the PR or other people are also merging to main.
If a lot of people are merging back to main so you're worried about those causing problems, you could create a long-lived branch off main, branch from that and do smaller PRs back to it as you go, and then merge the whole thing back to main when you're done. That merge might be 2k lines of code (or whatever) but it's been reviewed along the way.
I don't necessarily disagree with you. Just pointing out that there are ways to manage it.
But for the company, having two people capable of working on a system is better than one, and usually you want a team. Which means the code needs to be something your coworkers understand, can read and agree with. Those changes they ask for aren't frivolous: they are an important part of building software collaboratively. And it shouldn't be that much feedback forever: after you have the conversation and you understand and agree with their feedback, the next time you can take that consideration into account when you are first writing the code.
If you want to speed that process up, you can start by pair programming and hashing out disagreements in real time, until you get confident you are mostly on the same page.
Professional programming isn't about closing tickets as fast as possible. It is about delivering business value as a team.
If you don't have feature flags, that is step one. Even if you don't have a framework, you can use a Strategy or a configuration parameter to enable/disable the new feature, and still have automated testing with and without your changes.
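A minimal sketch of that configuration-parameter approach (all names here are invented, not from any particular framework), which lets automated tests exercise both the old and new code paths:

```python
# Hand-rolled feature flag: a config parameter selects which strategy runs.
# Names (compute_price, use_new) are hypothetical examples.

def compute_price_legacy(quantity: int, unit_price: float) -> float:
    # Existing behavior: no discounts.
    return quantity * unit_price

def compute_price_new(quantity: int, unit_price: float) -> float:
    # New behavior behind the flag: 10% bulk discount at 100+ units.
    total = quantity * unit_price
    return total * 0.9 if quantity >= 100 else total

def compute_price(quantity: int, unit_price: float, use_new: bool = False) -> float:
    # The flag picks the strategy, so tests can run with and without
    # the change merged but disabled.
    impl = compute_price_new if use_new else compute_price_legacy
    return impl(quantity, unit_price)
```

The new code can then be merged dark, with the flag defaulting to off, and flipped on later.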
For sure if you can say LGTM without even looking at anything it doesn't make much sense
>His example PR[0] adds just 152 lines of code, removes 2 lines, but uses 13 thoughtful commits.
>While some developers might understand those 152 lines from the final diff alone, I couldn't confidently approve it without the commit story.
This is ridiculous!
You absolutely can and should review a PR without demanding its "commit story."
Go read the PR under discussion here.[0] There's nothing about it that's hard to understand or that demands you go read the 13 intermediate steps the developer took to get there.
The unit of change in a code review is the PR itself, not the intermediate commits. Intermediate commits are for the author's benefit, not the reviewer's. If the author rewrote the code in FORTRAN to help them understand the problem, then converted it back to the codebase's language, that's 100% okay and is not something the reviewer needs to care about.
The PR should squash the individual commits at merge time. The linked PR[0] is a perfect example, as the relevant change in the permanent commit history should be "Measure average scheduler utilization" and not "Collect samples" or "Mock sampling."
When you need to communicate extra context outside of the code, that should go in the PR description.[1] Your reviewer shouldn't have to go scour dozens of separate commit messages to understand your change.
>How do you create a PR that can be reviewed in 5-10 minutes? By reducing the scope. A full feature should often be multiple PRs. A good rule of thumb is 300 lines of code changes - once you get above 500 lines, you're entering unreviewable territory.
5-10 minute reviews are so short that they're basically thoughtless rubber-stamping.
If someone spent 5-10 hours making a change, the reviewer should definitely think about it for more than 5 minutes. If all the reviewer is doing is spot checking for obvious bugs, it's a waste of a code review. The reviewer should be looking for opportunities to make the code simpler, clearer, or more maintainable. 5-10 minutes is barely enough time to even understand the change. It's not enough time to think deeply about ways to improve it.
[0] https://github.com/sasa1977/hamlet/pull/3
[1] https://refactoringenglish.com/chapters/commit-messages/
Commits are not important. As an author, you should not waste your time on this. As a reviewer, just ignore them.
I don't even know how commits could benefit only the author; if they're poor they won't help him either, except as a log of how much work he's done.
Unless you make a PR for every insignificant change, PRs will most often be composed of series of changes; the individual commits, if crafted carefully, will let you review every step of the work of the author quickly.
And if you don't eschew merges, with commits you can also group series of related modifications.
Intermediate commits are just checkpoints of unfinished code. The author knows that they made them and can revert back to them, or use git log -S (the pickaxe) if there's code they saved to a checkpoint and want to recover.
Intermediate commits can have meaningful commit messages if the author chooses, but they could also just be labeled "wip" and still be useful to the author.
It's really easy for a note someone writes to themselves to be useful to that person without being useful to other people. If I write a post-it on my desk that says "beef," that can remind me I need to pick up beef from the store, even though if my co-worker reads it, they wouldn't know what the note is trying to communicate.
>PRs will most often be composed of series of changes; the individual commits, if crafted carefully, will let you review every step of the work of the author quickly.
I don't understand this expectation that an author produce this.
What if the author experimented with a lot of approaches that turned out to be dead ends? Is it a good use of the reviewer's time to review all the failed attempts? Or is the author supposed to throw those away and reconstruct an imaginary commit history that looks like their clean, tidy thought process?
If someone sends me a short story they wrote and asks for feedback, I can give them feedback without also demanding to see every prior draft and an explanation for every change they made until it reached the version I'm reviewing.
The unit of change for a code review is a PR. The intermediate commits don't matter to the reviewer. The unit of change for a short story is the story. The previous drafts don't matter to the reader.
> It's really easy for a note someone writes to themselves to be useful to that person without being useful to other people.
After a few months it will probably be as useful to you as to anyone else; if you only use commits as some sort of help while developing, you might as well just squash them before making a PR.
> What if the author experimented with a lot of approaches that turned out to be dead ends? Is it a good use of the reviewer's time to review all the failed attempts? Or is the author supposed to throw those away and reconstruct an imaginary commit history that looks like their clean, tidy thought process?
Yes, except that it doesn't matter if it's their thought process or not; it doesn't take a ton of time to reorder your commits, if you had some care for them in the first place.
It doesn't make much sense to place failed attempts in a series of commits (and of their reverts), just go back to the last good commit if something was a dead end (and save the failed attempt in a branch/tag, if you want to keep it around).
> If someone sends me a short story they wrote and asks for feedback, I can give them feedback without also demanding to see every prior draft and an explanation for every change they made until it reached the version I'm reviewing.
It's not the individual commits themselves that you need to review (although you can do that, if you place a lot of value on good histories); going through each commit, if they're indeed not just random snapshots but have been made thoughtfully, can let you review the PR a lot faster, because they'll be small changes described by the commits' messages.
yeah for sure you want to squash-merge every PR to main, right?
commits are just commits, there is no moral value to them, there is no "good history" or "bad history" of them, whether or not they're "made thoughtfully" isn't really interesting or relevant
git is just a tool, and commits are just a means to an end
Oh god you're serious?
> git is just a tool, and commits are just a means to an end
To more ends than you realize, probably, if you put some care in making them
commits are what i say they are, nothing more or less
Ok, good is subjective, I guess, so let's say commits with good descriptions, all the information that could be useful to understand what they do (and where appropriate, why), and a limited and coherent amount of modifications in each; in short, commits that are easy to follow and will provide what you need to know if you come back to them later.
THANK YOU for saying this. Reading through the discussion, it almost feels that people refuse to put like 3h over a weekend to actually learn git (a tool they use DAILY), and prefer instead to invent arguments why squash merging is so great.
> It doesn't make much sense to place failed attempts in a series of commits (and of their reverts), just go back to the last good commit if something was a dead end (and save the failed attempt in a branch/tag, if you want to keep it around).
I agree that failed attempts are bad to have as code history. If you reasonably split your commits, the commit message has ample space to document them: "Used approach X because... Didn't use approach Y because..."
When I said my top preference for AI usage, by far, would be to eliminate human code reviews, the response was basically, "Oh, not like that."
Vision should not be AI code, but it should be AI beyond code.
/s
You end up creating more work for the reviewer, and most people just simply won't do the work of a proper review. You also don't have the advantage of any CI or tests running across the entire set of changes, so you also have separate CI reports to review. All this adds up for more places for bugs to hide or happen. All the same risks are still there, and you've also added a few more points of failure in the splitting process.
And for what? To end up, most likely, merging in one PR after the next, for a feature that should just be all logically grouped together, or just squashing and merging the PRs together anyway.
Usually what I do is check out their last PR, figure out what I want to say, and then identify the appropriate place to leave a comment in their stack of PRs. Which is a lot more work for me. And this assumes that they’ve even finished all their PRs instead of expecting them to merge in one at a time
Get your potential code reviewers involved before you even start coding. Keep them abreast of the implementation. When it comes time for a review they should be acquainted with your work.
As the PR author it should be your job to:
* Self-review the PR
* Ensure you fulfill all the expectations and requirements a PR should have before it's pulled out of draft
* Ensure all pipeline tests pass
* Ensure that for a given request X there are tests that validate X, Not X, and edge cases of X, and that they run in the pipeline
* Include a clear description of what you're changing/adding/removing, why, how, and the rollout plan, rollback plan, and risk level
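The "X, Not X, and edge cases" point might look like this in practice (the function and its valid range are hypothetical, purely for illustration):

```python
# Hypothetical validator used to illustrate testing X, Not X, and edges:
# the happy path, the rejection paths, and the boundaries each get a check.

def is_valid_quantity(qty) -> bool:
    # Accept integers in the inclusive range 1..999 (booleans excluded,
    # since bool is a subclass of int in Python).
    return isinstance(qty, int) and not isinstance(qty, bool) and 1 <= qty <= 999

# X: valid requests are accepted
assert is_valid_quantity(5)
# Not X: invalid requests are rejected
assert not is_valid_quantity(-1)
assert not is_valid_quantity("5")
# Edge cases: boundaries and just past them
assert is_valid_quantity(1) and is_valid_quantity(999)
assert not is_valid_quantity(0) and not is_valid_quantity(1000)
```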
The peer review process should make the reviewer engage in a rubber-duck process to review the code, loop the team in on changes that can change their mental model of how a system they own works, and catch things that we might not catch ourselves.
Not to the mention the security implications
If you don't do standups, just do the same thing ad-hoc on the team chat channel.
Larger tasks and discussions are probably the purview of RFCs.
If the first time your reviewer sees what decisions you've made is when the review happens, then of course it will be overwhelming if the merge request is large.
If you keep your reviewer in the loop, and have bounced implementation ideas off of them, then the review basically just becomes a sanity check.
Some teams and code are pretty much unreviewable, and the best thing is to add CI for simple tests and LGTM if there are no glaring mistakes.
My team only has 3 including the manager, so, eh, each one holds a lot of knowledge that only he or she knows. Documentation? Yeah that’s a good idea, but I don’t have time to read them because “We want to ship as fast as possible”. So I just put up a precommit for testing and plug in the same tests for the CI and call it a day. If you pass the CI I’ll take a cursory look and LGTM.
Some code is unreviewable. I work as a DE and it's all entangled business logic. Why is there a specific id excluded? What is the purpose of this complex join? I mean, the reviewer is not supposed to know everything, right? So the best thing, again, is to only look at the technical things (is the join done properly), let the CI figure out the weird stuff, and LGTM.
Uhm he kind of is, if you want a proper review...
It takes two seconds to write `EVIL_IDS_THAT_STOLE_OUR_LUNCH_MONEY=[1] [...] NOT IN EVIL_IDS_THAT_STOLE_OUR_LUNCH_MONEY` instead of `NOT IN [1]` and then not hate your past self six months from now when you have to figure out why something has been excluded.
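The same idea in hypothetical query-building code (the IDs, constant, and table names are all invented for illustration):

```python
# A named constant documents *why* rows are excluded; a bare "NOT IN (1, 7)"
# inline in the query would not. Values and names are hypothetical.
FRAUDULENT_TEST_ACCOUNT_IDS = (1, 7)  # created by QA, must never be billed

def billing_query(excluded=FRAUDULENT_TEST_ACCOUNT_IDS) -> str:
    # Real code should use parameterized queries; this is just a sketch.
    placeholders = ", ".join(str(i) for i in excluded)
    return f"SELECT * FROM accounts WHERE id NOT IN ({placeholders})"
```

Six months later, git blame on the constant tells the story without a meeting.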
If some code can't be understood today, it's not going to be able to be understood when someone comes to modify it. Maybe in your domain all the code is write-once-read-never, but for those of us maintaining enterprise software that isn't an option.
We absolutely expect our reviewers to fully understand the code. If you want a quick shortcut to make that cultural change, have any bugs in code go to the code reviewer instead of the original author. You will find people start taking code reviews a whole lot more seriously.
I suspect you don’t work as a DE or your company has top notch culture. I have never worked in any companies (including some big ones) that encourage a full understanding of any code submitted to PR review.
i know that lots of orgs don't do it this way, but it's really important to make this point very clear: those orgs are wrong, and pathological, no matter how many of them there are
> [...] Telling a Story with Commits [...]
> [...] it should take the average reviewer 5-10 minutes [...]
Jane Street code review system kinda solves this problem by
- making each commit a branch,
- stacking branches on top of each other (gracefully handling rebases and everything that comes with it), and
- reviewing one iteration at a time
So one reviews single commits independently (takes probably around 5mins), and "forces" the reviewer to re-live the story that led to the bigger diff.
I do not work at Jane Street but I frequently find myself pondering how broken the common code review system/culture is. I've heard of tools like graphite.dev that build on top of git to provide a code review system similar to the Jane Street one, but I'm not an active user yet (I just manually stack PRs, keep them small, ask my colleagues to review them one at a time, and handle the rebasing etc. myself for now).
jj[1] is helpful for this. If you have a chain of commits like A -> B -> C, and you make a change to A, the rest of the chain is automatically rebased.
Shared between the implementer and the reviewers, that is, which means design brainstorming, design formalization of some kind (writing the significant aspects down, or recording them in some other way), and a review process.
I should also say: this process doesn't have to be any larger or more heavy-weight than the change itself. And changes that don't have a design aspect can skip it entirely. But this article is talking about building a story with commits, and at that level you're almost surely talking about a significant design aspect.
The actual code review phase for me is more about making sure the checkin is clean and that what I am intending to work on wont get caught up in a conflicting mess. The code review is NOT a recurring opportunity to purity test my teammates. Presumably, the reason they are working with us in the first place is because they already succeeded at this. "Trust but verify" is a fun trope if you are working somewhere the consequences of a mistake are one-way and measured in millions of dollars. However, a bad commit can be reverted in 10 seconds. Builds of software can be easily recreated. Deploying to production is still sensitive, but why get all weird about rapidly iterating through dev or QA environments?
Newspaper reporters are professionals and they still have editors. We've all seen the disaster that results when an author gets too famous to edit. And we don't have to go in and work with their prose later.
Code reviews are where "my" code becomes "our" code: I want my coworkers to feel comfortable with and fully understand and be happy to support the changes I am proposing to our software.
I didn't say this.
There are many forms of extremely valuable feedback that do not involve subjecting your teammates to a ritual of "clean code" every time a PR is submitted.
I picked up this habit from an early teammate (and manager, who eventually went back to just being a teammate because he didn't love being a manager) who recommended it, and in places I've worked where they've had struggles with their review culture, I've had colleagues express to me how much they love that they do this and mention to me that they've sometimes started asking other teammates to do it for certain changes (e.g. "some of this code looks like it might have gotten moved around without changing, but it's not obvious from the diff, do you think you could go through and note wherever that happened?").
At the end of the day, teams will function best when there's mutual good faith and respect for each other's time. (Obviously some teams are lacking this to various degrees, but at that point I don't think code review is really the larger problem, just a symptom of the underlying dynamic that either needs to somehow be addressed or the team will never work well.) Recognizing where you can save your team time overall by spending some of your own is pretty useful with that in mind, and code review has a lot of low-hanging fruit in this regard: the context the PR author has makes the effort of preemptively helping reviewers understand things quite low compared to the reviewer having to ask, and the return on the author's time scales with the number of reviewers.
Also, code review should be ego free: 1) criticize the code, not the author 2) don’t be too attached to code written, the objective is the product and not number of LoC contributed 3) it’s okay to start from scratch after learning about a better approach, or even make more than one and compare approaches.
Where most teams fail is treating it as a gatekeeping process rather than context sharing, make PRs too small to be meaningful or only waste time arguing about code style and other minutiae.
There are no silver bullets or magical solutions, but this is as close to one as I've ever seen. A true "best practice" distilled from the accumulated experience of our field, not from someone with something to sell.
It depends very much on your coworkers, unfortunately. When a team is all pulling in the same direction, and is kind and constructive and rigorous, code reviews can be awesome.
In some companies, especially ones with stack ranking where one person doing better means someone else does worse, it is easy for them to go horribly awry and become an ordeal of a bug hunt. The obvious solution is to not work at horrible companies that pit engineers against each other, but when that describes some of the biggest employers in the industry it's easier said than done.
my commits include lots of false-starts that get abandoned and "i need to commit this interim state because i deprioritized this and will come back later".
the sequence of events that i used to build the thing isn't necessarily the best sequence of events to tell the story of what the final thing is.
sometimes you can get a good story by stacking PRs, but if the stack gets too deep you can end up with some rebase nightmares.
Let main be the story of how code got from point A to B, and PRs be the story of how each incremental step was made.
I guess I was worried that nobody would want to read the commits, but I really like the thought that what I've been providing is a simple narrative thread to help guide a reader through my train of thought.
1) Each PR (even if it's part of a larger whole) can be committed and released independently.
2) Each PR has a description of what it's doing, why it's doing what it's doing, and, if it's part of a larger whole, how it fits into the broader narrative.
3) PRs are focused, meaning that QOL or minor bugfixes should not be part of a change that is tackling something else.
4) PRs are as small as possible to cover the issue at hand.
5) All PRs are tested and testing evidence is presented.
6) When a PR is committed to master, the final narrative from step 2) becomes the Git commit, along with the testing evidence; it includes the JIRA ticket number in the headline and the reviewer in the body of the git commit.
This way we have a clean, auditable, searchable history with meaningful commit history that can help reconstruct the narrative of a change and be used as a guide when looking at a change in, say, a year.
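Concretely, a merge commit following this convention might read something like (the ticket number, names, and evidence are all hypothetical):

```
PROJ-1234: Add retry with backoff to payment webhook delivery

Why: deliveries to slow partner endpoints were timing out and being
dropped; this retries up to 3 times with exponential backoff.

Part of: webhook reliability work (PR 2 of 4).

Testing: unit tests cover the backoff schedule; staged against a flaky
endpoint and verified all deliveries eventually succeed (evidence in PR).

Reviewed-by: A. Colleague
```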
I also find doing it like this either incredibly hard, or I have to do a ton of git magic after I'm done to get commits into this state, which is very frustrating.
I think it might be the codebase I work on but who knows.
It is very telling in the article itself there is a screenshot of the commits tab in the PR workflow that many don't realize even exists and/or never think to use.
In the Files tab the commit picker has gotten better in recent years, but it is still overly focused on selecting ranges of commits over individual ones, and there's no shortcuts to easily jump Next or Previous commit, you have to remember your place and interact with the full pulldown every time. Also, it's hard to read the full descriptions of commits in the Files view and I find I often have to interrupt my flow to open the commit in another browser tab or flip back and forth between the Commits tab and the Files tab in the PR. The Commits tab also defaults to hiding most of the commit descriptions, so it's still not a particularly great reading experience.
It feels like a bit of a bad feedback loop that GitHub's UI doesn't make commit-by-commit reviewing clean/easy because GitHub themselves don't expect most developers to write good commits, but a lot of developers don't write good commits today simply because GitHub's PR interface is bad at reviewing individual commits and developers don't see as much of a point in it if they aren't going to be reviewed in that way.
The PR should be the smallest unit of integration (this works and builds and is ready to merge to the next steps), but the commit is the smallest unit of any progress at all. Ideas don't come fully formed and ready to compile. Progress sometimes includes back tracks and experiments. Good commits say things "hey, I learned from this thing that wasn't working and that's what pushed me into this next direction". They document the journey, why a specific path was taken, what obstacles were in the way, what other paths were explored and dismissed.
Some PR authors can capture a lot of that in a PR description as well, but commits tie that to specific context of the code at a point in time in ways that a PR description often can't, without linking to commits (in which case the commits again speak for themselves).
But yes, not everyone can or will write good commits. I see that partly as a tooling failure, because our tools themselves, like GitHub PRs, don't encourage it and often fail to reward it. I've seen plenty of PRs full of commits named only "WIP" and "Fix" and "oops", but the best PRs tell a story in the commits and have meaningful descriptions on each commit. I would love for our tools to encourage more of those, because I think they are better PRs: often easier to review, and enough good PRs like that form a string of documentation you can search through (via git blame and git log if nothing else, but that's still a lot of useful research data). (If you keep that data, that is. I know a lot of people like squash merges because they distrust the git DAG and how many tools show ugly or hard-to-read "subway diagrams" for it instead of simpler collapsible views. But that's another long conversation.)
some people treat commits as meaningful units of independent review, and some people treat them as savepoints and the PR as the only meaningful unit of review, it's a distinction of process, not purity -- both approaches are totally fine, one is not better than the other
git and commits and prs are means, not ends
Sounds nice but I’m sure that there are projects out there that are like constantly being in the trenches, testing in prod and the original developers being long gone, where the devs manage to barely keep alive this WH40kesque monstrosity. Where all of the code has a lot of incidental complexity that will just never be resolved.
Alternatively, there’s probably countries and companies out there where that attitude would get you flagged as someone who is slowing down the velocity and get you fired - because there’s an oversaturation of devs in the local market and you’re largely an expendable cog.
Surely there’s other methods, formal or LLM driven that could be used to summarize changes and help explore everything bit by bit (especially when there’s just one big commit that says “fix” as a part of the PR).
Sometimes I get too caught up with coming up with contrived scenarios where “best effort” just will never happen but I bet someone out there is living that reality right now.
Especially if the software matters at all.
I purposefully find jobs in fields like healthcare and physics and finance where it actually matters that the software works. Right now, if there is a bug in my code people could die.
And in that case, there are worse things than being fired.
If people do find themselves in that situation, the best answer is "unionize". The second best answer is to work with your coworkers to adopt better practices (it is very unlikely that they are going to fire all of you at once). And the third best answer is to do the job well, regardless of what is going on around you, and if you get fired you get fired.
That's a pretty nice way to view things! I've been lucky enough to mostly be in environments where I can work towards that (even if there is the occasional enterprise legacy project that can feel as a maze).
But at the same time, even in better circumstances, people can get lazy, I certainly do, and are sometimes also set in their ways. In those cases, I'll take any automated tools that will make it easier to do things in more organized and better ways - e.g. automatic code formatting with a linter for consistency, or various scripts to run on the CI server, to check codebase against a specific set of rules (e.g. "If you add a reusable component, you should also add it somewhere in the showcase page").
I wonder what automation and tooling could be used to manage the complexity of some pull requests, even when people have deadlines or sub-optimal understanding of the whole codebase, or any number of other reasons that make things more difficult. Things like better code navigation tools (even in the web UI), or more useful dependency or call graphs, a bit like what SourceTrail did back in the day, before going bust: https://github.com/CoatiSoftware/Sourcetrail
They, the reviewers, have to eat it, not me.
Some reviewers want Wonder Bread with a slice of spam.
Some want hand-made spreads with home-grown vegetables.
It's easier to fix a sandwich than a wedding cake.
The take-away
Don't do wedding cakes.
git add --patch
...is your friend if you want to leave all your changes unstaged for a while, then break them out into multiple commits later.

* squash everything I've done into one commit
* create a new branch off main/master that will be the “first commit”
* cherry-pick changes (easy from some git guis) that represent a modular change.
* push and make an MR from the new branch
* rebase “the big commit” on top of the partial change.
* wash, rinse and repeat for each change, building each MR off its requisite branch.
The squashing part is vital because otherwise you enter merge conflict hell with the rebase.

* squash into one commit
* git reset HEAD~1
* git add -p
* git commit -m commit1
* repeat until no changes are left
* add any file deletions/additions
I use this because you can have several commits marked e.g. "commit1". Then you make a final interactive rebase to squash them together.

If you're interested in trying these strategies anyway, does your editor of choice have an inline "git blame"? In IntelliJ, I can see who made the most recent change in the lines around the one I'm working on, and when.
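The squash-then-resplit loop described a few comments up can be scripted end to end. This sketch splits by file with plain `git add`, since the interactive `git add -p` (which also lets you split hunks within a file) can't run unattended; repo and file names are invented:

```shell
# Non-interactive sketch of: squash, reset, then re-commit in pieces.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.email dev@example.com
git config user.name dev
git commit -q --allow-empty -m "base"

# One messy work-in-progress commit touching two unrelated concerns.
echo "parser fix" > parser.txt
echo "new feature" > feature.txt
git add . && git commit -qm "wip: everything at once"

# Undo the commit but keep all changes in the working tree...
git reset -q HEAD~1

# ...then stage and commit each concern separately (git add -p would
# let you go finer-grained than whole files here).
git add parser.txt && git commit -qm "Fix parser edge case"
git add feature.txt && git commit -qm "Add feature behind flag"
```

The history now has two focused commits instead of one mixed one.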
It doesn’t resolve the “which files have I worked on” issue; but it might help the others? Not as nice as a different colored line like uncommitted code would otherwise be highlighted, but it could be enough of a step in that direction?
Seems like it would be even better if VS Code provided a way to highlight all lines changed relative to a particular commit like the start of a branch. Maybe it's worth filing a feature request?
(I don't use VS Code this way so I'm assuming it doesn't already have this.)
It does require a bit of a paradigm shift sometimes to not rely as much on seeing all cumulative changes for the ticket highlighted as you code, and instead compartmentalize your immediate view to the current commit's task, but often the above 2 alternatives help suffice. Of course, you did mention that you'll commit stuff you're not likely to touch again, which helps a lot too
Write everything in one go, and then afterwards rework the commits to tell the story you want to tell. This does require getting very comfortable with git rebase, but I can absolutely recommend it.
Using this technique, I've managed to merge several large refactors lately with no issues, and the reviewers didn't hate me.
git is a means, not an end
code review is about the code as a unit whole, not the steps along the way!
I'm able to work on long lived branches without getting into conflict hell because when I rebase, I'm only dealing with small conflicts.
I'm able to slim down initially large changes into smaller total diffs because I realised one change wasn't actually necessary, but only because I took the time to reflect on the code and separate the concerns
Being able to separate your code into smaller units is a really great tool, and helps you understand your own code changes in a new light. Amusingly, despite often rewriting the same code 3 times, I feel like I've never been more productive (and no, I don't use any LLMs).
Meaning, I can keep committing while also able to see the full changes evolve.
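One low-effort way to do this "rework the commits afterwards" step is `git commit --fixup` plus an autosquash rebase: you keep committing freely, marking which earlier commit each fix belongs to, and one rebase folds everything into place. A sketch with invented names (`GIT_SEQUENCE_EDITOR=true` just accepts the generated todo list non-interactively):

```shell
set -e
cd "$(mktemp -d)" && git init -q
git config user.email demo@example.com
git config user.name demo
echo base > a.txt && git add . && git commit -qm "base"
echo v1 > f.txt && git add f.txt && git commit -qm "add feature"
# Later: a correction that logically belongs in "add feature".
echo v2 > f.txt
git commit -qa --fixup=HEAD
# Fold the fixup into its target commit:
GIT_SEQUENCE_EDITOR=true git rebase -i --autosquash HEAD~2
git log --format=%s
```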
In practice:
- smaller PRs aren’t necessarily easier to review (and this arbitrary obsession almost always leads to PR overload in chunks that don’t make any sense, reducing code quality as a result)
- nobody reads intermediate commit messages one by one on a PR, period. I worked on a team where the lead was adamant about this and started to write messages in the vein of “if you’re reading this message, I’ll give u $5”. I never paid anyone a dollar. Don’t waste your time writing stuff for no one.
- “every commit must compile” - again, unnecessary overzealousness. Every commit on the MAIN branch definitely should compile. Wasting your time with this in a branch, as you work towards a solution, is focusing on the wrong thing
You want PRs because they help others absorb what you’re doing (they’ll have to read that same code sooner or later). You don’t want to create a performance theater.
PRs are emails to your team and to your future self.
Framed in that context it's easier to carry the correct tone and think about scoping / what's important.
---
> pedantically apply DRY to every situation
I swear DRY has done more damage to the software industry from the developer side than it has done good because it has manifested into this big stick with which to bludgeon people without taking context into account.
This should be commits though. Typically, developers would look for clues in this order:
code -> code comment -> commit message -> PR text -> external document
So commit messages put the information closer to the user. One hop doesn't seem much, but the time saved adds up as you go.
Also, as some other reader mentioned anecdotally, PRs may not be there forever. E.g. your team may migrate to a new platform, and the PR text and reviews get left behind.
If you decide to do merges without squashing, then yes, you have to have more hygiene on each individual commit. It creates a lot of unnecessary friction and it's guaranteed to be slower (devs can't use commits as checkpoints/savepoints in their work; rather, each commit becomes a fully fleshed out "intermediate final state"). The only situation where I see this making sense is if you share work on a branch with other engineers (which is also a bad idea).
But they can! In git you can do whatever you want with your local/remote working branch. And after you're done it's pretty straightforward to massage it into a coherent series of commits (especially if you had been working with that in mind).
> each commit becomes a fully fleshed out "intermediate final state"
This is really a team decision. You can allow intermediate commits to e.g. fail the tests, and add a tag to your main/master after each merge. Then you know that only the tagged commits are guaranteed to be fully functional.
Why waste time? Just squash and merge, you have a single commit and it WORKS. Intermediate messages disappear and you have a single, atomic rollback point on your main branch
> You can allow intermediate commits to e.g. fail the tests, and add a tag to your main/master after each merge. Then you know that only the tagged commits are guaranteed to be fully functional.
OR… squash and merge. Block merging with tests and compilation passing
For anything in tech, there’s the frictionless way and the busywork way. Both of your examples are busywork that’s completely unnecessary if you just… squash and merge
The best process is the process nobody needs to remember to do shit for it to work
You always have the PR discussion to refer to, until you move to a different platform to cut costs.
You can always ask the author of the code, until they have left the company.
Everywhere I've worked the past few years squashes PRs on merge with the PR becoming the commit title + message so the context lives on in the git history.
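What forges do on "squash and merge" can be approximated locally with `git merge --squash`: the branch collapses into a single staged change, and the PR title and description become the commit. A self-contained sketch (branch name, file names, and the PR number are all made up):

```shell
set -e
cd "$(mktemp -d)" && git init -q
git config user.email demo@example.com
git config user.name demo
echo base > base.txt && git add . && git commit -qm "base"
git branch -M main
git checkout -qb feature
echo a > a.txt && git add . && git commit -qm "wip"
echo b > b.txt && git add . && git commit -qm "argh, forgot this bit"
git checkout -q main
# Collapse the branch into one staged change, then commit it with
# the PR title and description as the message:
git merge --squash feature > /dev/null
git commit -qm "Add a and b (#123)" -m "PR description becomes the body."
git log --format=%s
```

Main ends up with exactly one commit for the whole PR; the "argh" commits never reach it.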
Indeed! I've found many points in this discussion answered by the Linux kernel's mailing-list model, where a change is discussed and then approved, often with feedback acknowledged.
* Yes, TDD on production code is nice in theory, but it doesn't work in my case.
* Yes, short PRs are nice in theory, but it doesn't work in my case.
In every case, as far as I can see, it meant "It does work, I just don't know how to do it".
When I say "if you don't think it works in your case, come to me, I'll show you" they often demur and I end up with a huge PR anyway.
In practice I don't think I've ever seen a long PR that wouldn't have benefited from being strategically broken up, but every other day I see another one that should have been.
Parent said something more along the lines of "they don't work in every case, and trying to force it in every case is misguided".
I agree that too big is more common than too small with respect to PR size, but you aren't putting forward much of an argument against parents "there are no absolutes" argument by straw manning them.
I'm fairly sure that I could explain how to break up any long PR in a sensible way. Parent thinks it couldn't be done, so do you - what is an example?
The only exception I can think of is something where 99.9% of the changes are autogenerated (where I wouldn't really be reading it carefully anyway, so the length is immaterial...).
> I'm fairly sure that I could explain how to break up any long PR in a sensible way. Parent thinks it couldn't be done, so do you - what is an example?
To me, when I meet experts in any field, the quality that stands out isn't that they do everything to expert level, it's that they get everything done as they said they would. Sometimes that means big PRs, because that's the environment created, and the expert finds the way to get the job done.
I'm not doubting you _could_ break up any PR into a shorter one. But that's kind of the point of an expert: they recognise what makes sense to do in reality, rather than just doing something because it's best practice and expecting everyone else to do the same.
They ultimately get the thing done how they said they would.
This whole chain is like arguing on how tidy your desk should be. Some people like it fastidious to the nth degree. Some people prefer a little mess.
In neither case does that preference really matter much compared to all the other things a real job entails.
This. It saves everyone's time.
I have seen plenty of huge PRs which were more trouble than they were worth to break up after discovery. At some point it becomes like unbaking a cake. It's a trade off.
I've just never thought, when I saw any of them, that there wasn't a more practical way to get there with a bunch of smaller PRs.
Unlike dealing with an already existent large PR, this isn't really a trade off thing - there are basically almost no circumstances when it is preferable to review one 1000 line code change instead of 4x self contained 200 line changes.
Not couldn't - but shouldn't, such as when there's tight coupling across many files/modules. As an example, changing the css classes and rules affecting 20+ components to follow updated branding should be in one big PR[1] for most branching strategies.
Sometimes it's easier to split this into smaller chunks and front-load reviews for PRs into the feature branch, and then merge the big change with no further reviews, which may go against some ham-fisted rule about merging to main. Knowing when to break rules and why, ownership, and caring for the spirit of the law and not just the letter are what separates mid-levels from seniors.
1. Or changeset, if your version control system allows stacking.
How would you do this? You'd either
1. Create N pull requests then merge all of them together into a big PR that would get merged into mainline at once
2. Do the same thing but do a bit of octopus merging, since git merge can take multiple branches as arguments
Since most source control strategies are locked down, this isn't usually something that I can tell my juniors to do.
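Locally, option 1 is just branches stacked on each other: each PR's reviewable diff is the range between adjacent branches, and when the bottom of the stack merges you rebase the rest. A sketch with invented branch names:

```shell
set -e
cd "$(mktemp -d)" && git init -q
git config user.email demo@example.com
git config user.name demo
echo base > base.txt && git add . && git commit -qm "base"
git branch -M main
git checkout -qb step1
echo refactor > refactor.txt && git add . && git commit -qm "step1: refactor"
git checkout -qb step2            # based on step1, not main
echo feature > feature.txt && git add . && git commit -qm "step2: feature"
# The reviewable diff for the second PR is just step1..step2:
git diff --stat step1..step2
# After the first PR merges, rebase the rest of the stack:
git checkout -q main && git merge -q --ff-only step1
git rebase -q main step2
```

On a forge, the PR for `step2` would target `step1` rather than `main`, so each review shows only its own slice.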
The point of breaking things down like this is to minimize reviewer context. With bigger PRs there's a human tendency to try and hold the whole thing in your head at once, even if parts of the pull request are independent from others.
This principle is much more important than some rule that says "Merges to main should not be more than 150 lines long". Sticklers for hard-and-fast rules usually haven't achieved the experience to know that adhering to fundamental principles will occasionally direct you to break the rules.
This can be done by allowing a flag in the commit message that bypasses the 150 line long (or whatever example) rule in the CI that enforces it. Then the reviewers and submitter can agree whether or not it makes sense to bypass the rule for this specific case.
In many cases like this, it's okay to override a rule if the people in charge of keeping the codebase healthy agree it's a special case.
code review is meant to take time
No, this is a pretty classic example of where you can break up the work by first refactoring out the tightly wound coupling in one PR before making the actual (now simpler/smaller) change in a second PR.
- Change everything all at once. This creates a large PR.
- Split it up into multiple small PRs. Now your individual PRs don't compile, and make less sense on their own.
- Create a new class, and then split up multiple PRs that transition code to use the new class, and then finally a PR to remove the old class. This is more work for both the author and the reviewer, and the individual PRs are harder to understand on their own.
Does seem like you could namespace classes in to versions. Then it's much clearer which version of a class a caller is using.
Of course, your mileage may vary; this technique is certainly not suitable for all breaking changes or all workflows.
Is it possible to break them up? Sure. Is it better to do so? I don't think so.
Also, for clarity, neither myself nor OP ever said it couldn't be done.
A group of features that only combined produce a measurable output, but each one does not work without the others.
A feature that will break a lot of things but needs to be merged now, so that everyone has time to work on fixing the actual problems before deadline X; it is constantly conflicting every day, and we need to spend time fixing the actual issues, not fixing conflicts.
Real example, we do PR reviews because they're required for our audit and I'm of the opinion that they're mostly theater. It's vanishingly rare that someone actually wants a review rather than hitting the approve button and will call it out specifically if they do. Cool. This means you can't count on code review to catch problems, discipline doesn't scale after all. So instead we rely on a suite of end to end tests that are developed independently by our QA team.
Obviously it has to be a pure refactor, entirely isolated from functional changes but there are plenty of similar cases where doing it once is the least effort.
Focus on customer outcomes, and keep main clean.
>Don’t waste your time writing stuff for no one.
I've thought about that as I continue to write them. I think I can justify it by saying they are mostly for me: can I describe what I'm trying to do with a specific push in a few items? It lets me reflect on whether I'm waiting too long between commits, or whether my ideas are getting too spread apart and really should be in two different branches that each have their own PRs. Then there is the rare case on a slower project where an item gets deprioritized and I come back to it weeks or even months later. Having the messages helps me catch back up to speed.
As such, I find the 20 seconds or so to type out 1 to 2 sentences to be worthwhile, even if the ones reviewing the eventual PR never check. I'm also not above throwing in a "ditto" or "fixed issue" when a single commit really is that small or insignificant.
>“every commit must compile”
I agree with your take this is overzealous, but to expand upon my previous point, if I know a commit on a branch won't compile (say just had something else come up and need to swap focus for a few days), then I'll try to make sure I call that out in my last message just in case anyone else happens to get put on the project.
If I were to summarize my approach, treat PR messages seriously, but treat branch commit messages like sticky notes that will likely end up in the trash by week's end.
I clean my history so that intermediate commits make sense. Nobody reads these messages in a pull request, but when I run git blame on a bug six months later I want the commit message to tell me something other than "stopping for lunch".
> pedantically apply DRY to every situation or forcing others to TDD basic app
Sure, pedantically doing or forcing anything is bad, but in my experience, copy-paste coding with long methods and a lack of good testing is a far more common problem.
You may be 100% correct in your particular case, but in general if senior devs are complaining that your code is sloppy and under-tested, maybe they aren't just being pedantic.
This is a false dichotomy and an unproductive thing to focus on.
Experienced engineers know when to make an abstraction and when not to. It is based on knowledge of the project.
Abstract well and don't just compress. Easily said; good engineers know how to actually do it.
I don't always check if commits are buildable, but the PR should be, because that is what gets merged to master, and the tip of master should be buildable.
I do! I find it the easiest way to review code when the author has taken the time to structure it in that way. I'm lucky to work with some great people.
I do. Especially if the author is competent.
That said, empirically, you're correct most people don't.
However, that said, I think changing the culture rather than throwing away the practice would be a better response.
Reading and reviewing clean history is really so much nicer. I'd also argue that actually making your history clean (as opposed to theatrically and thoughtlessly making small commits, say) forces you as the author to review it more carefully.
A few exceptions:
1. When refactoring often your PR is "do an enormous search and replace, and then fix some stuff manually". In that case it's way easier to review if the mechanical stuff is in a separate commit.
2. Similarly when renaming and editing files, Git tracks it better if you do it in two commits.
3. Sometimes you genuinely have a big branch that's lasted months and has been worked on by many people and it's worth preserving history.
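Exception 2 can be seen concretely: doing the rename in its own commit keeps it detectable as a pure rename, and `git log --follow` then tracks history cleanly across it. A throwaway-repo sketch (file names invented):

```shell
set -e
cd "$(mktemp -d)" && git init -q
git config user.email demo@example.com
git config user.name demo
printf 'line1\nline2\n' > old_name.txt && git add . && git commit -qm "base"
# Commit 1: the pure rename, trivially reviewable:
git mv old_name.txt new_name.txt
git commit -qm "rename old_name.txt to new_name.txt"
# Commit 2: the actual edit:
printf 'line1\nline2 edited\n' > new_name.txt
git commit -qam "edit new_name.txt"
# History is followable across the rename:
git log --follow --format=%s -- new_name.txt
```

Had the rename and the edit landed in one commit, git's rename detection would have to rely on content similarity heuristics, which degrade as the edit grows.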
Also I really really wish GitHub had proper support for stacked PRs.
Another instance is a build system rewrite. There was a (short) story of the new system itself and then a commit per module on top of that. It landed as 300+ commits in a single PR. And it got rebased 2-3 times a week to try and keep up as more bits were migrated (and new infra added for things other bits needed). Partial landing would have been useless and "rewrite the build system" would have been utter hell for both me developing and anyone that tries to blame across it if it hadn't been split up at least that much.
Basically, as with many things in software development, there are no black-and-white answers here.
A perfect illustration of a backwards mindset. If this made sense, then the standard, least-common-denominator PR tool would work better with many small PRs, which here also means PRs that can depend on each other (separate PRs!!!). So does it?
> Also I really really wish GitHub had proper support for stacked PRs.
No. It doesn’t even support it.
So how does this make sense? This culture of people wanting “one PR” for each change, and then standard PR tool that everyone knows of doesn’t even support it? What’s the allegiance even to, here? Phabricator or whatever the “stacked” tools are?
It’s impressive that Git forge culture has managed to obfuscate the actual units of change so much that heavyweight PRs have become the obvious—they should be separate PRs!—unit of change... when they don’t even support one-change-then-another-one.
Gitlab kind of supports it - if your second PR's target branch is the first PR then it will only show you the code from the second PR and it will automatically update the target branch to master when the first one gets merged. I wouldn't say it's first class support though.
Sapling sort of has support for making it work on GitHub: https://sapling-scm.com/docs/addons/reviewstack/
And there was some forge that supports Jujutsu that has proper first class support, but I can't find it now.
Anyway it's a very useful workflow that lots of people want and kind of insane that it isn't well supported by GitHub.
To be fair I can't remember the last time GitHub introduced any really new features. It's basically in maintenance mode.
I think that whether clean history has a point really depends on how deep your refinement sessions go. And perhaps a bit on the general health of your codebase.
If you don't do refinement with your editors open and grind tickets into dust, there will be side-changes adjacent to each PR which are not directly related to the ticket. These are better off in their own commit (with their own commit message).
Yes, we griped that GitHub would not allow us to merge individual commits, but if it was ever urgent or helpful to do so, we cherry-picked a commit into a separate PR.
Everyone's workflow is a bit different, and it can be hard to redirect organizational inertia. But without a doubt, reading a clean commit history is a pleasure.
You can have both with git and it's not even hard. Unfortunately it seems many people pride themselves in what little they know of git. I'm not being sarcastic, I've read people say this almost word-for-word.
commits mean precisely what their author intend them to mean, nothing more
if you squash-merge every PR then history is clean where it matters
If I want that granularity, I'd go read the original PR and the discussion that took place.
I'm cool with other reasonable approaches though, but I'm pretty over pointless hoops because someone says so.
Together with backing up your work.
Sure you can keep amending your last commit but whenever you detour to another problem in the same PR that turns into a mess.
Easier to just treat the PR as the atomic unit of work and squash away all that intermediate noise.
It also ensures that CI will pass on every commit on the main branch.
This is why commits are often noise. If people are using commits well, they tell a story. The fact that people often use the tool wrong certainly invites some criticism of the tool, but when used correctly, commits are worth looking at one by one.
What do you consider correct usage of git, and why? In this very discussion, I can see at least two distinct purposes that, more often than not, are mutually exclusive:
- To "tell a story" for other people
- To checkpoint units of work as individual perceives them, helping them deal with interruptions (which include running out of work day).
Storytelling is a skill in itself, doing it is a distinct kind of extra work, so you can't really have people use git for both at the same time. Which is where the whole commit history management idea comes from - it's to separate the two into distinct phases; first you commit for yourself, then you rework it to tell a story for others.
Sometimes I go down a dead end, reverse out, and leave a comment about why a different approach would be a dead end. I (and others) don't need a record of the work I did on that path, just the synthesis (an explanatory comment)
There are multiple systems for structuring commits, but the commit message body content approximates to the same in all of them. The classic advice is https://tbaggery.com/2008/04/19/a-note-about-git-commit-mess... , but I find https://www.conventionalcommits.org/en/v1.0.0/ useful for looking at the oneline log
To address this point:
> To checkpoint units of work as individual perceives them, helping them deal with interruptions (which include running out of work day).
Yes, commits can be used like this! But once you have a chunk of work ready for review, cleaning up the commit log/history, grouping related changes, and describing them is useful for maintaining the software.
I don't like squash merges personally, though they have their merits. But regardless, I would copy the commit subject/body content into the PR message, which then puts everything into the PR commit also, so technically the granular commits are less relevant when one merges, but occasionally are still useful to refer to
But that's my point exactly. Unless you're exceptionally clear thinker, a story that's natural for you is not very good for anyone else. Your story is optimized for an audience of 1, developed interactively, and meant to help you in the now. The story for the team is meant to help them orient themselves after the fact. Turning one into the other is its own kind of work.
But then different people and teams have different ways of working. VC isn't the whole world. In some projects, I'd make "team story" commits directly, because I used a separate text file to note down my thoughts, and used that to keep me on track. So it's a different way of solving this problem.
Are you so good that you can just one-shot a book from start to finish without any mistakes?
For me commits to a PR are more like me going "ok, this step is stable enough" and continuing with the next one. The code might not compile or be valid or even clean, but it lets me focus on the next step without having 42 staged files cluttering up my brain.
That is the perfect story for the final merge commit or the PR when this nicely crafted story is squashed into nothingness and merged.
So, I'll often just make lots of changes and then commit them all at once with some vague commit message and that's that. Nobody cares. If I want to tell people why the code is the way it is I'll just add a comment.
This is how I work. I have tried to be more disciplined with commits and stuff like that, but I find that it just slows me down and makes the work feel more difficult. I also frequently just forget, and find myself having made lots of changes without any commits, so then I have to retroactively split it up into commits, which can be difficult too.

So I'd rather just not worry about it, focus on getting good work done, and move on rather than obsess over a git history that's unlikely to ever be read by anyone. I realize that's a self-fulfilling prophecy, in that it'd be more likely to be read if it was useful and well done, but it's not just me. If I was in a team where everyone did it really well, I'd try to keep my own work up to par. But usually I'm the one who cares most about how we do things, and this just doesn't seem important to me.
Honest question: why do you even use version control? What do you get out of it?
Based on your workflow, you could just as well not use it at all, and create zip files and multiple copies of files with names like `_final3_working_20250925`.
Change history is the entire point of version control. It gives you the ability to revert to a specific point in time, to branch off and work on experiments, to track down the source of issues, and, perhaps most useful of all, to see why a specific change was done.
A commit gives you the ability to add metadata to a change that would be out of place in the code itself. This is often the best place to describe why the change was done, and include any background or pertinent information that can help future developers—including yourself—in many ways. Adding this to the code base in a comment or another document would be out of place, and difficult to manage and discover.
You may rarely need to use these abilities, but when you do, they are invaluable IME. And if you don't have them at that point, you'll be kicking yourself for not leveraging a VCS to its full potential.
> I have tried to be more disciplined with commits and stuff like that but I find that it just slows me down and makes the work feel more difficult.
Of course it slows you down. Taking care of development history requires time and effort, but the ROI of doing that is many times greater.
I encourage you to try to be disciplined for a while, and see how you feel about it. I use conventional commits and create atomic commits with descriptive messages even on personal projects that nobody else will work on. Mainly because it gives me the chance to reflect on changes, and include information that will be useful to my future self, months or years from now, after I inevitably abandon the project and eventually get back to it.
Here is an example from a project I was working on recently[1]. It's practically a blog post, but it felt good to air my grievances. :)
[1]: https://github.com/hackfixme/sesame/commit/10cd8597559b5f478...
Writing a story no one will ever see is not one of them. Write real docs and your PM, QA, SMEs will benefit as well, not only developers who bother to dig thru the history.
Saving progress is useless if your history is a mess and you have no idea what a previous state contains.
> backup files, communicate with others.
You do know that there are better tools than a VCS specifically built for these use cases, right?
> You know, the main benefits of version control?
No, I don't think you understand what version control is for.
You can use a knife to open a wine bottle, but that doesn't mean it's a good idea.
> Writing a story no one will ever see is not one of them.
You won't. I definitely will, and I take the liberty to be as verbose as I need in personal projects.
> Write real docs and your PM, QA, SMEs will benefit as well, not only developers who bother to dig thru the history.
You should write "real docs", but that's not what commit messages are for. They're not meant to be read by non-developers either. And developers don't have to "dig thru the history" to see them. Commits are easily referenced and accessible.
1. allow collaboration
2. Have branching and merging
3. Have diffs between two points in time/branches/tags
4. Allow release tagging
that is enough to work with. That's not to say a coherent git history isn't great, but to call it the main point is something else. That is definitely not how a lot of teams use git or any version control.
Nope, it works, commit! Tests pass, commit! Push.
You’ve demonstrated you don’t know what version control is for. Cleaning up the past is a peripheral nicety, that is not at all core.
In fact some situations prefer history not be changed at all.
> Honest question: why do you even use version control? What do you get out of it?

> Change history is the entire point of version control. It gives you the ability to revert to a specific point in time, to branch off and work on experiments, to track down the source of issues, and, perhaps most useful of all, to see why a specific change was done.
You answered your own question here. I get pretty much all of that stuff. Maybe you get some of that stuff a bit better than I do but I don't really think there's much of a difference. I can still go back in time and make changes etc, I can't necessarily revert every specific small change ever made using git alone but I can easily just make that change as a new commit instead - which is probably faster than scanning through a million commit messages trying to find the one I want to revert anyway.
I can go back to some arbitrary point in time just like you can. My resolution may not be as fine as someone who makes more commits but so what? Being able to go back to an arbitrary day and hour would be timetravel enough for me, I don't need to be able to choose the specific second.
Just to be clear I do make commits and I do try to write descriptive messages on them - I just also try to avoid spending more than a few seconds deciding what to write. That commit you just showed is larger than most of mine. That's a whole PR for me, which is pretty much what I said initally: I'll just do the whole task and commit it all in one go - what I'm not doing is splitting it up into 50 individual commits like some people would want.
I think the primary difference between the two of us is that you write huge commit messages and I don't, aside from that our commits seem very similar to me.
There are many benefits of doing this. When following the output of `blame`, you can see exactly why a change was made, down to the line or statement level. This is very helpful for newcomers to the codebase, and yourself in a few months time. It's invaluable for `bisect` and locating the precise change that introduced an issue. It's very useful for cherry picking specific changes across branches, or easily reverting them, as you mention. It makes it easier to write descriptive changelogs, especially if you also use conventional commits, which nowadays can be automated with LLMs. And so on.
Most of these tasks are very difficult or impossible if you don't have a clean history. Yes, they require discipline, time, and effort to do correctly, but it saves you and the team so much time and effort in the long run.
Ultimately, it's up to each person or team to use a VCS in whatever way they're most comfortable with. But ignoring or outright rejecting certain practices that can make your life as a developer easier and more productive, even though they require more time and effort upfront, is a very short-sighted mentality.
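As one concrete payoff of small, always-building commits: `git bisect run` can locate the offending change automatically, with no human in the loop. A self-contained sketch, where the "bug" is just a marker string that a test command greps for:

```shell
set -e
cd "$(mktemp -d)" && git init -q
git config user.email demo@example.com
git config user.name demo
# Five commits; the regression lands in "change 3".
for i in 1 2 3 4 5; do
  if [ "$i" -lt 3 ]; then marker=ok; else marker=bug; fi
  printf 'version %s\nstatus %s\n' "$i" "$marker" > app.txt
  git add . && git commit -qm "change $i"
done
# bad = HEAD, good = the root commit:
git bisect start HEAD "$(git rev-list --max-parents=0 HEAD)"
# Any command that exits non-zero on a broken checkout works here:
git bisect run sh -c '! grep -q bug app.txt' > bisect.log 2>&1
git show -s --format=%s refs/bisect/bad   # -> change 3
```

The finer-grained and more buildable the history, the smaller the culprit bisect hands you.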
> I think the primary difference between the two of us is that you write huge commit messages and I don't
The commit I linked to is an outlier, and if you see my other commit messages, most are a few sentences long. It's not about writing a lot, but about describing the change: what led to it, why it was implemented in a specific way, mention any contextual information, trade-offs, external links, etc. In that particular case it was a major feature (indicated by the exclamation point in the subject) that changed large parts of the code base, so it deserved a bit more context. I was also feeling a particular way and used the opportunity to vent, which isn't a good place for it, but since this is a personal project, I don't mind it. Although how the programmer felt while writing the code is also relevant contextual information, and I would gladly read it from someone else in order to understand their state of mind and point of view better.
Also, these days with LLMs you can quickly summarize large amounts of text, so there's no harm in writing a lot, but you can never add context that doesn't exist in the first place.
The number of times I look at a change and all I can think is "why did they/I do that" is very very often. Having the answer to that question available saves re learning the lesson that led to the change.
Say you're on a development branch and you added something new, that the Project thinks can be and should be added to the Main branch. By having that addition in its own self-contained commit allows the Project to create a new branch, cherrypick the commit, and merge the branch to Main, without having to pull in the rest of the development branch.
It's of course not really necessary if you're the only person doing all the development, but it's just a good etiquette if you're working with other people in the Project.
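A minimal sketch of that cherry-pick flow (branch and file names invented; `-x` records the source commit in the message body):

```shell
set -e
cd "$(mktemp -d)" && git init -q
git config user.email demo@example.com
git config user.name demo
echo base > base.txt && git add . && git commit -qm "base"
git branch -M main
git checkout -qb dev
echo wip > wip.txt && git add . && git commit -qm "dev: half-finished work"
echo fix > fix.txt && git add . && git commit -qm "fix: self-contained bug fix"
fix_sha=$(git rev-parse HEAD)
# Bring only the fix to main, leaving the rest of dev behind:
git checkout -q main
git cherry-pick -x "$fix_sha" > /dev/null
git log --format=%s
```

This only stays clean if the fix commit really is self-contained; if it silently depends on earlier dev-branch changes, the cherry-pick will conflict or, worse, apply but misbehave.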
When I'm creating a PR in a feature branch, it's my playground. And the PR will be squashed and merged as one clean commit with a clean message.
Nobody wants to see my 50 "argh, forgot this bit" commits, they bring zero value.
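For what it's worth, you don't even need the forge's squash-merge button for that cleanup; it can be done locally before pushing. A toy sketch in a throwaway repo (branch, file, and message names are all invented):

```shell
# Collapse a pile of "argh, forgot this bit" commits into one clean commit.
# Everything runs in a throwaway repo; all names are made up.
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q -b main
git config user.email dev@example.com
git config user.name dev
git commit -q --allow-empty -m "initial"
git checkout -qb feature
for i in 1 2 3; do
  echo "$i" >> work.txt
  git add work.txt
  git commit -qm "argh, forgot this bit ($i)"
done
# Keep the changes, drop the noisy commit boundaries:
git reset -q --soft main
git commit -qm "Add the actual feature"
git log --oneline   # now just "Add the actual feature" on top of "initial"
```

The `reset --soft` leaves the combined diff staged, so the final commit contains exactly what the WIP commits added, under one clean message.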
And, no, PRs are not necessarily an atomic level of work. While they should contain a single feature, fix, etc., sometimes that work can span multiple commits.
If the PR includes superfluous commits, then they should be squashed into the appropriate commit. Squashing the entire PR when it includes multiple changes is simply a bad practice. It's bad because you lose all the history of how the overall change was done, which will be useful in the future when you need to do a blame, cherry pick, bisect, etc.
It's surprising to me how many developers misunderstand the value of atomic commits, or even what they are. And at the same time, it's exhausting having this discussion every time that happens, especially if there is continued pushback.
I am not against people having their preferred way of using VCS tools. As long as it works for their team, that's fine. But there are certain best practices that simply help everyone in the long-term, including the author, that I'm baffled whenever they're willfully ignored. I can't help but think that it's often done out of laziness, and lack of discipline and care into the work they do, which somehow becomes part of their persona as they gain more experience.
That's what the description field is for. I never, ever inspect the "commits" tab in a PR unless I see some ludicrous number on it. And even then it's just to see what the heck happened.
> If the PR includes superfluous commits, then they should be squashed into the appropriate commit.
This happens on merge if your Github is set up correctly.
> Squashing the entire PR when it includes multiple changes is simply a bad practice.
The bad practice is the PR changing multiple distinct things.
> It's bad because you lose all the history of how the overall change was done, which will be useful in the future when you need to do a blame, cherry pick, bisect, etc.
It's not.
No. The PR description is for describing the overall change, which, again, may include multiple commits. The description can also include testing instructions, reviewing suggestions, and other information which is not suitable for a commit message.
PR descriptions can be edited and updated during the review, which can be helpful. A commit message is immutable, and remains as a historical artifact.
Also, when I'm working on a code base, the last thing I want is to go hunting for PRs to get context about a specific change. The commit message should have all the information I need directly in the repo.
> I never, ever inspect the "commits" tab in a PR unless I see some ludicrous number on it.
And... you're actually proud of this? Amazing.
Have you ever read a descriptive commit message? Do you even know what they look like?
I'm taken aback by the idea that there are developers who would take the time and effort to write a detailed commit message, only for others to not only never read it, but to be proud of that fact. Disgraceful.
> This happens on merge if your Github is set up correctly.
No. This is what I mean about developers not understanding what atomic commits even are. There are commits that will be done during a review, or as ad-hoc fixes, which indeed shouldn't exist when the PR is merged. But this doesn't mean that the entire PR should be squashed into a single commit.
Those useless commits should instead be squashed into the most relevant commit, which is straightforward if you create `--fixup` commits which can then be automatically squashed with `rebase --autosquash`.
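To illustrate, here's a toy run of that fixup flow in a throwaway repo (all file and message names invented):

```shell
# Record review feedback as a fixup of the commit it belongs to,
# then let --autosquash fold it back in. Throwaway repo, invented names.
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q -b main
git config user.email dev@example.com
git config user.name dev
git commit -q --allow-empty -m "initial"
git checkout -qb feature
echo "v1" > parser.txt
git add parser.txt && git commit -qm "Add parser"
echo "docs" > docs.txt
git add docs.txt && git commit -qm "Document parser"
# Review feedback touches the parser commit; record it as a fixup:
echo "v2" > parser.txt
git add parser.txt && git commit -q --fixup=":/Add parser"
# Fold every fixup into its target, non-interactively:
GIT_SEQUENCE_EDITOR=: git rebase -q -i --autosquash main
git log --format=%s   # "Document parser", "Add parser", "initial" -- no fixup! left
```

The `:/Add parser` revision syntax finds the youngest commit whose message matches, so you don't have to copy hashes around while addressing review comments.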
But the PR may ultimately end up with multiple atomic commits, and squashing them all into a single commit would nullify the hard work the author did to keep them atomic in the first place.
If you configure GitHub to always squash PRs, or to always create a merge commit, or to always rebase, you're doing it wrong. Instead, these are decisions that should be made on a case-by-case basis for each PR. There are situations when either one of them is the best approach.
> The bad practice is the PR changing multiple distinct things.
Right. I'm sure you enjoy the overhead of dealing with a flood of small PRs that are all related to a single change, when all of it could be done in a single PR with multiple commits. This is easier to review, discuss, and merge as a single unit, rather than have it spread out over multiple PRs because of a strict "one PR-one commit" policy.
All that rule does, especially if you have PR squashing enabled by default, is create a history of bloated commits with thousands of lines of unrelated changes, that are practically useless for cherry picking, bisecting, and determining why a specific change was done, which is the entire point of commits. Good luck working on that codebase.
> It's not.
k.
If you want to communicate with others, write proper docs in a format that won't be lost to time, and are accessible to everyone, not merely investigative developers.
Everything I said has direct benefits for the team, and hence for the company.
> If you want to communicate with others, write proper docs in a format that won't be lost to time
You have a severe misunderstanding of what commit messages are for. They're meant to describe changes that can be used as historical reference by developers. They're not meant to be read by non-developers, serve as replacement for "proper docs", or for general communication.
A VCS history is by definition never "lost to time". It is an immutable record of the development process of the project. If you don't find that useful, choose not to use it to its full potential, and strangely relish in that fact, you might as well use another tool.
Your posts above are dripping in it.
Docs are available to everyone, accessibility in action. You have a severe misunderstanding of what communication is.
There’s no important developer information that should be explicitly and effectively hidden from others. There’s not even a proper search facility; you have to browse with a lot of background knowledge until you hopefully find something. Newer members won’t have this knowledge.
Code changes, requirements change, often. Info becomes obsolete rather quickly. Projects may last decades. By definition, historical assumptions are inferior. There’s already a mechanical commit record as well.
So yes, any important information buried there is going to be lost to time, and writing it is therefore a waste of time.
The way it appears to me, if there's multiple commits submitted as separate PRs, then maybe the PR wasn't so atomic to begin with.
Until your company switches code repos multiple times and all the PR history is gone or hard/impossible to track down.
I will say, I don't usually make people clean up their commits and also usually recommend squashing PRs for any teams that aren't comfortable with `git`. When people do take the time to make a sensible commit history (when a PR warrants more than one commit) it makes looking back through their code history to understand what was going on 1000% easier. It also forces people to actually look over all of their changes, which is something I find a lot of people don't bother to do and their code quality suffers a lot as a result.
Bisecting on squashed commits is not usually helpful. You can still narrow down to the offending commit that introduced the error but… it’s somewhere in +1200/-325 lines of change. Good luck.
How is this a mistake?
The atomic level of work should be a single, logically coherent change to the codebase. It's not managerial, it's explanatory.
As you work things naturally arise. Over here a reformatted file, over there comments to clarify an old function that confused you, to help the next developer who encounters it. Cleaning, preparatory refactoring that is properly viewed as a separate action, and so on. Each of these is a distinct "operation" on the codebase, and should be reviewed in isolation, as a commit.
Some of these operations have nothing to do with the new feature you're adding. And yet creating separate PRs for each of them would be onerous to your reviewers and spammy. Clean, atomic history lets you work naturally while still telling a clear story about how the code changed, both for reviewers and future developers.
My personal idea is that the code in each PR should be well documented enough for a review, but also for when people join the team and need to learn. Or when a poor soul needs to check on your code while you are out on holiday. This personal rule does not apply to all projects, but for bread-and-butter stuff I tend to go by it and not care about clean commit histories. The cost to reward seems way better.
Sure, and we must make that correspond to the atomic unit that our collaboration tools provide us for reviewing and merging. In GitHub and similar git forges, that's a PR, not a commit. A string of atomic changes should be represented as a series of PRs, not a series of commits in one PR, because GitHub isn't designed to review and merge individual commits.
The "atomic commits" crowd are (in my opinion) coming up with best practices for the tools they wish they had and working against the grain of the tools we actually use.
There is a "commits" tab and next button to quickly go through commits on every PR. It's very easy to use.
All that you mean is most people ignore it.
I think a workflow like this for atomic commits would be nice. tangled.sh supports it for jujutsu¹, and it looks really neat. But the existing code review interface is clearly designed for code review to take place at the MR level.
I do know that when I was using GH regularly on a team where a number of people wrote clean history, the problems you mentioned didn't come up, not that I can recall. So for the 90% case, let's say, you can do clean history on GH and get the majority of its benefits. But yes, I'm sure it's flawed especially in workflows where those types of problem arise often.
You find that people who aren't able to craft clean commits and PRs usually thrive in environments where people either work mostly alone or where cooperation is enforced by external circumstances (like being on the same team in a company). As soon as developers are free to choose whom to associate with and whose code they accept, rules are usually made and enforced.
That's not the situation in a normal corporate environment. You want to reduce total time expended (or total cost, at least). It's going to be cheaper to just have a chat with your coworker when a PR is confusing.
Could you explain this a bit more? I'm having trouble visualizing the end to end process.
1. Someone has what they feel is a complete change and submits a PR for review.
2. The reviewers read part of it, first half looks good, and halfway through they have concerns and request changes.
3. The submitter now has to fix those concerns. If they are not allowed to push an additional commit to do this, how do you propose they accomplish this? Certainly they should not force push a public branch, right? That would cause pain to any reviewer who fetched their changes, and also break the features on GitHub such as them marking which files they have already read in the UI. But if we cannot push new commits and we cannot edit the existing commits, what is the method you are suggesting?
The first time you push, you should have squashed/rebased your changes into a structure that makes sense. Atomic commits are best. It could even be a single commit. Sometimes it makes sense to have multiple commits, e.g.:
- introducing a new API
- moving other code to use the new API
- deleting the old API
This could also be a single commit. That is really up to you/your team.
And yes, you rebase/squash and force push new commits. Every team I had in the past 12 years routinely used force-push for PR iterations.
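For the record, `--force-with-lease` makes those routine force-pushes much safer than plain `--force`: the push is rejected if someone else updated the branch since you last fetched. A toy round-trip, with a local bare repo standing in for the forge (all names invented):

```shell
# Iterate on a PR branch with amend + force-with-lease.
# A local bare repo plays the role of the remote; all names are made up.
set -e
work=$(mktemp -d)
git init -q --bare -b main "$work/origin.git"
git clone -q "$work/origin.git" "$work/clone"
cd "$work/clone"
git config user.email dev@example.com
git config user.name dev
git commit -q --allow-empty -m "initial"
git push -q origin HEAD:main
git checkout -qb feature
echo draft > f.txt && git add f.txt && git commit -qm "WIP"
git push -q -u origin feature                  # first round of review
# Address feedback by rewriting the commit, then force-push safely:
echo final > f.txt && git add f.txt && git commit -q --amend -m "Add feature"
git push -q --force-with-lease origin feature  # rejected if the remote moved meanwhile
```

The lease is checked against your remote-tracking ref, so as long as you fetch before rewriting, you can't accidentally stomp on a colleague's push.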
Turns out, when writing production code, other people rarely check out and work simultaneously on other people's half-finished branches. It is and should be very, very rare. Very occasionally it happens that someone bases their work on another developer's branch. In those cases, people just carefully rebase their branch on "origin/other-branch" after a fetch. You can't rely on people not force-pushing anyway. Even if you agreed on it, sometimes it needs to be done (e.g. a very large binary file was committed by accident). So you need to work in a way that assumes somebody might have force-pushed their private branch.
Multiple people working on the same branch without a PR process is indeed messy, and you should never force-push when you do that. The key here is to avoid working with multiple people on the same branch in the first place. I've seen this happen only when:
- Work items are too big and not broken down enough (branches are actively developed for several weeks/months). Usually an indication of a lack of architecture and product leadership. If you do this, you have lots of other interesting problems as well. You are really prototyping but pretending you aren't.
- You are consciously experimenting and prototyping. Make whatever mess you want, in code and history. You are going to iterate so much and so messily that whatever you produce can't be a product. Figure out what you want and need to do, and start with a clean implementation afterwards. And maybe delete that messy branch eventually.
So, we have two modes:
- Prototyping: you are allowed to make a mess because you throw it away anyway. No one, including you, cares much what your code and repo history look like.
- Production: you write code and repo history for eternity. You do it right for the sake of everybody's sanity.
You would not allow those commits. Code review improvements should appear as fixup commits which should be autosquashed on merge. It is a shame that GitHub does not support autosquash though.
People are supposed to rebase all that noise away. Changes are supposed to be structured as sensible chunks that build up to the desired feature. It's like showing your work in a math exercise: you don't write out the final answer with no explanation, you demonstrate step by step how you reached it.
Here's my approach, of course from experience limited to my (past) workplace. We have the usual CI setup, where each merged PR triggers a build followed by a deploy to staging.
This means that what goes in a PR is decided by what sub-functionality of the feature at hand has to be tested first[0], whereas what goes in commits is decided by what is easy to read for reviewers for PRs where such an approach makes sense [1], or it simply doesn't matter much like you said, for a lot of other cases.
That is the way I like to think about it.
I know I know git bisect etc... but IME in the rare cases we used it we just ran bisect on the master branch which had squashed PR level commits, and once we found the PR, it was fairly straightforward to manually narrow it down after that.
In more systems level projects there will actually be clear layers for different parts of your code (let's be honest, business logic apps are not structured that well, especially as time goes) and the individual-commits-should-work approach works well.
[0] ideally, the whole feature is one PR and you config-gate each behaviour to test individually, but that's not always possible.
[1] for example, let's say we have a kafka producer and consumer in the same repo. They each call a myriad of internal business logic functions, and modifications were required there too. It is much easier to read commits separating changes, even to the same function, made as part of the producer flow and the consumer flow.
For me it’s because Feature A may largely be fine, but one of those intermediary commits introduced a regression. I can bisect and isolate an issue much more easily if I have the full history to step through, as opposed to “this big commit introduced a one-line regression _somewhere_ in a 900 line commit”.
I often do. In a larger PR, or in one where it's hard to tell what is being accomplished, the commits can tell a story of the engineer's journey to a solution, like this article articulates. Even if I review a commit that is largely undone by future commits, that piece of history is often key to my understanding.
I think it's fine to have a whole bunch of "WIP" commit messages on intermediate commits while the PR is in a draft stage, but then all of those garbage commits should really be squashed down into one commit and you should at least write a one liner that describes what the whole change is doing. I think it does materially make repo history harder to understand to merge in PR's with 10 garbage commits in them.
(With few exceptions,) I generally follow this practice; BUT, I think enforcing this on other developers feels like micromanagement. That being said, with few exceptions, committing code that doesn't compile feels like an incomplete sentence.
(Sometimes on massive refactors I make commits that don't compile. It gives me a place to roll back to. If someone thinks this is poor practice, then I think they're putting principles in place of practicality.)
A _branch_ is a unit of work that should be merged when done.
As the owner of a branch, an engineer has the ability to move into intermediate states. The larger the codebase, the larger the possibility of something unexpected breaking or not compiling. Just like editing a large body of text - you will have "incomplete sentences" through the process. It's part of writing. Expecting others to write their drafts the same way you like is just silly - it's putting rigid principles ahead of anything else that matters.
When reviewing a conglomerate commit in a PR, I have to reverse engineer how the different changes interact to figure out the intent. I then have to do this on each update they make. Contrast that to when someone breaks up their commits where I can zoom through variable renames, extracting functions, etc to see the one line that change that all of that unblocked that makes the difference. Then if updates are pushed, I only have to worry about the commits that were updated.
As for all commits compiling, that is helpful to review the individual commits.
Both of these (small commits, all compiling) are also great for bisecting. You get pointed to a very small change that you can more easily analyze vs dealing with breakages or having to analyze a large change to find what the problem is.
> - smaller PRs aren’t necessarily easier to review (and this arbitrary obsession almost always leads to PR overload in chunks that don’t make any sense, reducing code quality as a result)
Oh but they sure can be reviewed more easily, because they are shorter? Doing so feels like less effort, and you get a dopamine hit from hitting that "submit review" button faster/more often (improved morale, and PR turnaround time!). Plus, if there's a longer discussion about X, it's great if it's not tangled up with Y and Z at the same time - allowing you to dig into X.
> - nobody reads intermediate commit messages one by one on a PR, period.
Come on, that's intellectually dishonest.
1. VSCode displays commit messages inline as blame for me (and many of my colleagues), so even when we don't read the commit messages one by one _on a PR_, I often read them later in the IDE (we don't squash-merge PRs). I spend significantly more time reading code than writing it, and commit messages, PR descriptions, and linked issues provide extra context that is useful to me, especially for complex code. If those messages were entirely unreadable, I'd be annoyed.
2. When someone invests time into telling a good story commit by commit, in my team they write "Review commit-by-commit is encouraged" in the PR description, to tell the reviewers that yes, they should read the individual commits, as that'll make understanding the PR easier. As a reviewer, I often follow that suggestion.
> Wasting your time with this in a branch, as you work towards a solution, is focusing on the wrong thing
It seems you're conflating "working on a feature" with "presenting it as a PR to review". Those are two very different things, and Edamagit in VSCode makes it so, so easy to provide a reasonable commit history that hides some of your missteps, and to fill in commit messages.
You need to be careful with every single commit message, every commit must compile, etc., in your case. My comments apply if you squash-merge, in which case all that commit-level care is unnecessary, since intermediate commits go away on merge. You're probably making your life harder for no reason by avoiding squash-merge, but that's just my opinion.
In my part of the world both of these are true, and proudly so. We keep catching a myriad of errors, big and small. The history is easy to read, and helps anyone catching up with how a certain project evolved.
I understand it might not be true for everyone, every team, in every line of business; but this sort of discipline pays off in the quality of both the code _and_ the team members' abilities.
When you have a large PR like this, here's how I like to get it reviewed.
1. Give reviewers some time to become familiar with the PR. They might not understand all parts of it, but they should have at least a cursory understanding of the PR.
2. Have a meeting where the PR is explained in front of the group of reviewers. The reviewers will understand the PR better and they can ask questions in realtime.
3. Let folks review the PR after the meeting in case they spot anything else, or think of additional questions.
Most of the time PR review is done asynchronously, but doing most of the review in the meeting can also be a decent team building exercise.
Hopefully you've been going around and around at a high level communicating back all the problems that you've hit and the design issues that emerged during exploratory surgery.
Then, you definitely want to schedule at least one meeting to go over it. Which can become several meetings, including follow-up meetings with one or two individuals to pound out some specific issue. Depends on the complexity of the nuclear reactor.
I agree that for a common team of programmers working for a single company, the value isn't always there. But that's the easiest and least interesting case... in big distributed projects this stuff really matters.
Very common practice at my old company, and one I continue in my current role.
> “every commit must compile”
sucks ass for anyone else trying to rebase your branch onto the updated main/master when they don't. Once your PR is out of the "working on the feature" phase and into the "getting it merged" phase, do a little `git rebase -i` and squash your really intermediate commits into ones that compile. Ignore this if you have real grown-up CI where your PRs never stay open for more than a day.
A vast majority of the drama that comes out of source control is associated with branches living for far too long.
I've got an internal alarm that starts to go off somewhere around 72 hours. If something takes longer than this, I've probably screwed up in my planning phase. There are some things that do need to sit, but they should be rebased every morning like clockwork. The moment things start to conflict, the PR gets closed and the branch is now a reference for how to do it again when whatever blocker is cleared.
Another way to think about all of this is to pretend like everything you are touching is taking a synchronous lock out (even if it's not), similar to how tools like Perforce behave. So, you generally want to move as quickly as possible to get out from under lock contention. Git allows you to pretend like you aren't conflicting for a really long time, but at some point you must answer for all of this debt (with interest).
Nah, in my experience, if you've got good commit hygiene you can often merge even ancient commits.
Here's a pretty hefty commit I merged five years after it was originally written, converting a ~100k line codebase from GTK to SDL2, written in 2015, committed in 2020, with tons of development in between, with "10 files changed, 777 insertions(+), 804 deletions(-)"
https://github.com/smcameron/space-nerds-in-space/commit/4ab...
I was expecting it to be a bit of a nightmare, but it really wasn't bad at all.
why would anyone else rebase your branches? YOU should rebase your branches.
This is why I gave up reading the article shortly after reaching the point about making a history with commit messages. The comments—even if it is on a Git forum—will just be full of people that either say that it’s a waste of time or that it is literally impossible for this to be practiced by anyone.[1]
Your best bet is to find projects where this is practiced (and you don’t have to look far). But making the case to a general audience? No, too many loud voices that treat version control like “I am committing now because I need to pick up the dry-cleaning” arbitrary/random snapshot-maker.
[1] No one, period? Sounds like a bit of a strict ontological rule to me.
So you're the one breaking git bisect all the time. Grrrr.
Use stgit and make decent commits instead of rolling in the dirt like an animal.
If every commit on the main branch must compile then why wouldn't it also compile in the PR branch? It doesn't make sense to ask people to review, then after that rebase and merge imo.
That isn't really where it came from though. The idea was, if I want an open source maintainer to accept my changes, I make a request to pull them from my branch. Once the open source maintainer has merged it in, they own it. If they don't like it (even one little bit), they can reject it because quality / ownership / maintenance is completely on them.
On a team environment where no one owns anything it is a little less clear what the value is. You want to incentivize the "betterness" of "something" and are using "broadened knowledge" as a proxy for that. Usually this just goes unexamined but really it would be good to establish how broad and deep you want this knowledge to be and work back from there - is the 5 minute PR review the best way to achieve it?
my understanding is that you commit when you are at the "good place", where the part of the code you are working on works. That way when you keep going and find yourself going in a direction that is not right, you can go back to the last good place. If your code doesn't even compile, that doesn't seem like a good place.
Meh, most people won't address it or ask for that dollar. That doesn't mean I didn't read it; I chuckled and moved on.
I do read every commit in a PR chain, and every line. I am not necessarily a super attentive reviewer or anything, but I never accept one without at least formally looking at it.
Why advocate against this anyway? If no one reads them, it harms no one. Just like personal blogs. However, the writing of the blog is the useful act, not the reading.
Ironic that you are accusing TFA's author of being an expert novice. I don't disagree with your take on him / the article, but you are committing the same sin.
If the point is about forcing someone to write commit essays, then yes I did miss it.
Until you need to `git bisect`. Then you'll require that every commit compile, pass tests, etc.; even if that means rebase/squashing to do it.
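Here's a toy demonstration of why that pays off: `git bisect run` can walk the history unattended only because every commit actually "builds" (i.e. the test script can run against it). Throwaway repo, invented contents:

```shell
# Build 10 commits, plant a bug in one, let `git bisect run` find it.
# Throwaway repo; file names and the "test" are invented stand-ins.
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q -b main
git config user.email dev@example.com
git config user.name dev
for i in $(seq 1 10); do
  echo "ok $i" > app.txt
  if [ "$i" -ge 7 ]; then echo broken > bug.txt; fi   # the regression lands at 7
  git add .
  git commit -qm "change $i"
done
git bisect start HEAD HEAD~9                 # bad tip, known-good base
git bisect run sh -c '! test -f bug.txt'     # exit 0 means "this commit is good"
bad=$(git rev-parse refs/bisect/bad)         # first bad commit found by the run
git bisect reset
git log -1 --format=%s "$bad"                # prints "change 7"
```

With atomic commits, the blamed diff is one small change; with squashed PRs, the same run only tells you "it's somewhere in this giant commit".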
Main has clean history and every commit is good.
> every commit must compile
I’m in the opposite camp. Following these two practices often doesn’t make any difference but the few times it did saved me a ton of time.
Dropping commits or rebasing is much easier when you have descriptive, atomic commits. It’s also helpful when performing git blame archeology to try and understand why this code looks so weird and has no context. It’s also useful when bisecting (not so much a problem with small PRs, quite handy as they grow bigger and bigger)
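A toy example of that blame archeology, assuming a descriptive message was written in the first place (throwaway repo; file contents and messages are invented):

```shell
# Dig up the "why" behind a weird-looking value via the commit history.
# Throwaway repo; file contents and messages are made up.
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q -b main
git config user.email dev@example.com
git config user.name dev
printf 'limit = 10\n' > conf.py
git add conf.py && git commit -qm "Initial config"
printf 'limit = 97\n' > conf.py
git add conf.py && git commit -qm "Raise limit to 97: upstream caps batches at 100 minus 3 reserved slots"
# Why 97, of all numbers? Ask the history:
git blame -s conf.py          # attributes the line to the raising commit
git log -S "97" --format=%s   # finds the commit that introduced the value
```

`blame` answers "who last touched this line", while `log -S` (the pickaxe) finds commits that added or removed a string, which survives later reformatting of the line.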
As with everything, it’s about context and circumstances. As you gain experience you can appreciate and gauge when it’s required. When you don’t have the experience, you follow rules so that you gain said experience. That’s how I see it.
Reducing scope and splitting a single task into multiple PRs each small but part of a bigger picture makes it very hard to see the bigger picture.
You should try to make PRs small, but if a PR is big, then you just have to spend more time to review it.
Formatting commits as a story is a huge hurdle for the one making the changes. And unless every PR is meticulously prepared - going over the commits by the reviewer is a waste of time.
I agree you should return PRs you don't understand though. Or don't feel comfortable reviewing for whatever reason.
I've heard people scoff at the $$ cost of "mob programming". I think that view is totally myopic, for appropriate problems there's just no faster nor higher bandwidth way to transfer code knowledge in a group.
Plenty of people dislike pair programming. I don't dislike it, but I do find it mentally intense, tiring. I really, really enjoy that it's an accelerator for getting to done sooner, and not just "I wrote the code" but "the code is correct".
Long way to say don't rely on pull requests when you could be doing pairing for the important stuff.
Yes, everybody would love it if every PR was small enough. In reality that is not a good way to build substantial features.
Often, fully building out a substantial feature causes you to change your mind and completely change your approach the further along you are. You don't want to be muddying up the PR pipeline with a bunch of half-assed changes.
Doing that just makes reviewers less inclined to give good feedback on a PR, because they "know it's going to change so much anyways".
If you are building a substantial feature, it is reasonable that the PR is large and reviewers will have to dedicate substantial time to reviewing it. Reviewing it is work on its own and hopefully your engineers have dedicated time to review substantial features.
Of course, you should make sure your substantial feature is as minimal as possible, for whatever is needed to ship the feaure - but not any less than that.
PR's should generally be the size of a feature, or a meaningful subfeature for large features.
When you arbitrarily split up PR's into something "300 lines" or "5-10 minutes" you can miss the forest for the trees. The little thing looks fine in isolation but doesn't make any sense as part of a larger approach. Different people are reviewing it piecemeal but nobody is reviewing the approach as a whole or making sure the parts fit together right.
And then the idea of "telling a story with commits" feels like a waste of time to me. I have no interest in the order in which you wrote the code, or what you wrote and then rewrote. The code itself needs to be legible. Your final code with its comments should speak for itself. Code is the what and comments are the why.
Now, what I will say is that the more junior the developer, the smaller their commits should be. But that's because they should be assigned smaller features, and have more handholding along the way. And when people are making larger architectural changes, they should be getting signoff on their overall approach from the start -- if you're frequently rejecting the whole approach to a problem in code review, something's going wrong with your communications processes.
Commit #1 adds a helper function for whatever, looks innocent enough, implementation is correct. Believe it or not, it even has tests, lgtm. Then only by commit #8 do you realize this helper function is not needed at all and the entire approach is wrong. Happens every time.
I started reviewing these chains backwards and refuse starting a review until the whole chain is available. That’s however not always easy either, when commit #2-#5 has incrementally refactored everything into something unrecognizable, so that both the left and right side of the diff are wrong! No, I’m not interested in ”this will be fixed 2 commits down the chain”. I just want to review the final state that goes into production, nothing else matters.
Yes, commits should be made small whenever possible and not include unrelated fixes or refactors. Just please, keep them meaningful on their own.
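For chains like this, you don't actually have to choose between per-commit review and the final state: git's three-dot diff shows the net effect of the whole chain against the base branch. A minimal sketch in a throwaway repo (the branch and file names here are invented for illustration):

```shell
# Sketch: reviewing only the net result of a commit chain.
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
g() { git -c user.email=r@example.com -c user.name=reviewer "$@"; }
g commit -q --allow-empty -m "base"
git branch -M main
git checkout -q -b feature
echo "v1" > lib.txt
git add lib.txt && g commit -q -m "commit 1: add helper"
echo "v2" > lib.txt
git add lib.txt && g commit -q -m "commit 2: rework helper"
# The three-dot diff collapses the whole chain into one reviewable diff;
# the intermediate "v1" state from commit 1 never appears in it:
git diff main...feature
```

The same repo still lets you step through individual commits when you need to, so the collapsed view costs nothing.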
Honestly I think it would be far more effective to just review a paragraph and maybe a diagram that explains "here's how I think I'm going to tackle this problem" and forget about line-by-line code review entirely. Other than for training juniors, I don't think there's much long term value in "I think you should use an anonymous function here instead of a named function".
The kinds of things that are usually brought up in code review are not what contributes to real long term technical debt, because a function name or code formatting can be changed in an instant, while larger architectural decisions cannot.
The other thing I noticed is that even when an architectural issue is obvious, there's a tendency to not want to change it because so much of the work has already been done in preparing the PR for review. If you point out a flaw in an architectural decision, it's not unusual for the person to reasonably say "I've just put together a chain of 5 PRs and now you're asking me to rewrite everything?"
There are two things that can be involved when a change is submitted: a design change and an implementation change.
Design changes that are significant enough to warrant review should be expressed and discussed in text, not code, in design documents of some kind. These tend to be quite general, but they establish agreement about general structure, which tends to be the most difficult to change later on, as it circumscribes and guides the implementation that follows. Writing is also a way of working out ideas. The very act of having to explain something clearly to someone else forces you to confront your own ignorance and the consequences of your proposal. Besides, it's the job of the person proposing the design to work out those consequences so that others can verify whether they're true. (1)
Reviewing implementation changes with an understanding of design allows the reviewer to understand how changes relate to the whole as well as the aim. This is an insider's perspective. (2)
Reviewing implementation changes without a good understanding of the context will be limited to general technical remarks, but can descend into excess attention given to style or inconsequential matters of taste. This is the outsider's view. (3)
The question, I think, that looms in the background is whether familiarizing yourself with the context sufficiently well so that you can judge the PR submission is reasonable for a PR. In many cases, it isn't. It's too time consuming and the context is too big. If we had infinite time, this would be great: having to explain to an outsider what you've done forces you to give a much more thorough account, if your goal is to achieve thorough understanding. It also exposes your thinking to someone who doesn't necessarily share your assumptions. But this can be a Herculean task for anything sufficiently complex. So the criticality of the change must be weighed against the effort needed to explain or learn. Are you making a change to something with high tolerance for error, or a small margin of error?
Two pieces of advice...
Since it is unrealistic to expect an exhaustive verification all the time, focusing more on tests will be more fruitful. You still need context to judge whether they're exhaustive or test the right things, but it's the one place where correctness criteria are expressed in code, apart from type signatures, in a clear enough manner that expectations can be judged. If they aren't clear, you should ask. It's a good locus for discussion.
The second: code doesn't include the rationale or "why" for what it is. It just is. Context goes a long way to help infer reason for the change. This means we should use comments, either in the code or the PR submission itself, to explain changes. If something isn't sufficiently clear, ask.
But the key is prudential judgement. You have to determine how to limit your verification efforts and when to begin accepting on trust for practical reasons.
And do away with your pride. It's only difficult to ask questions if you suffer from pride, and pride is a sure sign of mediocrity. Staying silent is also lying by omission.
Features are everything. Don't give me that bullshit about refactoring and generifying and DRY and TDD.
Yea, this person knows x so blindly approve.
or
Hey jane/john did you remember to check x before this? Yes! Ok, blindly approve.
PRs have no scope and no way to designate scope. What do I want from each reviewer? At some point, the PR becomes superfluous to adequate code review and communication around said review.
Ultimately they can be used to micromanage and gatekeep merges to main. This is the PR at its worst.
Overall I’m not a big fan, it feels like a necessary evil to meet the demands of Big Agile.
1. Why do you want to fragment context? Why not add this to the main file with inline docs?
If you have a good description, then you understand (a) what parts are important to think about and (b) do you agree with the approach and (c) is there anything in the code that doesn't line up with the approach.
This strategy scales up pretty well to even large PRs.
If you don’t understand the change, it’s too large, it contains multiple independent changes that should’ve been separate PRs, anything that doesn’t smell right - send it back.
My expectation when you review my PR is that your ass is on the line just as much as mine if something goes wrong.
PRs aren’t a checkmark exercise that validates you’re not trying to backdoor an exploit into the system. A reviewer that accepts a change is committing themselves to maintain said change going forward.
If you let your kids get a dog and the kids don’t take care of it, you will. There’s no two ways about it.
The storytelling approach through commits is brilliant, but it only works if you solve the human factors too. Even perfectly crafted PRs with great commit narratives get surface-level reviews. The friction kills engagement.
A few complementary approaches I've seen work: pair reviewing for complex changes that are hard to break down, AI pre-screening for basic issues so humans can focus on architecture/business logic, and synchronous review sessions when async back-and-forth is just burning time.
The key insight: good PR structure needs to be paired with removing tooling friction. When review is painful, people default to "LGTM" regardless of how well the story is told.
At first it was a mystery why some other dev reviewing my PR wanted me to split a commit into separate parts. It's also not something you learn in university and definitely requires some git knowledge to do properly.
But after doing it once or twice I definitely understood that it was also helping *me*, and it forced me into better version control practices. Definitely a good lesson to learn - but more valuable to me was that the job taught me that I absolutely despise doing software development.
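The mechanics of the split are less mysterious than they first look. A sketch in a throwaway repo (file names invented): when the parts live in different files, plain `git add` per file is enough; when they're mixed inside one file, `git add -p` lets you stage hunks interactively instead.

```shell
# Sketch: splitting one blob of work into two separate commits.
set -e
tmp=$(mktemp -d) && cd "$tmp" && git init -q
g() { git -c user.email=d@example.com -c user.name=dev "$@"; }
g commit -q --allow-empty -m "base"
echo "cleanup" > util.txt     # unrelated refactor
echo "feature" > feature.txt  # the actual change
# Stage and commit each logical part on its own:
git add util.txt    && g commit -q -m "refactor: extract util"
git add feature.txt && g commit -q -m "feat: add the feature"
git log --oneline   # three commits: base, refactor, feat
```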
I enjoy LLM vibe coding now because I know approximately what code to expect for any given prompt. Basically, if the agent doesn't give me code I expect, I usually reject it. It's rare that it comes up with a solution I didn't expect, though. I think this is because reading other people's code trains you to be minimalist: it's easier to spot unnecessary complexity when it's from someone else.
I think the skill of being able to read code quickly and applying your intuition is going to be increasingly valuable.
Nowadays, I don't even need to read the whole code to sense when there are issues. I usually have a 'gut feeling' when the code has problems by glancing over it; though of course, I need to invest some effort to list out specific issues because I can't say "this doesn't feel right" in my code review. But even with this, you can become better. Developers who write a certain way tend to make the same kinds of mistakes.
What each color means would depend on the PR, but, for instance, yellow = refactoring, brown = test code, blue = drive-by fix, orange = more efficient data structure etc.
The colors and their meanings could be set by either the author or the reviewers. It would be similar to the file checkboxes that exist today, but in this case, it would be per concept, not per file.
How it's going: https://gitmoji.dev/
IMHO, the novelty wears out fast. Especially when your git history starts looking like a Messages thread.
Funny thing: I've never seen a good PR with meaningful multiple commits. At the least, you need to be quite proactive about history rewriting to make it presentable. Usually it's mere fix/fix/fix, which looks amazingly bad without a squash.
If you want to comment on this matter, you are welcome to share your github examples.
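For what it's worth, the history rewriting doesn't have to be done by hand: git's `--fixup`/`--autosquash` machinery folds fix-commits into the commits they repair. A sketch in a throwaway repo (messages invented); setting `GIT_SEQUENCE_EDITOR=:` accepts the generated rebase todo list as-is, so no editor session opens:

```shell
# Sketch: cleaning up a fix/fix/fix history with --fixup and --autosquash.
set -e
tmp=$(mktemp -d) && cd "$tmp" && git init -q
g() { git -c user.email=d@example.com -c user.name=dev "$@"; }
g commit -q --allow-empty -m "base"
echo one > f.txt && git add f.txt && g commit -q -m "feat: add f"
# A follow-up fix, recorded as a "fixup!" commit targeting HEAD:
echo two > f.txt && git add f.txt && g commit -q --fixup=HEAD
# Fold the fixup into its target non-interactively:
GIT_SEQUENCE_EDITOR=: g rebase -q -i --autosquash HEAD~2
git log --oneline   # two commits remain: base, "feat: add f"
```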
I can see that it's good to be able to see that cleaned-up story, but I don't think the commit history should be that. There should be something else, a meta-statement about a bunch of commits, that summarizes it. But the commit history is still there if people later need to dig into how something was done. I saw some other comments talking about PRs as this unit, which maybe works, but I've always thought it would make more sense if the VCS had a notion of "commit sets" to which metadata could be attached. Then you could either look at such a set in collapsed form, with a single summary description of the whole set, or you could expand it out to look at the individual commits. In theory these could even be nested, although that might get messy and I think even just two levels would be very useful.
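Something close to these "commit sets" already falls out of merge commits: the merge carries the set-level summary, `git log --first-parent` gives the collapsed view, and a plain log expands it back into the individual commits. A sketch in a throwaway repo (branch names and messages invented):

```shell
# Sketch: approximating "commit sets" with --no-ff merges.
set -e
tmp=$(mktemp -d) && cd "$tmp" && git init -q
g() { git -c user.email=d@example.com -c user.name=dev "$@"; }
g commit -q --allow-empty -m "base"
git branch -M main
git checkout -q -b topic
g commit -q --allow-empty -m "step 1"
g commit -q --allow-empty -m "step 2"
git checkout -q main
# --no-ff forces a merge commit even when a fast-forward is possible,
# so the set-level summary is always recorded:
g merge -q --no-ff -m "feature X: summary of the whole set" topic
git log --oneline --first-parent   # collapsed view: the summary + base
git log --oneline                  # expanded view: all four commits
```

Nesting beyond two levels is where git's model runs out, as the comment above suggests, but the two-level case is already usable today.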
Edit to add: By fighting complexity, you're demanding more and you're right back to looking smart again. So actually your precious ego can still save face.
Quickly learned that the best code reviews focus on logic and architecture, not formatting. But it's easy to slip into nitpicking, because those comments are easier to write.
I still encourage doing a lot of small commits with good commit messages, but don't submit more than 2-3 or 4 commits in a single PR...