
Get an AI code review in 10 seconds

https://oldmanrahul.com/2025/12/19/ai-code-review-trick/
30•oldmanrahul•3h ago

Comments

Smaug123•1h ago
With not much more effort you can get a much better review by additionally concatenating the touched files and sending them as context along with the diff. It was the work of about five minutes to make the scaffolding of a very basic bot that does this, and then somewhat more time iterating on the prompt. By the way, I find it's seriously worth sucking up the extra ~four minutes of delay and going up to GPT-5 high rather than using a dumber model; I suspect xhigh is worth the ~5x additional bump in runtime on top of high, but at that point you have to start rearchitecting your workflows around it and I haven't solved that problem yet.

(That's if you don't want to go full Codex and have an agent play around with the PR. Personally I find that GPT-5.2 xhigh is incredibly good at analysing diffs-plus-context without tools.)
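The approach Smaug123 describes (sending the full touched files as context alongside the diff) can be sketched in a few lines. This is a minimal, assumed sketch: the function names are invented, and the actual model call is left out since it depends on which provider you use.

```python
import subprocess

def touched_files(base: str = "main") -> list[str]:
    """List files changed relative to the base branch (requires a git checkout)."""
    out = subprocess.run(
        ["git", "diff", "--name-only", base],
        capture_output=True, text=True, check=True,
    )
    return [line for line in out.stdout.splitlines() if line]

def build_review_prompt(diff: str, files: dict[str, str]) -> str:
    """Assemble a review prompt from a diff plus the full text of each touched file."""
    parts = ["Review the following change. Full file contents follow the diff for context."]
    parts.append("=== DIFF ===\n" + diff)
    for path, contents in files.items():
        parts.append(f"=== FILE: {path} ===\n" + contents)
    return "\n\n".join(parts)
```

You would read each path from `touched_files()`, build the prompt, and send it to whatever model endpoint or CLI you already use.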

verdverm•1h ago
I've been using gemini-3-flash the last few days and it's quite good; I'm not sure you need the biggest models anymore. I've only switched to pro once or twice in that time.

Here are the commits; the tasks were not trivial:

https://github.com/hofstadter-io/hof/commits/_next/

Social posts and pretty pictures as I work on my custom copilot replacement

https://bsky.app/profile/verdverm.com

Smaug123•1h ago
Depends what you mean by "need", of course, but in my experience the curves aren't bending yet; better model still means better-quality review (although GPT-5.0 high was still a reasonably competent reviewer)!
pawelduda•40m ago
Yes, it's my new daily driver for light coding and the rest. Also great at object recognition and image gen
ocharles•1h ago
I recently started using LLMs to review my code before asking for a more formal review from colleagues. It's actually been surprisingly useful: why waste my colleagues' time with small, obvious things? But it's also sometimes gone much further than that, with deeper review points. Even when I don't agree with them, it's great having that little bit more food for thought; if anything it helps seed the review.
danlamanna•1h ago
Are you using a particularly well crafted prompt or just something off the cuff?
sultson•1h ago
This one's served fairly well: "Review this diff - detect top 10 problem-causers, highlight 3 worst - I'm talking bugs with editing, saving etc. (not type errors or other minor aspects) [your diff]". The "editing, saving" bit would vary based on the goal of the diff.
morkalork•1h ago
Not who you're replying to, but working at a small, small company I didn't have anyone to review my code, so I've used AI to fill that gap. I usually do a specific pass and then a general one: for example, if I'm making heavy use of async logic, I'll ask the LLM to pay particular attention to the pitfalls that can arise with it.
ocharles•1h ago
We're a Haskell shop, so I usually just say "review the current commit. You're an experienced Haskell programmer and you value readable and obvious code" (because that is indeed what we value as a team). I'll often ask it to explicitly consider testing, too.
eterm•28m ago
Personally, this is what I use in claude code:

"Diff to master and review the changes. Branch designed to address <problem statement>. Write output to d:\claudeOut in typst (.typ) format."

It'll do the diffs and search both branch and master versions of files.

I prefer reading PDFs to markdown, but it'll default to markdown if you leave the output format unprompted.

I have almost all my workspaces configured with /add-dir to add d:/claudeOut and d:/claudeIn as general scratch folders for temporary in/out file permissions so it can read/write outside the context of the workspace for things like this.

You might get better results using a better crafted prompt (or code review skill?). In general I find claude code reviews are:

  - Overly fussy about null-checking everything
  - Completely miss whether the PR has properly distilled the problem down to its essence
  - Good at catching spelling mistakes
  - Like to pretend they know whether something is well architected, but don't
So it's a bit of a mixed bag: it tends to focus on trivia, but it's still useful as a first pass that spares your teammates from having to catch that same trivia.

It will absolutely assume too much from naming, so when it makes the wrong kind of assumptions about how parts work, that's a good prompt to think about how to name things more clearly.

e.g. If you write a class called "AddingFactory", it'll go around assuming that's what it does, even if the core of it returns (a, b) -> a*b.

You then have to work hard to get it to properly examine the file and convince itself that it is actually a multiplier.

Obviously real-world examples are more subtle than that, but if you're finding yourself arguing with it, it's worth sometimes considering whether you should rename things.
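The naming pitfall above is easy to reproduce. A hypothetical toy version in Python (the class name comes from the comment; everything else is invented for illustration):

```python
# A deliberately misleading name: a reviewer (human or LLM) skimming
# call sites will assume this builds an addition function.
class AddingFactory:
    def build(self):
        # The actual behaviour contradicts the name.
        return lambda a, b: a * b

op = AddingFactory().build()
print(op(2, 3))  # 6, not 5: the name promised addition, the code multiplies
```

A reviewer trusting the name would flag correct call sites as bugs and wave real bugs through, which is exactly the failure mode described above.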

ohans•51m ago
TIL: you could add a ".diff" to a PR URL. Thanks!

As for PR reviews, assuming you've got linting and static analysis out of the way, you need a sufficiently specific prompt to actually catch problems and surface reviews that match your standards rather than generic AI comments.

My company uses some automatic AI PR review bots, and they annoy me more than they help. Lots of useless comments

hrpnk•45m ago
`gh pr diff num` is an alternative if you have the repo checked out. One can then pipe the output to one's favorite llm CLI and create a shell alias with a default review prompt.

> My company uses some automatic AI PR review bots, and they annoy me more than they help. Lots of useless comments

One way to make them more useful is to ask them to list the top N problems found in the change set.
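Both ideas (piping `gh pr diff` through a default prompt, and asking for the top N problems) fit in a tiny filter script. A sketch, with the prompt wording purely illustrative:

```python
# Hypothetical default review prompt; adjust the wording to taste.
REVIEW_PROMPT = (
    "Review this diff. List the top 5 problems you find, worst first, "
    "and skip stylistic nitpicks.\n\n"
)

def wrap_diff(diff: str, prompt: str = REVIEW_PROMPT) -> str:
    """Prepend a default review prompt to a diff read from `gh pr diff`."""
    return prompt + diff

# Intended usage as a stdin/stdout filter, e.g.:
#   gh pr diff 123 | python review_prompt.py | <your llm CLI>
# where review_prompt.py writes wrap_diff(sys.stdin.read()) to stdout.
```

A shell alias around this gives the one-command review the comment describes.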

visarga•34m ago
I would just put a PR_REVIEW.md file in the repo and have a CI agent run it on the diff/repo and decide pass or reject. The file holds the rules the code must be evaluated against: project-level policy, the constraints you cannot check by code testing. Of course, any constraint that can be a code test had better be a code test.

My experience is that you can trust any code that is well tested, human- or AI-generated, and you cannot trust any code that is not well tested (what I call "vibe tested"). But some constraints need to be stated in natural language, and for those you need an LLM to review the PRs. This combination of code tests and LLM review should be able to ensure reliable AI coding; if it does not, iterate on your PR rules and on your tests.
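The pass/reject gate described above could look something like this. Everything here is an assumed sketch (the PR_REVIEW.md name comes from the comment; the prompt format, verdict convention, and function names are invented, and the model call is stubbed out):

```python
def build_gate_prompt(rules: str, diff: str) -> str:
    """Combine the PR_REVIEW.md rules with the diff into a pass/reject prompt."""
    return (
        "Evaluate the diff against these rules. Answer with a line "
        "'VERDICT: PASS' or 'VERDICT: REJECT', followed by your reasons.\n\n"
        f"=== RULES ===\n{rules}\n\n=== DIFF ===\n{diff}"
    )

def parse_verdict(response: str) -> bool:
    """Return True only if the model's response contains an explicit PASS verdict."""
    for line in response.splitlines():
        if line.strip().upper().startswith("VERDICT:"):
            return "PASS" in line.upper()
    return False  # no explicit verdict counts as a rejection
```

Treating a missing verdict as a rejection keeps the gate fail-closed, which matters when the LLM's answer is the thing deciding whether CI goes green.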

MYEUHD•7m ago
> TIL: you could add a ".diff" to a PR URL. Thanks!

You can also append ".patch" and get a more useful output

petesergeant•48m ago
I have been using Codex as a code review step and it has been magnificent, truly. I don’t like how it writes code, but as a second line of defence I’m getting better code reviews out of it than I’ve ever had from a human.
zedascouves•47m ago
Hum? I just tell claude to review pr #123 and it uses 'gh' to do everything, including responding to human comments! Feedback from colleagues has been awesome.

We are sooo gonna get replaced soon...

mehdibl•21m ago
How to do agentic workflow like 2 years ago.
elliottkember•11m ago
https://cursor.com/bugbot

I didn't see this mentioned, but we've been running bugbot for a while now and it's very good. It catches so many subtle bugs.

Logging Sucks

https://loggingsucks.com/
249•FlorinSays•2h ago•81 comments

Show HN: Books mentioned on Hacker News in 2025

https://hackernews-readings-613604506318.us-west1.run.app
220•seinvak•4h ago•95 comments

Weight loss jabs: What happens when you stop taking them

https://www.bbc.com/news/articles/cn98pdpyjz5o
25•neom•42m ago•11 comments

Mullvad VPN: "This is a Chat Control 3.0 attempt."

https://mastodon.online/@mullvadnet/115742530333573065
200•janandonly•2h ago•59 comments

E.W.Dijkstra Archive

https://www.cs.utexas.edu/~EWD/welcome.html
77•surprisetalk•5h ago•7 comments

Show HN: WalletWallet – create Apple passes from anything

https://walletwallet.alen.ro/
175•alentodorov•4h ago•61 comments

ARIN Public Incident Report – 4.10 Misissuance Error

https://www.arin.net/announcements/20251212/
115•immibis•5h ago•25 comments

You're Not Burnt Out. You're Existentially Starving

https://neilthanedar.com/youre-not-burnt-out-youre-existentially-starving/
86•thanedar•2h ago•79 comments

I Program on the Subway

https://www.scd31.com/posts/programming-on-the-subway
100•evankhoury•4d ago•66 comments

I can't upgrade to Windows 11, now leave me alone

https://idiallo.com/byte-size/cant-update-to-windows-11-leave-me-alone
96•firefoxd•1h ago•71 comments

Coarse Is Better

https://borretti.me/article/coarse-is-better
149•_dain_•7h ago•79 comments

Three Ways to Solve Problems

https://andreasfragner.com/writing/three-ways-to-solve-problems
77•42point2•6h ago•17 comments

Ruby website redesigned

https://www.ruby-lang.org/en/
304•psxuaw•13h ago•119 comments

Structured Outputs Create False Confidence

https://boundaryml.com/blog/structured-outputs-create-false-confidence
82•gmays•5h ago•48 comments

Indoor tanning makes youthful skin much older on a genetic level

https://www.ucsf.edu/news/2025/12/431206/indoor-tanning-makes-youthful-skin-much-older-genetic-level
190•SanjayMehta•15h ago•140 comments

Show HN: RenderCV – Open-source CV/resume generator, YAML to PDF

https://github.com/rendercv/rendercv
52•sinaatalay•7h ago•28 comments

FWS – pip-installable embedded process supervisor with PTY/pipe/dtach back ends

8•mrsurge•3d ago•1 comments

Measuring AI Ability to Complete Long Tasks

https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/
211•spicypete•16h ago•163 comments

What I learned about deploying AV1 from two deployers

https://streaminglearningcenter.com/articles/what-i-learned-about-deploying-av1-from-two-deployer...
31•breve•5d ago•20 comments

Show HN: Jmail – Google Suite for Epstein files

https://www.jmail.world
1312•lukeigel•23h ago•305 comments

Show HN: HN Sentiment API – I ranked tech CEOs by how much you hate them

https://docs.hnpulse.com
23•kingofsunnyvale•5h ago•5 comments

Show HN: Shittp – Volatile Dotfiles over SSH

https://github.com/FOBshippingpoint/shittp
104•sdovan1•8h ago•57 comments

Decompiling the New C# 14 field Keyword

https://blog.ivankahl.com/decompiling-the-new-csharp-14-field-keyword/
62•ivankahl•4d ago•24 comments

Show HN: AI-Augmented Memory for Groups

https://www.largemem.com/
7•vishal-ds•5d ago•2 comments

Show HN: The Official National Train Map Sucked, So I Made My Own

https://www.bdzmap.com/
63•Pavlinbg•8h ago•15 comments

Claude in Chrome

https://claude.com/chrome
293•ianrahman•23h ago•160 comments

ELF Crimes: Program Interpreter Fun

https://nytpu.com/gemlog/2025-12-21
44•nytpu•4h ago•9 comments

Ireland’s Diarmuid Early wins world Microsoft Excel title

https://www.bbc.com/news/articles/cj4qzgvxxgvo
302•1659447091•1d ago•116 comments

Autoland Saves King Air, Everyone Reported Safe

https://avbrief.com/autoland-saves-king-air-everyone-reported-safe/
11•bradleybuda•3h ago•4 comments