Are LLMs better suited for PR reviews than full codebases?

3•aaa_2006•2h ago
Semgrep recently published an analysis of how LLMs perform at spotting vulnerabilities in code: https://semgrep.dev/blog/2025/finding-vulnerabilities-in-modern-web-apps-using-claude-code-and-openai-codex/

I’ve been thinking about this problem and wanted to share a perspective.

When evaluating LLMs for static analysis, I see four main dimensions: accuracy, coverage, context size, and cost.

On accuracy and coverage, today’s LLMs feel nowhere close to replacing dedicated SAST tools on real-world codebases. They do better on isolated snippets or smaller repos, but once you introduce deep dependency chains, results drop off quickly.

Context size is another bottleneck. Feeding an LLM a repo with millions of lines makes reasoning across files unreliable, and the runtime becomes impractical.

That leads to cost. Running an LLM across a massive codebase can be significantly more expensive than traditional scanners, without obvious ROI.

Where they do shine is at smaller scales: reviewing PRs, surfacing potential issues in context, or even suggesting precise fixes when the input is well-scoped. That seems like the most practical application right now. Whether providers will invest in solving the big scaling problems is still an open question.
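To put rough numbers on the scale gap between a whole-repo scan and a PR review, here is a back-of-envelope sketch. Every constant in it (tokens per line, price, context window, and both code sizes) is an assumption I picked for illustration, not a measured or published figure, so swap in your own values.

```python
# Back-of-envelope comparison of LLM input size: whole-repo scan vs. one PR diff.
# Every constant below is an illustrative assumption, not a measured value.

TOKENS_PER_LINE = 10             # assumed average tokens per line of code
PRICE_PER_MILLION_TOKENS = 3.0   # assumed input price in USD; check your provider
CONTEXT_WINDOW = 200_000         # assumed context window, in tokens


def estimate(lines_of_code: int) -> dict:
    """Estimate input tokens, input cost, and minimum chunk count for one pass."""
    tokens = lines_of_code * TOKENS_PER_LINE
    cost = tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS
    # Ceiling division: context windows needed to show the model every line once.
    chunks = -(-tokens // CONTEXT_WINDOW)
    return {"tokens": tokens, "cost_usd": cost, "chunks": chunks}


repo = estimate(2_000_000)  # hypothetical 2M-line codebase
pr = estimate(400)          # hypothetical 400-line PR diff

print(f"repo: {repo['tokens']:,} tokens, ~${repo['cost_usd']:.2f}, {repo['chunks']} chunks")
print(f"pr:   {pr['tokens']:,} tokens, ~${pr['cost_usd']:.3f}, {pr['chunks']} chunk(s)")
```

This only counts a single pass over the input. It ignores output tokens, retries, and any overlap between chunks, and once the repo is split into many chunks the model never sees cross-file data flow in one window, which is where dependency-chain bugs live. The PR case fits in a single window under these assumptions.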

Curious how others here think about the trade-offs between LLM-based approaches and existing SAST tools.

Comments

aafanah•2h ago
Interesting. LLMs are already shining at PR reviews even if they struggle with massive codebases right now. And they are evolving fast enough that those scaling limits might not stay limits much longer.

kogatlas•2h ago
I'd love to see your evidence that "LLMs are already shining at PR reviews". We've used a handful of them where I work for months now, and they are rarely correct and thus rarely useful. Instead they tend to summarize nonsense that wasn't even introduced in that PR, make shit up entirely, or recommend bad fixes for things that would be better solved by being removed entirely.

aafanah•2h ago
Fair point. I think the bottom line is that it depends a lot on the context and how the prompt is framed. For PRs with small enough scope, I have seen LLMs provide decent value, mostly in surfacing potential issues or offering quick summaries. That said, the Semgrep analysis highlights that accuracy and coverage still fall short even in these narrow cases, so clearly there is still a lot of work to be done before this becomes broadly reliable.
