Agentic AI Code Review: From Confidently Wrong to Evidence-Based

https://platformtoolsmith.com/blog/agentic-ai-code-review/
2•sharp-dev•2h ago

Comments

sharp-dev•2h ago
TL;DR: AI code reviewer went from "confidently wrong" to actually useful. Fix: stopped pre-selecting context, gave the model tools to fetch evidence itself. Now it either cites file:line or stays quiet.

The Problem

Our AI reviewer flagged a "blocker." It cited the diff, built a plausible argument, and suggested a fix. The senior engineer spent 20 minutes disproving it. The guard clause the model missed was two files away. The model never had that file, so it guessed and sounded certain. Pre-selecting context doesn't work: code review follows evidence chains, and you can't predict in advance where a chain will lead.

The Fix

Agentic loop:

Model: "This calls validate()" → search_code("validate")
Model: "Two call sites use withRetry(). Third doesn't." → get_file_content("config/defaults.go")
Model: "Missing timeout. Bug found." → submit_code_review(structured_output)

Model fetches what it needs. Loop ends when it submits structured findings (path, line, severity, evidence), not prose.
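
A minimal sketch of that loop in Python, to make the control flow concrete. The tool names come from the post; everything else is an assumption for illustration: the in-memory `REPO`, the `ToolCall` shape, the `run_review` function, and the scripted stand-in for the model are hypothetical and not the author's implementation.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    name: str
    args: dict


# Hypothetical in-memory "repository" standing in for a real checkout.
REPO = {
    "client/fetch.go": "func Fetch() { validate(); call() }\n",
    "config/defaults.go": "var Timeout = 0\n",
}


def search_code(query: str) -> str:
    """Return path:line hits for a plain substring query."""
    hits = []
    for path, text in sorted(REPO.items()):
        for lineno, line in enumerate(text.splitlines(), start=1):
            if query in line:
                hits.append(f"{path}:{lineno}: {line.strip()}")
    return "\n".join(hits) or "no matches"


def get_file_content(path: str) -> str:
    """Return the file text, or an error string the model can read."""
    return REPO.get(path, f"error: {path} not found")


TOOLS = {"search_code": search_code, "get_file_content": get_file_content}


def run_review(model, diff: str, max_iterations: int = 8) -> dict:
    """Alternate model turns and tool turns until the model submits."""
    history = [("diff", diff)]
    for _ in range(max_iterations):
        call = model(history)
        if call.name == "submit_code_review":
            return call.args  # terminal action: structured findings
        tool = TOOLS.get(call.name)
        result = tool(**call.args) if tool else f"error: unknown tool {call.name}"
        history.append((call.name, result))
    # Guardrail: give up quietly rather than loop forever.
    return {"findings": [], "note": "iteration cap reached"}


def scripted_model(history: list) -> ToolCall:
    """Stand-in for a real model: replays the three steps from the post."""
    script = [
        ToolCall("search_code", {"query": "validate"}),
        ToolCall("get_file_content", {"path": "config/defaults.go"}),
        ToolCall("submit_code_review", {"findings": [{
            "path": "config/defaults.go", "line": 1, "severity": "major",
            "evidence": "Timeout is 0 in config/defaults.go:1",
        }]}),
    ]
    return script[len(history) - 1]


review = run_review(scripted_model, "diff --git a/client/fetch.go b/client/fetch.go")
```

In a real system `scripted_model` would be replaced by a call to an LLM API that returns a tool call; the loop itself would likely stay about this small.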

What Changed

- Before: "This might break retries."
- After: "In foo/bar.go:123, call bypasses withRetry(). Other call sites use it (see search results). Wrap or document."

The Pieces

1. Tools: boring, fast, deterministic. get_file_content, search_code. Treat them like production APIs.
2. Terminal action: structured JSON submission, not Markdown. No evidence? Can't submit.
3. Loop: model turn → tool turn → repeat. Aggressive context shrinking (old results truncated, diff stays).
4. Guardrails: iteration caps, timeouts, self-critique checklist.
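
Two of these pieces are easy to sketch: rejecting a submission that lacks evidence, and truncating old tool results while keeping the diff. The Python below is one possible shape, not the author's code. The field names follow the post (path, line, severity, evidence), but the severity labels, the `[truncated]` marker, and the `keep_recent` and `max_chars` parameters are assumptions.

```python
ALLOWED_SEVERITIES = {"blocker", "major", "minor"}  # assumed labels


def validate_finding(finding: dict) -> list[str]:
    """Return a list of problems; an empty list means the finding may be submitted."""
    errors = []
    path = finding.get("path")
    if not isinstance(path, str) or not path:
        errors.append("path is required")
    line = finding.get("line")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if not isinstance(line, int) or isinstance(line, bool) or line < 1:
        errors.append("line must be a positive integer")
    if finding.get("severity") not in ALLOWED_SEVERITIES:
        errors.append("severity is not recognized")
    evidence = finding.get("evidence")
    if not isinstance(evidence, str) or not evidence.strip():
        errors.append("evidence is required")
    return errors


def shrink_context(history, keep_recent=2, max_chars=200):
    """Truncate older tool results; entries of kind 'diff' are never touched.

    history is a list of (kind, text) pairs, oldest first.
    """
    cutoff = len(history) - keep_recent
    shrunk = []
    for i, (kind, text) in enumerate(history):
        if kind != "diff" and i < cutoff and len(text) > max_chars:
            text = text[:max_chars] + " [truncated]"
        shrunk.append((kind, text))
    return shrunk
```

Returning the validation errors to the model as a tool result, instead of raising, gives it a chance to fetch the missing evidence and resubmit.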

Evaluation

Pick 5-10 PRs where you know the real risks. Check:

- Found the issue?
- Cited exact file:line?
- Hallucinated anything?
- Fetched evidence when uncertain?

Iterate on tools, not prompts.
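
The first three checks can be scored mechanically per PR. The Python sketch below assumes you record, for each run, the known risks as (path, line) pairs, the submitted findings, and the set of paths the model actually saw (in the diff or via tools). `score_review` and its field names are hypothetical, and `pkg/ghost.go` is a made-up path. Whether the model fetched evidence when uncertain still needs a human to read the transcript.

```python
def score_review(expected: set, findings: list, seen_paths: set) -> dict:
    """Compare submitted findings against the known risks for one PR."""
    cited = [(f["path"], f["line"]) for f in findings]
    expected_paths = {path for path, _ in expected}
    return {
        # Did any finding land in a file with a known risk?
        "found_issue": any(path in expected_paths for path, _ in cited),
        # Did any finding cite the exact file:line?
        "exact_citation": any(c in expected for c in cited),
        # Findings citing files the model never saw: likely hallucinated.
        "unsupported": [c for c in cited if c[0] not in seen_paths],
        # Findings outside the known risks: need a human look, not
        # necessarily wrong.
        "unexpected": [c for c in cited if c not in expected],
    }


report = score_review(
    expected={("foo/bar.go", 123)},
    findings=[
        {"path": "foo/bar.go", "line": 123},
        {"path": "pkg/ghost.go", "line": 7},  # hypothetical bad citation
    ],
    seen_paths={"foo/bar.go", "config/defaults.go"},
)
```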

The Pattern

Don't build bigger prompts. Build a loop where the model can fetch evidence, test hypotheses, and submit only when it can cite sources. That's the difference between "sounds right" and "is right."

Yakuza creator's new game in doubt as NetEase pulls funding

https://www.polygon.com/gang-of-dragon-toshihiro-nagoshi-studio-netease/
1•sagacity•1m ago•0 comments

Freestiler – PMTiles vector tilesets from R and Python

https://walker-data.com/freestiler/
1•carnevalem•2m ago•0 comments

How not to test LLM models

https://theartificialq.github.io/2026/03/08/how-not-to-test-llm-models.html
1•HonzaT•4m ago•0 comments

Utilization metrics across accelerators (GPUs, TPUs, and so on)

https://github.com/gpusprint/gpusprint
1•heyjupiter•5m ago•0 comments

Behavioral Effects of High Peak Power Microwave Pulses (1992) [pdf]

https://apps.dtic.mil/sti/tr/pdf/ADA258136.pdf
2•anonu•6m ago•0 comments

Microsoft Outlook app now showing paid spam/phishing ad's

https://imgur.com/a/O9bjjQQ
1•xvxvx•7m ago•1 comments

Show HN: PDF to JPG converter that runs in the browser (no uploads)

https://privatepdftojpg.com/
1•touchsomegrass•8m ago•0 comments

Show HN: ClarifyDoc – explains contracts in plain English

https://clarifydoc.xyz/
1•tgdaimov•10m ago•0 comments

Small web publishing tools and frameworks

https://codeberg.org/thgie/awesome-small-web-publishing
2•smartmic•10m ago•0 comments

Self-hosted docs platform – 4 PHP files, no database, free GitBook alternative

https://github.com/webstudio-ltd/docs
3•webstudioltd•10m ago•4 comments

Ask HN: What should an international dev do today?

2•jzu•11m ago•1 comments

AI Agent Site Score Scanner

https://prodlint.com/score
1•AMARCOVECCHIO99•12m ago•0 comments

Can the mental health benefits of exercise be bottled?

https://medicalxpress.com/news/2026-02-mental-health-benefits-bottled.html
1•PaulHoule•12m ago•0 comments

Coasts: Localhost service isolation and orchestration for Git worktrees

https://github.com/coast-guard/coasts
1•handfuloflight•14m ago•0 comments

China's AI progress by the numbers: GLM-5 benchmarks, robotaxi, and Huawei chips

https://medium.com/ai-advances/china-winning-ai-race-deepseek-nvidia-ca7de8a727ec
1•Aedelon•14m ago•0 comments

Show HN: VectorLens – See why your RAG hallucinates, no config

1•gustav-proxi•14m ago•0 comments

Agentic Debt

https://neilkakkar.com/agentic-debt.html
2•neilkakkar•14m ago•0 comments

Show HN: Dashboard for monitoring multiple Claude Code sessions

https://github.com/Stargx/claude-code-dashboard
1•Stargx•16m ago•1 comments

Neuroscientists have pinpointed a potential biological signature for psychopathy

https://www.psypost.org/neuroscientists-have-pinpointed-a-potential-biological-signature-for-psyc...
2•amichail•18m ago•0 comments

60 Minutes Havana Syndrome report finds U.S. government tested energy weapon

https://www.cbsnews.com/news/60-minutes-havana-syndrome-report-finds-u-s-government-tested-energy...
6•jonas21•21m ago•1 comments

Flexible feline spines shed light on "falling cat" problem

https://arstechnica.com/science/2026/03/tuck-and-turn-or-bend-and-twist-how-falling-cats-land-on-...
2•Tomte•21m ago•0 comments

Iran Transformed

https://www.nybooks.com/online/2026/03/08/iran-transformed/
1•mitchbob•25m ago•1 comments

Agent Skill to Use a Debugger

https://github.com/AlmogBaku/debug-skill
1•talolard•25m ago•1 comments

EU publishers won a piece of a shrinking pie

https://mediaindustryshift.substack.com/p/eu-publishers-won-a-piece-of-a-shrinking
3•taubek•26m ago•0 comments

Fukushima at 15: Living with radioactive hot spots and stigma

https://thebulletin.org/2026/03/fukushima-at-15-living-with-radioactive-hot-spots-and-stigma/
2•CqtGLRGcukpy•27m ago•0 comments

Show HN: ChopChopGo – Sigma-based threat hunting for Linux forensic artifacts

https://github.com/M00NLIG7/ChopChopGo
1•M00NL1G7•27m ago•1 comments

Animator Pro (Autodesk Animator) Source Code

https://github.com/AnimatorPro/Animator-Pro-C
1•reconnecting•28m ago•1 comments

We strongly oppose the Unified Attestation initiative

https://xcancel.com/i/status/2031041385554386960
5•ledoge•28m ago•2 comments

Oscar Pool Ballot, 98th Academy Awards

http://fxrant.blogspot.com/2026/03/oscar-pool-ballot-98th-academy-awards.html
1•speckx•30m ago•0 comments

Advanced Pet Screen Drawing Techniques

https://retrogamecoders.com/advanced-pet-screen-drawing-techniques/
1•ibobev•30m ago•0 comments