Show HN: We beat Google, Cognition, Claude Code at codebase docs generation

https://prode.ai/blogs/we-benchmarked-ai-code-documentation-tools-prode-scored-highest
2•curious_nile•2h ago

Comments

curious_nile•2h ago
I'm Nilesh. My brother Abhishek and I built ProdE. Carnegie Mellon and IIT Delhi.

We benchmarked four AI code documentation tools: ProdE, DeepWiki, Claude Code, and Google Code Wiki. ProdE scored highest on usefulness for coding agents. 15% ahead of DeepWiki, 38% ahead of Google, 40% ahead of Claude Code.

I know this might feel like self-praise, but we couldn't find an existing benchmark to use, so we created one ourselves and open-sourced it.

The biggest gap is coverage. Coding agents can only answer questions about parts of the codebase that are documented. If your docs cover routing but skip middleware, every middleware question becomes a hallucination. ProdE documents 114-140 files per project. Claude Code covers 13-17. So agents using Claude Code's docs are blind to roughly 90% of the codebase.
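To make the "roughly 90%" figure concrete, here is a quick back-of-the-envelope check using only the file counts quoted above. The post doesn't give total file counts per repo, so this sketch assumes ProdE's documented set as the baseline, and it assumes the low and high ends of each range belong to the same projects. Treat it as a sanity check, not the benchmark's actual calculation.

```python
# File counts per project as quoted in the post: (low end, high end).
PRODE_FILES = (114, 140)
CLAUDE_CODE_FILES = (13, 17)


def uncovered_share(covered: int, baseline: int) -> float:
    """Fraction of the baseline file set that has no documentation."""
    return 1 - covered / baseline


# Assumption: low pairs with low and high pairs with high.
low = uncovered_share(CLAUDE_CODE_FILES[1], PRODE_FILES[1])   # 17 of 140 covered
high = uncovered_share(CLAUDE_CODE_FILES[0], PRODE_FILES[0])  # 13 of 114 covered

print(f"{low:.0%} to {high:.0%} of ProdE-documented files are not covered")
```

Measured against the full repository rather than against ProdE's documented set, the uncovered share would be at least this large.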

Zero hallucinations across all 9 evaluations. Every file path, function reference, and claim we checked pointed to real code. So it's not just that we cover more; what we cover is also accurate.
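For anyone who wants to spot-check the "every file path pointed to real code" part themselves, a minimal sketch could look like the one below. This is not the benchmark's checker (that lives in the repo linked further down). The regex, the `missing_paths` helper, and the demo file names are all hypothetical, and it only covers file paths, not function references or prose claims.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical pattern: backticked tokens that look like relative file paths.
PATH_RE = re.compile(r"`([\w./-]+\.\w+)`")


def missing_paths(doc_text: str, repo_root: Path) -> list[str]:
    """Return file paths mentioned in the docs that do not exist in the repo."""
    mentioned = sorted(set(PATH_RE.findall(doc_text)))
    return [p for p in mentioned if not (repo_root / p).is_file()]


# Tiny self-contained demo against a throwaway "repo".
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "app").mkdir()
    (root / "app" / "routing.py").write_text("def route(): ...\n")
    doc = "Routing lives in `app/routing.py`; middleware is in `app/middleware.py`."
    result = missing_paths(doc, root)
    print(result)  # ['app/middleware.py']
```

An empty list means every backticked path in the doc resolved to a real file at the pinned commit.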

DeepWiki did really well here -- 5x more diagrams per project than us, best visual docs by far. Claude Code had the strongest writing quality of the four.

Honestly, if I saw this post I'd also assume the vendor rigged it. So here's everything we did to make it not that. Claude Opus judges all four tools using a published rubric. Claude Code's output was renamed to doc_x/ so the judge couldn't tell it was Claude Code. ProdE launched after Claude's training cutoff, so the judge had no prior knowledge of our tool. We don't use Claude anywhere in our pipeline. We ran 9 evaluation passes across 3 open-source repos (FastAPI, Pydantic, Mermaid), all pinned to exact commits, to tackle the non-deterministic outputs.
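The renaming step is easy to reproduce. Below is a rough sketch of blinding tool outputs before handing them to a judge. It assumes one folder per tool, and it generalizes the `doc_x/` idea to every tool with neutral `doc_a`, `doc_b`, ... labels. The `blind_outputs` helper, the folder names, and the seed are illustrative assumptions, not what the repo does.

```python
import random
import shutil
import tempfile
from pathlib import Path


def blind_outputs(src: Path, dst: Path, seed: int = 0) -> dict[str, str]:
    """Copy each tool's output folder to a neutral name; return label -> tool."""
    tools = sorted(p.name for p in src.iterdir() if p.is_dir())
    random.Random(seed).shuffle(tools)  # order should not hint at the vendor
    key = {}
    for i, tool in enumerate(tools):
        label = f"doc_{chr(ord('a') + i)}"
        shutil.copytree(src / tool, dst / label)
        key[label] = tool
    return key


# Demo with placeholder folders (names are hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    src, dst = Path(tmp) / "raw", Path(tmp) / "blinded"
    dst.mkdir()
    for name in ["prode", "deepwiki", "claude_code", "google_code_wiki"]:
        (src / name).mkdir(parents=True)
        (src / name / "index.md").write_text("# docs\n")
    key = blind_outputs(src, dst)
    blinded_names = sorted(p.name for p in dst.iterdir())
    print(blinded_names)
```

The returned key has to stay out of the judge's context, and renaming folders does nothing about tool names that appear inside the generated docs themselves.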

We scored usefulness for coding agents and readability for humans as separate things, because what makes docs good for agents is different from what makes them good for humans. Agents need lots of references to specific files and functions. Humans need clear writing and good diagrams. The tool with the best writing scored lowest on usefulness for agents. Of course, usefulness for humans is better judged by humans.

Blog (full analysis): https://prode.ai/blogs/we-benchmarked-ai-code-documentation-...

Repo (everything, run it yourself): https://github.com/abhishek-curiousboxai/code-documentation-...

You can fork it and re-run. Everything is MIT licensed.

govarun•33m ago
this is amazing. Curious what you guys did different

2× – nine months later: We did it

https://ideas.fin.ai/p/2x-nine-months-later
2•xfax•1m ago•0 comments

Turn Your Codebase into a Podcast

https://code2cast.com/
1•itswillbrazil•1m ago•0 comments

Our Long Love Affair with Gold

https://www.wsj.com/finance/investing/gold-bullion-market-trading-4456cbde
1•thm•5m ago•0 comments

Two inmates at an Ohio prison built a secret hacking operation from behind bars [pdf]

https://dam.assets.ohio.gov/image/upload/watchdog.ohio.gov/Investigations/2017/2015-CA00043.pdf
1•Anon84•5m ago•0 comments

Show HN: Launchy – A Next.js template for weekly launch directories

https://launchy.tools/template
1•drdruide•6m ago•0 comments

Graupel

https://en.wikipedia.org/wiki/Graupel
1•surprisetalk•8m ago•0 comments

Playdate for Education

https://play.date/education/
1•owlmusic•8m ago•0 comments

Show HN: Compiler outputs HTML for code display

https://denismarkelov.codeberg.page/crates/
1•denismarkelov•8m ago•0 comments

The Quantity Trap: The Dangerous Disconnect Between AI Supply and User Demand

https://www.lupath.ai/
1•LUpath•8m ago•0 comments

The Big Reveal in China's New Five-Year Plan

https://heatmap.news/podcast/shift-key-s3e37-china-five-year-plan
1•leonidasrup•10m ago•1 comments

Android CLI: Build Android apps 3x faster using any agent

https://android-developers.googleblog.com/2026/04/build-android-apps-3x-faster-using-any-agent.html
2•ingve•12m ago•0 comments

Show HN: Online Sound Decibel Meter

https://soundmeterx.com/
1•artiomyak•13m ago•0 comments

Thinking about building agents for humans

https://frontierai.substack.com/p/build-agents-for-humans
2•tajshaik24•13m ago•0 comments

Zipper: the archival utility for macOS you didn't know you needed

1•krishshah5•14m ago•1 comments

Ask HN: How do you maintain flow when vibe coding?

2•fny•14m ago•0 comments

What's the point of the App Store, if it can't protect users?

https://www.macworld.com/article/3115356/whats-the-point-of-the-app-store-if-it-cant-protect-user...
7•cdrnsf•17m ago•0 comments

Ask HN: To open-source, or not to open-source

1•tracker1•17m ago•0 comments

openDoJa — full reimplementation of DoCoMo's DoJa SDK in modern Java

https://github.com/GrenderG/openDoJa
1•Lammy•18m ago•0 comments

Future Long Range Assault Aircraft Officially Named MV-75 Cheyenne II

https://news.bellflight.com/en-US/264304-future-long-range-assault-aircraft-officially-named-mv-7...
1•uticus•19m ago•1 comments

Text of OS age verification bill (HR 8250) [pdf]

https://www.congress.gov/119/bills/hr8250/BILLS-119hr8250ih.pdf
5•asdfglkjh•22m ago•0 comments

Gravtory – crash-proof Python workflows on your existing database

1•vatryok•22m ago•0 comments

Slint 1.16 Released

https://slint.dev/blog/slint-1.16-released
1•jandeboevrie•23m ago•0 comments

Stakes high as Supreme Court set to rule on Monsanto's weed-killing pesticide

https://www.theguardian.com/us-news/2026/apr/16/supreme-court-monsanto-glyphosate
1•mitchbob•23m ago•0 comments

Fire risks and ugly designs are stalling EV charger adoption

https://restofworld.org/2026/ev-charger-backlash-fire-safety-aesthetics/
1•PaulHoule•23m ago•0 comments

Show HN: HyperFrames – Render Video from HTML via Chrome's BeginFrame API

https://github.com/heygen-com/hyperframes
4•bored_hacker•23m ago•0 comments

Ask HN: How to Launch First SaaS

1•nicck1•24m ago•1 comments

Djangocon EU: when SaaS is not allowed: shipping Django as a desktop app

https://reinout.vanrees.org/weblog/2026/04/16/7-django-as-desktop-app.html
1•jandeboevrie•26m ago•0 comments

Show HN: Claude Opus 4.7: Everything You Need to Know

1•anju-kushwaha•26m ago•1 comments

The Unpleasant Side of Life with Horses in Cities

https://www.newyorkalmanack.com/2021/02/the-unpleasant-side-of-life-with-horses-in-cities/
2•ohjeez•27m ago•1 comments

How a subsea cable is repaired

https://www.onesteppower.com/post/subsea-cable-repair
1•slicktux•27m ago•0 comments