Show HN: Score your GitHub repo for AI coding agents

6•danoandco•10h ago

Comments

danoandco•10h ago

OpenAI published an article and demo for scoring how well AI agents can work in a codebase (https://openai.com/index/harness-engineering/, https://www.youtube.com/watch?v=rhsSqr0jdFw). We turned it into a free tool anyone can use.

Paste any public GitHub repo (or connect a private one) and get a live score across seven dimensions: bootstrap setup, task entry points, test harnesses, lint gates, agent docs, structured documentation, and decision records. It clones the repo, runs static analysis, and scores each dimension 0-3 with evidence pulled from actual files. Takes about 60 seconds.
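To make the rubric concrete, here is a minimal sketch of how a static, evidence-based 0-3 score per dimension could be computed. It is not the tool's actual implementation: the file patterns, the `DIMENSIONS` table, and the `score_repo` function are all hypothetical, and the "one point per matching file, capped at 3" rule is an assumption made purely for illustration.

```python
from dataclasses import dataclass
from pathlib import Path

# Hypothetical evidence files per dimension. The real checks are not
# described in this post; these names are illustrative assumptions.
DIMENSIONS = {
    "bootstrap setup": ["Makefile", "docker-compose.yml", ".devcontainer/devcontainer.json"],
    "task entry points": ["package.json", "justfile", "Taskfile.yml"],
    "test harnesses": ["tests", "pytest.ini", "jest.config.js"],
    "lint gates": [".pre-commit-config.yaml", ".eslintrc.json", "ruff.toml"],
    "agent docs": ["AGENTS.md", "CLAUDE.md", ".cursorrules"],
    "structured documentation": ["README.md", "docs", "CONTRIBUTING.md"],
    "decision records": ["docs/adr", "docs/decisions", "ARCHITECTURE.md"],
}


@dataclass
class DimensionScore:
    name: str
    score: int           # 0-3, as in the rubric described above
    evidence: list[str]  # paths that were actually found in the repo


def score_repo(root: str) -> list[DimensionScore]:
    """Score an already-cloned repo directory, one entry per dimension."""
    base = Path(root)
    results = []
    for name, candidates in DIMENSIONS.items():
        # Evidence is simply the candidate paths that exist on disk.
        evidence = [c for c in candidates if (base / c).exists()]
        # Assumed scoring rule: one point per piece of evidence, capped at 3.
        results.append(DimensionScore(name, min(len(evidence), 3), evidence))
    return results


if __name__ == "__main__":
    for dim in score_repo("."):
        print(f"{dim.name}: {dim.score}/3 {dim.evidence}")
```

Run from the root of a cloned repo, this prints one line per dimension with the files that counted as evidence. A real scorer would presumably inspect file contents as well, not just check that paths exist.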

Some repos we scored:

PostHog: https://twill.ai/score/fd033516-628b-4c7c-8db6-d84e3f2737ba

Supabase: https://twill.ai/score/b2825715-6c3d-4de1-a21b-fc5d9b17103b

Codex: https://twill.ai/score/d7372d95-0501-4ad3-ae90-8f112ccafee0

The pattern we keep seeing: most repos lose points on agent-specific docs and decision records. Everything else tends to be decent.

We built this scorecard as a free tool because agent performance is bounded by repo structure, not just model quality.

Would love to hear what scores people get, and whether the rubric is missing anything.

RoxaneFischer1•9h ago

not sure about the decision records. seems ideal but no one does that in practice

danoandco•8h ago

true, i think the key thing is explaining somewhere in the repo "why" something was done, like the rationale for choosing service X over service Y.

maybe this record is just the git log, and the agent just needs to access the git log.

we'll see how that matures over time

