
6 Practices that turned AI from prototyper to workhorse (106 PRs in 14 days)

13•waleedk•3h ago
1. Specs and plans are source code: Specs and plans live in git alongside source code, not in chat history. A new agent reads arch.md for the big picture, then its specific spec. You always know why something was built.

2. Three models review every phase: Claude, Gemini, and Codex catch almost entirely different bugs. No single model found more than 55% of issues. If you only review with the model that wrote the code, you're missing half the bugs. 20 bugs caught before shipping. Claude Code found 5 bugs, Gemini and Codex caught another 15, including a severe security issue Claude missed.

3. Enforce the process, don't suggest it. A state machine forces Spec → Plan → Implement → Review → PR. The AI can't skip steps. Tests must pass before advancing. AIs don't stick to the plan by themselves; you need rails.

4. Annotate, don't edit. Most of the work is writing specs and reviews that guide the code, not hacking at files in an open-ended chat.

5. Agents coordinate agents. An architect agent spawns builder agents into isolated git worktrees. You direct the architect; it directs the builders. They message each other async.

6. Manage the whole lifecycle. Most AI tools help you write code faster — maybe 30% of the job. The other 70% is planning how, reviewing, integrating, deployment scripts, managing staging vs prod. Have AI run the whole pipeline from spec to PR and beyond.
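
To make point 3 concrete, here is a minimal sketch of a phase gate. It is an illustration only, not Codev's actual implementation: `Phase`, `Workflow`, `GateError`, and the `run_tests` callback are all hypothetical names, and the real tool may enforce its protocol quite differently.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List


class Phase(Enum):
    # Order matters: it mirrors Spec -> Plan -> Implement -> Review -> PR.
    SPEC = 1
    PLAN = 2
    IMPLEMENT = 3
    REVIEW = 4
    PR = 5


ORDER = list(Phase)  # Enum iteration follows definition order


class GateError(RuntimeError):
    """Raised when the workflow is asked to advance but the gate is closed."""


@dataclass
class Workflow:
    # Hypothetical callback: returns True when the test suite passes.
    run_tests: Callable[[], bool]
    phase: Phase = Phase.SPEC
    history: List[Phase] = field(default_factory=list)

    def advance(self) -> Phase:
        """Move to the next phase only; there is no way to skip ahead."""
        if self.phase is Phase.PR:
            raise GateError("already at the final phase")
        if not self.run_tests():
            raise GateError(f"tests failing in {self.phase.name}; cannot advance")
        self.history.append(self.phase)
        self.phase = ORDER[ORDER.index(self.phase) + 1]
        return self.phase


if __name__ == "__main__":
    wf = Workflow(run_tests=lambda: True)
    while wf.phase is not Phase.PR:
        print(wf.phase.name, "->", wf.advance().name)
```

The only public transition is `advance()`, so an agent driving this object cannot jump from Spec to PR, and a failing test run leaves the phase unchanged.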

Overall result: one engineer can produce what a team of 3-4 would usually do. Code quality measured 1.2 points higher on a 10-point scale than Claude Code. Downsides: it takes a lot longer and uses many more tokens, but the cost is still reasonable at $1.60 per PR.

We open sourced it: https://github.com/cluesmith/codev
More details and raw results: https://cluesmith.com/blog/a-tour-of-codevos/

Comments

waleedk•3h ago
Happy to answer any questions. Here are those links as clickables:

GitHub: https://github.com/cluesmith/codev
Tour + raw results: https://cluesmith.com/blog/a-tour-of-codevos/

trollbridge•1h ago
This original post looks AI-generated.

Could you share the prompts you used to generate it?

waleedk•56m ago
In a sense? This human built a system for AI to build stuff, then asked the AI to summarize what the AI that the human built had built?

It was more of a conversation, but it was like: Hey I wrote these 6 points about what we're doing differently, please tailor them to be most useful to an HN audience.

skydhash•1h ago
> Codev isn’t an AI model. It’s not a coding assistant. It’s not a VS Code extension. It’s a set of CLI tools, protocols, and infrastructure that orchestrates existing AI coding tools (Claude Code, Gemini CLI, OpenAI’s Codex CLI) into a structured workflow.

Thanks for the clarification, I couldn't have guessed otherwise.

waleedk•54m ago
Useful criticism -- what could I have done to help you get that message sooner?

ddoottddoott•1h ago
Would you rather fight 100 AI workhorses or 1 workhorse AI?

waleedk•57m ago
Ha! I would rather fight 100 workhorse AIs with an Architect + Builder AIs on my side :-).

Seriously, the agents-managing-agents thing works so well. When I'm working, I'll sometimes have 6 builder agents fixing different bugs. I lose track of the state, so I rely on the architect agent, which doesn't have stupid limitations like 7 +/- 2 things in working memory.

yodon•1h ago
I'm a huge fan of spec-kit, and am actively looking for a replacement for it because spec-kit is no longer maintained by the team at GitHub.

Codev looks like it has a lot of good similarities to spec-kit, and like it's something I need to pay close attention to. That said, I'll encourage you to do another pass on your command names, intros, and cheat-sheet.

I suspect most developers using codev will mostly use a very small fraction of the codev commands most of the time, similar to the way spec-kit is mostly /specify, /plan, /tasks, and /implement, with a bit of /clarify and /analyze once you really get comfortable with it. If I'm right, having some docs where you emphasize the simplicity of your core flow would be very helpful.

For calibration, five minutes into reading your home page, Medium post, and some of your repo docs, I'm ready to believe this is true, but I have no idea what that core flow is or looks like. Five minutes is actually a pretty long time, and I suspect most visitors will end up bouncing if they don't get clarity on what the experience is ultimately going to be like for them in five minutes (or, more likely, much less than five minutes).

waleedk•1h ago
Yes, this is spec-kit on steroids. In particular, specs + protocol enforcement works _really_ well. The protocol enforcement is the game changer: without it, I found the AI just wouldn't stick to specs or plans.

Great suggestions. I will do that. Did you notice any specific issues in those?

Got it about the core flow. Appreciate it. I plan to record a video showing how to kick off a new project and another one showing how to use it in maintenance mode. Would that be helpful?

@yodon if you would like to reach out to me at hello@cluesmith.com I'd love to get your feedback once those assets are ready.
