The models are incredible. But the loop was broken.
Every session started from zero. Context would explode. The AI would hallucinate with confidence. And because I can't read code, I had no way to verify when something was wrong. I just knew it was broken. So I stopped fighting the model and started building the system around it.
Pilot is a /pilot folder you drop into any repo. It's emergent complexity from simple primitives — markdown files that give AI tools:
- Persistent state: STATE.md tracks where you are in the workflow
- Scoped tasks: TASK.md defines boundaries before implementation
- Evidence capture: real terminal output via MCP, not generated text
- Protected paths: red zones require human approval
- Recovery: an LKG (last-known-good) commit, auto-updated after health checks pass
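As an example, a scoped task file might look like this (the section names and layout here are hypothetical, not Pilot's actual TASK.md schema):

```markdown
# TASK: Add rate limiting to the upload endpoint

## Scope (Builder may touch)
- src/middleware/
- tests/middleware/

## Red zones (human approval required)
- src/auth/
- .env*

## Done when
- `npm test` passes, with the real terminal output attached as evidence
- No files outside the scope list were modified
```

The point is that scope and rejection criteria exist in writing before any code does.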
The core insight: split the AI into two roles.

- Orchestrator (Claude/ChatGPT): high reasoning, low volume. Writes specs, reviews evidence, manages flow.
- Builder (Cursor/Claude Code): high volume, lower cost. Implements, provides proof.

The Orchestrator defines scope before the Builder touches anything. The Builder works within those boundaries. The Orchestrator reviews afterward. Two models, two verification passes. It's moving from "trust me" to "show me the terminal."
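In code, the two-role loop is roughly this shape (a minimal sketch; the class and function names are mine, not Pilot's real interfaces):

```python
from dataclasses import dataclass

@dataclass
class Task:
    goal: str
    scope: list[str]       # paths the Builder may touch
    done_when: list[str]   # checks whose captured output counts as evidence

def run_loop(orchestrator, builder):
    """One pass of the two-role workflow: spec, then build, then review."""
    task = orchestrator.write_spec()            # scope is fixed before any code
    evidence = builder.implement(task)          # must return real captured output
    return orchestrator.review(task, evidence)  # second model, second verification pass
```

The Builder never decides its own scope, and the Orchestrator never accepts a claim without evidence; each pass ends with an explicit accept/reject decision.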
Why I needed this: I wanted to program by intuition, not by syntax. I can design systems. I can spec features. I can verify that tests pass and URLs work. What I can't do is read 200 lines of generated TypeScript and know if it's correct. So the system had to prove correctness without requiring code review: evidence-based commits, scope contracts, clear rejection criteria. It's shared intuition for messy realities. It's not a sandbox (I know markdown isn't a firewall); it's defense in depth: separation of concerns, multi-model review, explicit rules, human gates.
Technical notes:

- The workflow is a state machine: idle → building → verifying → done.
- Evidence comes from MCP-captured terminal output.
- The Orchestrator validates Builder output against TASK.md constraints.
- Red zone violations trigger automatic escalation.
- The /pilot folder is just markdown; any MCP-enabled tool can read it. No vendor lock-in.
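The state machine above can be sketched in a few lines of Python (the transition table and names are illustrative, not Pilot's actual implementation):

```python
# Allowed transitions for the idle -> building -> verifying -> done workflow.
VALID_TRANSITIONS = {
    "idle":      {"building"},          # Orchestrator publishes TASK.md
    "building":  {"verifying"},         # Builder finishes, submits evidence
    "verifying": {"done", "building"},  # evidence accepted, or task sent back
    "done":      {"idle"},              # LKG commit updated, ready for next task
}

def advance(state: str, next_state: str) -> str:
    """Move to next_state, refusing transitions the workflow doesn't allow."""
    if next_state not in VALID_TRANSITIONS[state]:
        raise ValueError(f"illegal transition: {state} -> {next_state}")
    return next_state
```

Note that verifying can loop back to building: rejected evidence sends the task back rather than forcing a restart from idle.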
Limitations (being honest):

- Solo builder workflow; team use needs a merge strategy for the state files.
- Convention-based, not filesystem-enforced. If you need true isolation, run in a container.
- Context can still drift if you skip the workflow. Health checks help, but it's not foolproof.
- Token overhead exists: trading cost for correctness insurance.
What I've built with it: private projects mostly — a finance analyzer, productivity tools, Framer components, and Pilot itself. I iterated on the workflow every time I hit a wall, until the walls stopped appearing.
Now using it on bigger things I plan to release.
Felt too good not to share.
Happy to discuss the architecture, failure modes, or specific edge cases.