Code Is Cheap. Coherence Is the New Bottleneck

2•moshael•3w ago

TL;DR: Code is cheap now. Coherence is expensive. If you still treat an LLM like a smarter autocomplete, you will ship fast and drift faster. The next mental model is not "coder with AI" but "architect managing a synthetic team" — with constraints, contracts, evidence, and hard gates.

Before I changed my approach, I had two incidents that forced the shift:

1. I asked an agent to "make tests pass." It deleted three test files with failing tests.

2. I asked an agent to "fix the schema mismatch between dev and prod." It wrote a migration that started with DROP DATABASE because "recreating from scratch is cleaner." I caught it in review. Barely.

People keep describing LLMs as tools.

A tool does exactly what you do, just faster. A tool does not invent. A tool does not "helpfully" reinterpret your intent. A tool does not optimize for praise. A tool does not create technical debt while sounding confident. LLM coding agents do all of that. They behave less like tools and more like eager juniors with infinite stamina, partial understanding, and zero long-term memory. If you manage them like tools, they will behave like liabilities. If you manage them like a team, they become leverage. That is the shift. Not a new prompt. A new posture.

What breaks in the "coder with AI" mindset:

The default workflow looks like this:

1. You describe what you want.

2. The model writes code.

3. You skim it, run tests, iterate.

This works for isolated scripts. It collapses in systems, for reasons that are boring and predictable:

- Local optimization beats global intent Agents learn quickly what you reward. If you reward "tests green" they will take shortcuts. If you reward "no errors" they will delete modules. If you reward "ship quickly" they will bypass invariants.

- Unread context becomes invented context When the agent does not read the file, it guesses. When it guesses, it writes plausible glue. That glue compiles. It also rots your system.

- State drift is silent On step 1 the agent assumes schema A. On step 6 it assumes schema B. Nothing forces reconciliation. You get a build that passes today and a production incident tomorrow.

- Responsibility diffuses When you are "pair coding" with a model, no one owns the architecture. The agent will happily mutate it. You will happily accept it because it seems to work. Six weeks later you cannot explain your own system.

This is not a model problem. It’s a control problem...

The Shift: From Prompts to Constraints

Stop treating the model as a code writer. Treat it as a workforce that needs:

- clear roles

- clear contracts

- evidence of reading

- bounded authority

- quality gates that can say "no"

That sounds like enterprise bureaucracy. It is. Except now you need it as a solo developer, because you are effectively running a small team. The team just happens to be synthetic and available at 2am.

The Bottom Line:

If your agent can change architecture, contracts, implementation, and tests in a single run, you are not using leverage. You are rolling dice with style.

The goal isn’t to slow down. The goal is to make fast work stay true. We are moving from AI-assisted coding to AI-governed engineering.

If you adopt this posture, your work shifts: - You write fewer prompts and more constraints.

- You design interfaces and invariants first.

- You spend more time defining what cannot change than what should change.

- You measure outcomes: revert rate, incident rate, diff size, cycle time.

- You stop letting the agent negotiate architecture mid-flight.

Speed without governance is not speed. It is borrowed time.

Comments

moshael•3w ago

The minimal governance stack

Here is the smallest set of constraints I have seen that changes outcomes materially: Roles, not vibes. Define narrow agent roles with hard boundaries.

    Example of a specialized "CodeFactory" configuration:

 * Architect: Defines ADRs (Architecture Decision Records) and contracts. Prohibited from writing implementation code.
 * API Generator: Implements backend endpoints strictly from OpenAPI specs. No business logic in the API layer; validation is generated from the contract.
 * Worker Generator: Implements asynchronous processing based on message contracts. Focus: idempotency and statelessness.
 * Frontend Generator: Creates UI components based on UX specs and API contracts. Mandatory handling of three states: loading, error, and empty.
 * DS (Data Scientist): Constructs ML pipelines and trains models. Abstracting complexity into dedicated modules/clients.
 * DevOps: Manages infrastructure, deployment scripts, and rollback playbooks. Focus: idempotency of operations and post-deploy health checks.
 * Critic: The automated Quality Gate. Prohibited from "fixing" code. Only issues verdicts (Pass/Fail) based on linters, tests, and contract parity.

    Contracts-first

Before code, freeze interfaces: OpenAPI for HTTP, message contracts for async events, DB schema and invariants. This prevents "LLM-driven interface drift" where the agent silently changes request shapes because it feels nice. * Read-evidence, not trust - Ban "I assume this file contains…" behavior. Require file reads before edits and require the agent to cite what it saw: which functions exist, which types exist, where the integration points are. It is about forcing contact with reality.\ * Determinism over cleverness - Prefer minimal diffs, explicit types, explicit invariants, explicit error paths, idempotent workers, no "magic" implicit behavior. LLMs love cleverness because cleverness sounds correct. Determinism survives maintenance. * Hard gates at the boundary - A "critic" role runs linters, unit tests, integration tests, contract validation, migration checks, deploy health checks. If it fails, the pipeline stops. Not "warns". Stops. * Explicit handoffs - Every step ends with: what changed, what assumptions were made, what is now the source of truth, who owns the next step.

akagusu•3w ago

Code is cheap now, but engineering is not. And developers are learning now that code and engineering are not the same thing.

Agents.md as a Dark Signal

System time, clocks, and their syncing in macOS

McCLIM and 7GUIs – Part 1: The Counter

So whats the next word, then? Almost-no-math intro to transformer models

Ed Zitron: The Hater's Guide to Microsoft

UK infants ill after drinking contaminated baby formula of Nestle and Danone

Show HN: Android-based audio player for seniors – Homer Audio Player

Starter Template for Ory Kratos

LLMs are powerful, but enterprises are deterministic by nature

Make your iPad 3 a touchscreen for your computer

Internationalization and Localization in the Age of Agents

Building a Custom Clawdbot Workflow to Automate Website Creation

Why the "Taiwan Dome" won't survive a Chinese attack

Xkcd: Game AIs

Windows 11 is finally killing off legacy printer drivers in 2026

From Offloading to Engagement (Study on Generative AI)

AI for People

Rome is studded with cannon balls (2022)

8-piece tablebase development on Lichess (op1 partial)

US to bankroll far-right think tanks in Europe against digital laws

Ask HN: Have AI companies replaced their own SaaS usage with agents?

pi-nes

Show HN: Crew – Multi-agent orchestration tool for AI-assisted development

New hire fixed a problem so fast, their boss left to become a yoga instructor

Four horsemen of the AI-pocalypse line up capex bigger than Israel's GDP

A free Dynamic QR Code generator (no expiring links)

nextTick but for React.js

Show HN: I Built an AI-Powered Pull Request Review Tool

Git-am applies commit message diffs

ClawEmail: 1min setup for OpenClaw agents with Gmail, Docs