I’ve been using Cursor heavily along the way. The models are genuinely good and the local code they generate is often excellent, but on a large, evolving codebase I kept running into the same problem: context limits caused subtle architectural drift. The AI would write clean functions that were globally wrong, quietly breaking earlier design decisions and long‑range invariants.
What finally helped was to stop treating “AI” as a single assistant and instead treat different models as different team members with clear roles and constraints.
My current setup looks like this:
Perplexity + ChatGPT → “product / research brains”. I use them for requirements, trade‑offs, and high‑level architecture sketches. They live outside the IDE and exist to clarify what I’m building and why before any code is touched.
Cursor, window 1 (GPT‑5.2) → “architect”. This instance is not allowed to write production code. It is responsible for architecture and module boundaries, writing design notes and developer guides, defining interfaces and contracts, and reviewing diffs. I treat it like a senior engineer whose main output is prose: mini‑RFCs, comments, and checklists.
Cursor, window 2 (Sonnet 4.5) → “programmer”. This one only implements tasks described by the architect: specific files, functions, and refactors, following explicit written instructions and style rules. It doesn’t get to redesign the system; it just writes the code.
The key rule is: architect always goes first. Every non‑trivial change starts as text (design notes, constraints, examples), then the “programmer” instance turns that into code.
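For what it’s worth, I keep the roles pinned as short plain‑text briefs that I paste at the top of each window’s conversation (the wording below is illustrative, not any Cursor feature):

    ARCHITECT window (GPT-5.2)
    - Do not write production code. Output prose only: design notes, interfaces, checklists.
    - Every proposal must restate the relevant invariants and module boundaries.
    - Review diffs against the current design note and flag drift explicitly.

    PROGRAMMER window (Sonnet 4.5)
    - Implement only the tasks listed in the current design note, one at a time.
    - Touch only the files named in the task. No new modules, no renames, no redesigns.
    - If the spec doesn't fit reality, stop and describe the mismatch instead of improvising.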
This simple separation fixed a lot of the weirdness I was seeing with a single, all‑purpose assistant. There is much less logical drift, because the global structure is repeatedly restated in natural language. The programmer only ever sees local tasks framed inside that structure, so it’s harder for it to invent a new accidental architecture. The codebase, despite being tens of thousands of lines, feels more coherent than earlier, smaller iterations.
It also changed how I think about Cursor. Many of my earlier “Cursor is dumb” moments turned out to be workflow problems: I was asking one agent, under tight context limits, to remember architecture, requirements, and low‑level implementation all at once. Once I split those responsibilities across different models and forced everything through written instructions, the same tools started to look a lot more capable.
This isn’t a Cursor ad, and it’s not an anti‑Cursor rant either. It’s just one way to make these tools work on a large solo project by treating them like a small team instead of a single magical pair‑programmer.
One downside of this setup: at my current pace, Cursor is happily charging me something like $100 a day. If anyone from Cursor is reading this – is there a “solo dev building absurdly large systems” discount tier I’m missing?
A typical loop looks like this:
I talk with Perplexity/ChatGPT to clarify the requirement and trade‑offs.
The “architect” Cursor window writes a short design note: intent, invariants, interfaces, and a checklist of concrete tasks (a rough sketch of one follows this list).
The “programmer” Cursor window implements those tasks, one by one, and I run tests / small experiments.
If something feels off, I paste the diff and the behavior back to the architect and we adjust the design note, then iterate.
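To make that concrete, a design note is usually well under a page and looks roughly like this (feature and names invented purely for illustration):

    Design note: rate limiter for the ingest API   (illustrative example)

    Intent:      protect the ingest path without changing the public API surface.
    Invariants:  every throttling decision goes through limiter.Allow(); callers never sleep.
    Interfaces:  limiter.Allow(key) -> (ok, retryAfter)
    Tasks:
      1. Add a limiter module with a token-bucket implementation behind Allow().
      2. Call Allow() in the ingest handler before request parsing.
      3. Add tests for burst, steady-state, and clock-skew cases.
    Out of scope: per-tenant quotas (tracked separately).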
There is a feedback channel from “programmer” to “architect”, but it goes through me. When the programmer model runs into something that doesn’t fit (“this API doesn’t exist”, “these two modules define similar concepts in different ways”, etc.), I capture that as (small example after the list):
comments in the code (“this conflicts with X”), and
updates to the architecture doc (“rename Y to Z, merge these two concepts”, “deprecate this path”, etc.).
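For example (module names invented), a single correction usually lands in both places at once:

    In the code, next to the awkward spot:
      // NOTE(arch): ingest.Event and metrics.Sample describe the same payload
      // differently — see design note "unify event payloads" before extending either.

    In the architecture doc:
      - Rename metrics.Sample -> ingest.Event; metrics consumes events, it does not define them.
      - Deprecate the direct metrics ingestion path once the rename lands.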
So the architect is not infallible. It gets corrected by reality: tests failing, code being awkward to write, or new edge cases showing up. The main thing the process enforces is that those corrections are written down in prose first, so the written design stays the source of truth and future changes don’t silently drift away from it.
In that sense the “programmer” complains the same way human ones do: by making the spec look obviously wrong when you try to implement it.