The actual code for this is mostly what I experiment with to scale it up, but if you prompt your agent right you can literally use it as a simple prompt in your repo today. I personally use it in Antigravity as a workflow:
Løp Workflow: [truncated 10K chars, look in Biolab Repo]
So… How can I use it?
Step 1: Open your repo. I personally advise using a capable model for this step, since I have seen a lot of laziness from GLM 4.7 and I assume smaller models will behave similarly. To put it bluntly: send this workflow to Opus, tell it to define "Slices" for your codebase and write the /specs for each, and tell it to continue until it's finished.
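Purely illustrative: if you want the files in place before the agent starts filling them, here is a minimal TypeScript sketch for scaffolding empty spec stubs. The slice names, the `frontend/specs/` location (borrowed from the Step 5 prompt), and the stub template are all my assumptions, not part of the Løp workflow itself.

```ts
// scaffold-specs.ts: hypothetical helper; run with: npx tsx scaffold-specs.ts
// Creates one empty spec stub per slice so the agent has files to fill in.
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical slice names; your agent defines the real ones in Step 1.
const slices = ["projects", "auth", "inventory"];

const specDir = path.join("frontend", "specs"); // assumed location
fs.mkdirSync(specDir, { recursive: true });

for (const slice of slices) {
  const file = path.join(specDir, `${slice}.spec.md`);
  if (fs.existsSync(file)) continue; // never overwrite agent-written specs
  fs.writeFileSync(
    file,
    `# ${slice} spec\n\n## Scope\n\n## Components\n\n## Invariants\n\n## Open tasks\n`,
  );
  console.log(`created ${file}`);
}
```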
Step 2: Choose a spec and send this prompt, prefixed with @your-workflow-file (or however you attach your workflow context):
Let's do a digestible slice of improvements and code-to-spec alignment!
- Review the projects.spec
- Review all related components, comprehensively assessing the current implementation and code based on REAL code reads
- Based on the spec sheet and the code, compare the two and write an implementation plan to address uncovered gaps, functionality-wise or otherwise, and compile refinement/improvement/next tasks
- Review the newly added code, test compilation / no new errors, and update the spec to reflect the latest REAL code state; report to me how well it meets the specs and outline next steps

Keep up: spec <-> code? Review the spec, REVIEW all related code. Keep both in sync. Please fill all identifiable gaps and address tasks.
Step 3: You basically just loop until the specs and code align. At some point you will notice the agent telling you "the spec and code are aligned" instead of engineering the F* out of your code.
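Between iterations I would want an objective stop signal rather than trusting the "aligned" message alone. A minimal sketch of such a gate, assuming a TypeScript repo where `npx tsc --noEmit` runs from the root (the same check as the `npx tsc` result quoted below); this is my addition, not part of the workflow:

```ts
// check-convergence.ts: hypothetical loop gate; run with: npx tsx check-convergence.ts
// Exits 0 only when tsc reports no errors, so you can decide whether
// the next prompt iteration is still worth sending.
import { spawnSync } from "node:child_process";

const result = spawnSync("npx", ["tsc", "--noEmit"], { encoding: "utf8" });
const output = (result.stdout ?? "") + (result.stderr ?? "");

// tsc error lines look like: path/file.ts(12,5): error TS2322: ...
const errors = output.split("\n").filter((l) => l.includes("error TS")).length;

if (errors === 0) {
  console.log("tsc clean; candidate for 'spec and code are aligned'");
} else {
  console.log(`tsc still reports ${errors} error(s); loop again`);
  process.exit(1);
}
```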
Step 4: You now have functional slices of your codebase, and you can take the entirety of your specs (it's not that much) -> send it to an SOTA LLM -> "What gaps are in my spec?"
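For the "send the entirety of your specs" step, a small bundler saves copy-pasting file by file. A sketch under the assumption that your specs live in `frontend/specs/` as Markdown files; the output filename is arbitrary:

```ts
// bundle-specs.ts: hypothetical helper; run with: npx tsx bundle-specs.ts
// Concatenates every spec file into one document you can paste into an LLM.
import * as fs from "node:fs";
import * as path from "node:path";

const specDir = path.join("frontend", "specs"); // assumed location
const files = fs
  .readdirSync(specDir)
  .filter((f) => f.endsWith(".md"))
  .sort();

const bundle = files
  .map((f) => `<!-- ${f} -->\n${fs.readFileSync(path.join(specDir, f), "utf8")}`)
  .join("\n\n---\n\n");

fs.writeFileSync("all-specs.md", bundle);
console.log(`bundled ${files.length} specs into all-specs.md`);
```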
Step 5: Take the gap and fill it. I use this prompt:
[Put the task here]
- Review the relevant spec in /frontend/specs/
- Review all related components, comprehensively assessing the current implementation and code based on REAL code reads
- Based on the spec sheet and the code, assess the validity of the task and formulate an implementation plan
- Review the newly added code, test compilation / no new errors, and update the spec to reflect the latest REAL code state; report to me how well it meets the specs and outline next steps

Keep up: spec <-> code? Review the spec, REVIEW all related code. Keep both in sync. Please fill all identifiable gaps and address tasks.
The Aftermath
You probably know the drill: "let me implement this one thing / refactor this / add these features", and then you grind through 500 type issues until you get a somewhat working codebase again. This is what I get instead:
The Numbers
| Metric | Value |
|--------|-------|
| Parallel agents | 3 |
| Files changed | 149 |
| Lines added | +3,014 |
| Lines removed | -2,881 |
| Domain specs in repo | 47 |
| Conflicts | 0 |
| Agent communication | 0 |
| Orchestration code | 0 lines |
Changes by Directory
| Directory | Added | Removed |
|-----------|-------|---------|
| frontend/server | +1,301 | -640 |
| frontend/app | +1,269 | -1,687 |
| frontend/specs | +416 | -471 |
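If you want to reproduce a table like this yourself, `git diff --numstat` gives per-file added/removed line counts that you can sum per directory. A sketch, assuming you diff against whatever base ref you branched from and group by the first two path segments:

```ts
// dir-stats.ts: hypothetical helper; run with: npx tsx dir-stats.ts [baseRef]
// Sums `git diff --numstat` output into a per-directory changes table.
import { spawnSync } from "node:child_process";

const base = process.argv[2] ?? "HEAD~1"; // assumed base ref
const out = spawnSync("git", ["diff", "--numstat", base], { encoding: "utf8" });

const totals = new Map<string, { add: number; del: number }>();
for (const line of (out.stdout ?? "").trim().split("\n")) {
  const [add, del, file] = line.split("\t");
  if (!file) continue;
  const dir = file.split("/").slice(0, 2).join("/"); // e.g. frontend/server
  const t = totals.get(dir) ?? { add: 0, del: 0 };
  t.add += Number(add) || 0; // binary files show "-" and count as 0
  t.del += Number(del) || 0;
  totals.set(dir, t);
}

for (const [dir, t] of totals) console.log(`${dir}: +${t.add} -${t.del}`);
```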
`npx tsc`: Found 17 errors in 6 files
The repo (WIP) I am using this on (I only started applying this pattern ~2 days ago):
https://github.com/Mvgnu/BioLabs
Does it scale?
So far I have yet to find the limit. If your code does not work, you likely just need more loops against the spec. Ironically, this also works in Claude Assistant Chat, which produced the Løp repo code.