Here's my own screenshot of it rendering my blog - https://bsky.app/profile/simonwillison.net/post/3mdg2oo6bms2... - it handles the layout and CSS gradiants really well, renders the SVG feed icon but fails to render a PNG image.
I thought "build a browser that renders HTML+CSS" was the perfect task for demonstrating a massively parallel agent setup because it couldn't be productively achieved in a few thousand lines of code by a single coding agent. Turns out I was wrong!
I have one project Claude is working on right now where I'm testing a setup to attempt to take myself more out of the loop, because that is the hard part. It's "easy" to get an agent to multiply your output. It's hard to make that scale with your willingness to spend on tokens rather than with your ability to read and review and direct.
I've ended up with roughly this (it's nothing particularly special):
- Runs a evaluator that evaluates the current state and assigns scores across multiple metrics.
- If a given score is above a given threshold, expand the test suite automatically.
- If the score is below a given threshold, spawn a "research agent" that investgates why the scores don't meet expectations.
- The research agent delivers a report, that is passed to an implementation agent.
- The main agent re-runs the scoring, and if it doesn't show an improvement on one or more of the metrics, the commit is discarded, and notes made of what was tried, and why it failed.
It takes a bit of trial and error to get it right (e.g. "it's the test suite that is wrong" came up early, and the main agent was almost talked into revising the test suite to remove the "problematic" tests) but a division sort of like this lets Claude do more sensible stuff for me. Throwing away commits feels drastic - an option is to let it run a little cycle of commit -> evaluate -> redo a few times before the final judgement, maybe - but it so far it feels like it'll scale better. Less crap makes it into the project.
And I think this will work better than to treat these agents as if they are developers whose output costs 100x as much.
Code so cheap it is disposable should change the workflows.
So while I agree this is a better demonstration of a good way to build a browser, it's a less interesting demonstration as well. Now that we've seen people show that something like FastRender is possible, expect people to experiment with similarly ambitious projects but with more thought put into scoring/evaluation, including on code size and dependencies.
Just the day(s) before, I was thinking about this too, and I think what will make the biggest difference is humans who posses "Good Taste". I wrote a bunch about it here: https://emsh.cat/good-taste/
I think the ending is most apt, and where I think we're going wrong right now:
> I feel like we're building the wrong things. The whole vibe right now is "replace the human part" instead of "make better tools for the human part". I don't want a machine that replaces my taste, I want tools that help me use my taste better; see the cut faster, compare directions, compare architectural choices, find where I've missed things, catch when we're going into generics, and help me make sharper intentional choices.
But for other projects, being able to scale with little or no human involvement suddenly turns some things that were borderline profitable or not possible to make profitable at all with current salaries vs. token costs into viable businesses.
Where it works, it's a paradigm shift - for both good and bad.
So it depends what you're trying to solve for. I have projects in both categories.
That's why taking a step back and seeing what's actually hard in the process and bad with the output, felt like it made more sense to chase after, rather than anything else.
FWIW I ran your binary and was pleasantly surprised, but my low expectations probably helped ;)
embedding-shape•4h ago
After three days, I have it working with around 20K LOC, whereas ~14K is the browser engine itself + X11, then 6K is just Windows+macOS support.
Source code + CI built binaries are available here if you wanna try it out: https://github.com/embedding-shapes/one-agent-one-browser
jacquesm•1h ago