I wanted to see how far I could go black-boxing the app with AI. I expected a weekend of work, but getting it right took:
- Three weekends
- ~$150 in Cursor spend
- $50 for asset creation (Layer.ai)
Core learnings:
- No single model or provider is sufficient at this point. I used Opus + GPT 5.4 for planning, Cursor Composer for coding, Sonnet for review and more difficult stuff, and Gemini 3.1 for UI and UX.
- I spent probably twice as much in time and tokens as I needed to - setting up a Figma-Claude MCP feedback loop would have saved me a lot of work on UI tweaks and revisions. Knowing what I wanted when I started would (obviously) have helped too.
- Layer.ai was worth it for asset generation. I'll bet I could have figured out prompts given enough time, but I was able to just jump in and make the visuals and sprites I wanted for cheap.
- It's worth building out skills and commands, even for throwaway projects.
- Cursor Composer is reason enough to stay on Cursor instead of just moving over to Claude Code/Desktop/etc. It gives really straightforward, clever solutions to some problems, and it doesn't editorialize my UI text.
- As a counterpoint to Composer, I can't tell you how often I'd have Sonnet do something, and then find an extra paragraph or two of UI text on a screen where Sonnet thought maybe we needed to tell the user exactly how scoring works.
- I need to learn some more sea shanties.
Final notes:
- I'm an SDET, and on one hand black-boxing with AI honestly just feels like another day for me - work with the PM for requirements, send to the dev team, test, repeat. On the other hand, having an agent build a 10k-scenario simulation engine for tuning the game parameters in 10 minutes was mind-blowing.
- AI changes the calculation of what's worth my time. Six months ago I would have laughed at the idea of HT, then gone on with my day.
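
For anyone wondering what a simulation engine for parameter tuning looks like in miniature, here is a rough sketch. It is not the engine the agent built. The game rules, `hit_chance`, `rounds`, `needed`, and the 50% target win rate are all made up for illustration. The idea is just: for each candidate parameter value, play 10,000 random scenarios and keep the value whose win rate lands closest to the target.

```python
import random


def simulate_run(hit_chance, rounds, rng):
    """Play one hypothetical run: each round succeeds with probability hit_chance."""
    return sum(1 for _ in range(rounds) if rng.random() < hit_chance)


def win_rate(hit_chance, rounds=10, needed=6, scenarios=10_000, seed=0):
    """Estimate how often a player wins (>= needed successes) over many scenarios."""
    # A fixed seed keeps results repeatable and gives every candidate the same
    # random draws, so differences come from the parameter rather than the noise.
    rng = random.Random(seed)
    wins = sum(
        simulate_run(hit_chance, rounds, rng) >= needed for _ in range(scenarios)
    )
    return wins / scenarios


def tune(candidates, target=0.5):
    """Return the candidate whose simulated win rate is closest to the target."""
    results = {c: win_rate(c) for c in candidates}
    best = min(results, key=lambda c: abs(results[c] - target))
    return best, results


# Hypothetical sweep over a made-up difficulty parameter.
best, results = tune([0.3, 0.4, 0.5, 0.6, 0.7])
for chance, rate in results.items():
    print(f"hit_chance={chance:.1f} -> win rate {rate:.3f}")
print("closest to target:", best)
```

A real engine would sweep several parameters at once and track more than win rate, but the loop has the same shape: simulate, measure, compare against the feel you want.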