Or alternatively, just be a skill versus a tool.
My “agents” already demo stuff all the time by just being prompted to do so. I have notations in my standard Agents.md for how I want my documentation, testing etc.
Run uvx showboat --help and
uvx rodney --help and use those
tools to demo the feature you built
The help text effectively doubles as a skill.
I don't want to duplicate my skills into all those repos (and keep them updated) so I prefer the "uvx tool --help" pattern.
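As a rough illustration of the "--help" pattern above, here is a minimal sketch that captures a tool's help text so it can be handed to an agent as instructions. The `help_text` helper is hypothetical, and `python -m json.tool` is only a stand-in command so the example runs without uvx installed; in practice the command would be something like `["uvx", "showboat"]`.

```python
import subprocess
import sys


def help_text(command: list[str]) -> str:
    """Run `command` with --help appended and return whatever it prints."""
    result = subprocess.run(
        command + ["--help"],
        capture_output=True,
        text=True,
        check=False,  # some tools exit non-zero after printing help
    )
    # Most tools print help to stdout; fall back to stderr for those that don't.
    return result.stdout or result.stderr


if __name__ == "__main__":
    # Stand-in command (an assumption for portability); swap in
    # ["uvx", "showboat"] or ["uvx", "rodney"] if uvx is available.
    print(help_text([sys.executable, "-m", "json.tool"]))
```

The point of the design is that the help text lives with the tool, so there is a single copy to keep up to date instead of one skill file per repo.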
I saw an MCP I've set up on claude.ai show up in my local Claude Code MCP list the other day. It seems inevitable that there will be skills integration across environments as well at some point.
https://github.com/microsoft/playwright-cli
Different from the CLI used for running tests etc. that comes bundled with Playwright.
Sample use:
playwright-cli open https://demo.playwright.dev/todomvc/ --headed
playwright-cli type "Buy groceries"
playwright-cli press Enter
playwright-cli type "Water flowers"
playwright-cli press Enter
playwright-cli check e21
playwright-cli check e35
playwright-cli screenshot
I didn't know about sh-session, is that documented anywhere?
Anyway LLMs don’t have underlying intent so maybe it is fine to just let them express what they can in Markdown?
It's basically an automated test, but at a higher abstraction level and with manual verification--using CLI tools rather than a test harness. Really great work!
- E2E testing of browser components
- Taking screenshots before and after and having Claude look at them to double check things
- Driving it with an API and CLI as a headless browser
Will definitely give Rodney a look.
Showboat documents look neater if there are single one-line commands that do something useful. Dumping a full Playwright script into a cell is less readable.
Showboat also has a special feature where you can embed an image directly in the document by running:
showboat image doc.md 'rodney screenshot'
The command you call should return a path to an image file as the last line of output. Rodney does exactly that.
It may well turn out that Rodney is unnecessary and people find better patterns using Showboat with existing tools like playwright-cli - in which case it won't matter because Showboat and Rodney aren't coupled to each other at all.
Showboat is definitely the more significant of the two projects.
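The `showboat image` contract described above (the wrapped command prints an image path as its last line of output) can be sketched with a stand-in command. Everything here is hypothetical: `write_placeholder_png`, `main` and the file name are illustrative, and the script writes a 1x1 placeholder PNG rather than taking a real screenshot.

```python
import struct
import tempfile
import zlib
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    body = kind + data
    return struct.pack(">I", len(data)) + body + struct.pack(">I", zlib.crc32(body))


def write_placeholder_png(path: Path) -> Path:
    """Write a 1x1 white 8-bit grayscale PNG (stands in for a screenshot)."""
    # Width 1, height 1, bit depth 8, color type 0 (grayscale), no interlace.
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    scanline = b"\x00\xff"  # filter type 0, then one white pixel
    png = (
        PNG_SIGNATURE
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(scanline))
        + _chunk(b"IEND", b"")
    )
    path.write_bytes(png)
    return path


def main() -> str:
    out = write_placeholder_png(Path(tempfile.mkdtemp()) / "placeholder.png")
    print("captured placeholder image")  # earlier output lines are allowed
    print(out)  # the image path must be the last line of output
    return str(out)


if __name__ == "__main__":
    main()
```

Any existing tool that follows the same convention could be used in place of 'rodney screenshot', assuming Showboat only relies on that last line as described above.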
eliben•1h ago
It's also interesting that you've shifted to Go for your agent-coded CLI tools, Simon.
simonw•1h ago
... but then I'm mostly running them with "uvx name-of-tool" because it turns out Python's packaging infrastructure for binary tools is so good!
markusw•59m ago
eliben•59m ago
But I can definitely see how someone with `uv` muscle memory wants everything in the same command.
`uv` is the best thing that happened to the Python ecosystem since... I don't know... maybe NumPy.