Show HN: Showboat and Rodney, so agents can demo what they've built

https://simonwillison.net/2026/Feb/10/showboat-and-rodney/

53•simonw•2h ago

Comments

eliben•1h ago

Very interesting! I encountered the problems these tools are trying to tackle just recently while trying to guide an agent into creating an in-browser tool for me. Closing the loop on a web interface isn't as simple as CLI-only tools. I should give this a try.

It's also interesting that you've shifted to Go for your agent-coded CLI tools, Simon.

simonw•1h ago

I'm dabbling with Go at the moment for small tools, mainly as an excuse to learn a new language but also because having a single standalone binary is convenient for shuttling these tiny little tools around.

... but then I'm mostly running them with "uvx name-of-tool" because it turns out Python's packaging infrastructure for binary tools is so good!

markusw•59m ago

If you're coming from the Python world, definitely. I find `go install github.com/simonw/rodney@latest` equally easy. :D Although you need the Go tooling installed, of course. But very much agreed, Go is great for CLIs!

eliben•59m ago

Right, standalone binaries for CLI tools are great. And if one has Go installed, they can just `go run ...` any tool from its GitHub path; all installation/build/caching happens automagically (meaning execution is immediate after the first run).

But I can definitely see how someone with `uv` muscle memory wants everything in the same command.

`uv` is the best thing that happened to the Python ecosystem since... I don't know... maybe Numpy.

saberience•1h ago

Sounds like both of these tools could be one-shotted by either Claude or Codex.

Or alternatively, just be a skill versus a tool.

My “agents” already demo stuff all the time by just being prompted to do so. I have notations in my standard Agents.md for how I want my documentation, testing etc.

simonw•1h ago

They kind of were one-shotted by Claude. The value is in coming up with a consistent design and good enough --help that you can prompt:

  Run uvx showboat --help and
  uvx rodney --help and use those
  tools to demo the feature you built

The help text effectively doubles as a skill.
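As a rough illustration of that pattern, here is a minimal sketch that assembles the same kind of prompt for any list of tools. The function name `build_demo_prompt` is hypothetical and not part of Showboat or Rodney; the only assumption is that each tool can be run as `uvx <name> --help`, as in the prompt above.

```python
def build_demo_prompt(tools):
    """Return a prompt telling an agent to read each tool's --help first.

    `tools` is a list of tool names assumed to be runnable via uvx.
    """
    help_commands = " and ".join(f"uvx {name} --help" for name in tools)
    return f"Run {help_commands} and use those tools to demo the feature you built"


print(build_demo_prompt(["showboat", "rodney"]))
```

Because the instructions live in each tool's own help output, nothing has to be copied into individual repos.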

markusw•56m ago

I guess it would still make sense to have "demo" and "browser-use" skills, so that the agent can reach for them proactively? I always try to remove as much friction as possible for myself, one little bit at a time.

simonw•44m ago

My problem is that I work in dozens of different repos generally using Claude Code for web, which doesn't have a way to install extra global skills yet.

I don't want to duplicate my skills into all those repos (and keep them updated) so I prefer the "uvx tool --help" pattern.

markusw•18m ago

That's actually one of the things that has kept me from using Claude Code web (that, and I often need a Chrome browser for the agent). But they must be working on it.

I saw an MCP I've set up on claude.ai show up in my local Claude Code MCP list the other day, so it seems inevitable that there will be skills integration across environments as well at some point.

tardismechanic•1h ago

See also (the confusingly named) playwright-cli

https://github.com/microsoft/playwright-cli

Different from the CLI used for running tests etc. that comes bundled with Playwright

Sample use:

  playwright-cli open https://demo.playwright.dev/todomvc/ --headed
  playwright-cli type "Buy groceries"
  playwright-cli press Enter
  playwright-cli type "Water flowers"
  playwright-cli press Enter
  playwright-cli check e21
  playwright-cli check e35
  playwright-cli screenshot

simonw•1h ago

Yeah that's an excellent option for this kind of thing too.

markusw•58m ago

Oh, I hadn't seen that one either, thanks for sharing. Here I am still using the Chrome Devtools MCP like a caveman. :D

toastal•1h ago

If agents can generate text so easily, why would they be limited to Markdown instead of reStructuredText, AsciiDoc, or LaTeX, which have rich features that help users understand text? I can understand developers refusing to adopt proper formats for documentation, but this seems odd for the bots. It doesn’t even generate the correct syntax block in Markdown, using “bash” instead of “sh-session”.

giancarlostoro•53m ago

I think it's primarily because that is the most common formatting in every editor now? I could be wrong. Markdown has been the standard for README files for over a decade now.

toastal•46m ago

Winning a popularity contest doesn’t mean it’s good. That is the worst part about these things, as they just generate the lowest-common-denominator type code/tooling while also repeating anti-patterns/mistakes like the bash vs. sh-session/console issue I pointed out. Garbage in has been so much garbage out, unfortunately.

giancarlostoro•43m ago

Never said it was good, just making an observation that Markdown is most likely to be available to render OOTB in more editors. I don't think Markdown is bad necessarily either. It's "good enough" for simple documents.

simonw•47m ago

Markdown has the widest tool compatibility - GitHub renders it, so does VS Code and many other editors and file hosts.

I didn't know about sh-session, is that documented anywhere?

bee_rider•18m ago

I dunno. I’ve written a bit of LaTeX but does it really shine in this context? IMO the real advantage it has is that it can allow the user to express more complicated intents than Markdown (weird phrasing—my natural instinct was to call LaTeX more precise than Markdown, but Markdown is pretty precise for describing the type of file that it is good at…).

Anyway LLMs don’t have underlying intent so maybe it is fine to just let them express what they can in Markdown?

giancarlostoro•54m ago

I'll be sure to try these out. I've been building my own alternative to Beads with a concept called "gates" which do not let you close tasks as complete until a gate passes. Would love to throw these in as "gates" for my current workflow.

Hansenq•50m ago

I was a bit confused as to how everything works until I read it in detail. Really cool tools, but I think one thing that would help in the introduction is: saying explicitly that the generated .md document is for you (the user) to read through, observe the output of the CLI call, and ensure that the output matches what you would expect.

It's basically an automated test, but at a higher abstraction level and with manual verification--using CLI tools rather than a test harness. Really great work!

nzoschke•49m ago

go-rod has been instrumental to my agentic coding loops too. Some uses:

- E2E testing of browser components

- Taking screenshots before and after and having Claude look at them to double check things

- Driving it with an API and CLI as a headless browser

Will definitely give Rodney a look.

measurablefunc•46m ago

Google's Antigravity does this automatically by creating Task & Walkthrough artifacts.

johnfn•21m ago

Out of curiosity, what is the advantage of using Rodney when Playwright has the same set of features and AI understands how to write a Playwright script very well?

simonw•18m ago

Maybe not a lot.

Showboat documents look neater if there are single one-line commands that do something useful. Dumping a full Playwright script into a cell is less readable.

Showboat also has a special feature where you can embed an image directly in the document by running:

  showboat image doc.md 'rodney screenshot'

The command you call should return a path to an image file as the last line of output. Rodney does exactly that.
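To make that contract concrete, here is a minimal sketch of how a caller could pick the image path out of a command's output. This is not Showboat's actual implementation; the function name `image_path_from_command` is hypothetical, and the `echo` commands stand in for a real screenshot tool (a POSIX shell is assumed).

```python
import subprocess


def image_path_from_command(command):
    """Run a shell command and treat its last non-empty output line as an image path."""
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    )
    # Ignore blank lines so a trailing newline does not hide the path.
    lines = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    if not lines:
        raise ValueError("command produced no output, so there is no path to use")
    return lines[-1]


# Stand-in for a screenshot command: a progress message, then the path.
print(image_path_from_command("echo saving screenshot; echo /tmp/shot.png"))
```

Any tool that prints extra progress messages still works with this convention, as long as the path comes last.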

It may well turn out that Rodney is unnecessary and people find better patterns using Showboat with existing tools like playwright-cli - in which case it won't matter because Showboat and Rodney aren't coupled to each other at all.

Showboat is definitely the more significant of the two projects.
