It fucking doesn't.
You run it in a real app and immediately hit the same bullshit wall every time:

- Hallucinated logic only reveals itself under real data or edge cases
- UI updates magically forget to sync across devices (mobile → web = sad trombone)
- API calls quietly return 401s or other crap that gets swallowed in some lazy try-catch
- Vision-based agents crawl like molasses (2–10s per action) and torch tokens like it's free
- Background pings and unrelated fetches make it impossible to tell what actually caused what
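To make the swallowed-401 point concrete, here's a minimal sketch of the pattern and one way around it. Every name in it (`ApiResult`, `CapturedFailure`, `profileNameSwallowed`, `profileNameObserved`) is made up for illustration and doesn't come from any real library.

```typescript
// Hypothetical shapes, for illustration only.
interface ApiResult { status: number; body: string }
interface CapturedFailure { source: string; status: number }

// The lazy version: the 401 is thrown, caught, and replaced with a default,
// so nothing upstream ever learns the request failed.
function profileNameSwallowed(res: ApiResult): string {
  try {
    if (res.status !== 200) throw new Error(`HTTP ${res.status}`);
    return JSON.parse(res.body).name;
  } catch {
    return "Guest";
  }
}

// Same fallback for the user, but the failure is recorded somewhere
// a tool or agent can actually read it.
function profileNameObserved(res: ApiResult, failures: CapturedFailure[]): string {
  if (res.status !== 200) {
    failures.push({ source: "profile", status: res.status });
    return "Guest";
  }
  return JSON.parse(res.body).name;
}

const expired: ApiResult = { status: 401, body: "" };
const failures: CapturedFailure[] = [];

console.log(profileNameSwallowed(expired));          // "Guest", with no trace of why
console.log(profileNameObserved(expired, failures)); // "Guest"
console.log(failures);                               // one entry with status 401
```

Both versions show the same thing on screen, which is exactly why the first one is so hard to catch by looking at the UI alone.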
I tried pretty much everything out there and none of it quite scratched the itch I had: fast, structured, cross-platform runtime visibility without vision bloat or having to wire up a ton of hooks.
Quick rundown of the usual suspects:
- Pure vision/computer-use (Claude 3.5/4, ADEPT-style): zero setup, works on anything, but latency from hell and token burn is brutal for anything longer than a demo
- Playwright / browser MCP servers: fast and structured for web, but web-only, selectors shatter like glass, no native mobile
- Appium + vision hybrids: cross-platform on paper, but still vision-dependent and setup is a pain
- Sandboxed agents (OpenHands, SWE-agent): decent for repo tasks and shell stuff, not so much for live app UI/network state
- Explicit hooks/bridges: precise when you bother adding them, but requires code changes, which sucks
Couldn't find anything that gave me low-latency structured JSON state (UI elements, network, errors, logs) across platforms, local-first, without the usual trade-offs. So yeah, I got fed up and built a small local MCP server to solve it for myself.
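For a rough idea of what I mean by "structured JSON state", here's a sketch. The shape below is hypothetical and is NOT Autonomo's actual schema. `AppSnapshot`, `findProblems`, and every field name are assumptions picked just to show the idea: an agent reads one small typed object instead of squinting at a screenshot.

```typescript
// Hypothetical snapshot shape; not the real schema of any tool.
interface UiElement { id: string; role: string; text: string; visible: boolean }
interface NetworkCall { method: string; url: string; status: number; durationMs: number }

interface AppSnapshot {
  platform: "web" | "ios" | "android";
  screen: string;
  elements: UiElement[];
  network: NetworkCall[];
  errors: string[];
  logs: string[];
}

// Collects what an agent would want to flag: failed requests and runtime errors.
function findProblems(snap: AppSnapshot): string[] {
  const failed = snap.network
    .filter((call) => call.status >= 400)
    .map((call) => `${call.method} ${call.url} -> ${call.status}`);
  return [...failed, ...snap.errors];
}

// Made-up example data.
const snapshot: AppSnapshot = {
  platform: "ios",
  screen: "Profile",
  elements: [{ id: "name-label", role: "text", text: "Guest", visible: true }],
  network: [
    { method: "GET", url: "/api/profile", status: 401, durationMs: 84 },
    { method: "GET", url: "/api/ping", status: 200, durationMs: 12 },
  ],
  errors: [],
  logs: ["profile: falling back to guest"],
};

console.log(findProblems(snapshot)); // [ 'GET /api/profile -> 401' ]
```

The point is that a check like this is a plain array filter that runs in well under a millisecond, versus seconds per action for a vision round trip.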
Full disclosure: it's called Autonomo MCP https://github.com/sebringj/autonomo — very early, just launched.
I don't usually do this "I built a thing" thing — my open-source contributions are mostly small fixes and PRs — but I honestly couldn't see a better way in the current landscape.
I'm hoping Anthropic (or someone) will eventually ship a clean native solution for this. They already shipped BM25-based tool search to shrink tool-definition context like crazy; I'd love to see them (or the industry) make runtime validation "just work" out of the box too.
Sometimes when you code in a vacuum you think your shit smells good. lmk if I'm off base here, I grew up with a mean grandpa so I'm cool with it.