Most computer-use agent (CUA) frameworks are either a black box or raw tool calls with no structure. Orbit sits in between: natural language controls the screen, and Python controls the flow. Each step has its own model, budget, and typed output, but all steps share context across the session. You can mix cheap and expensive models per step, extract structured data from any screen into Pydantic models, and steer the agent mid-task when it struggles. It is built on the OS accessibility tree, not screenshots.
https://github.com/aadya940/orbit
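
To make the "natural language controls the screen, Python controls the flow" idea concrete, here is a rough, dependency-free sketch of the pattern. It is not Orbit's actual API: `Step`, `Session`, `StepFailed`, `fake_backend`, `extract_invoice`, and the model names are all hypothetical stand-ins, and a plain dataclass is used where Orbit would use a Pydantic model. See the repo for the real interface.

```python
from dataclasses import dataclass, field


class StepFailed(Exception):
    """Hypothetical error: a step ran out of budget before producing output."""


@dataclass
class Invoice:
    # Typed output for the extraction step (a Pydantic model in Orbit itself).
    vendor: str
    total: float


@dataclass
class Session:
    # Context shared by every step in the run.
    history: list = field(default_factory=list)


@dataclass
class Step:
    instruction: str   # natural language: what to do on screen
    model: str         # which model drives this step
    max_actions: int   # per-step budget
    output_type: type  # shape the result must fit

    def run(self, session, backend):
        session.history.append(f"{self.model}: {self.instruction}")
        raw = backend(self, session)
        # Build the typed result; wrong or missing fields raise TypeError.
        return self.output_type(**raw)


def fake_backend(step, session):
    # Stand-in for a real agent: the small model gives up, the large one succeeds.
    if step.model == "small-model":
        raise StepFailed(f"gave up after {step.max_actions} actions")
    return {"vendor": "Acme", "total": 42.5}


def extract_invoice(session, backend):
    instruction = "Open the latest invoice and read the vendor and total"
    try:
        # Try the cheap model first, with a tight budget.
        return Step(instruction, "small-model", 5, Invoice).run(session, backend)
    except StepFailed as err:
        # Ordinary Python decides what happens next: note the failure in the
        # shared context, then retry with a stronger model and a larger budget.
        session.history.append(f"steer: {err}")
        return Step(instruction, "large-model", 20, Invoice).run(session, backend)


session = Session()
invoice = extract_invoice(session, fake_backend)
print(invoice)
print(session.history)
```

The point of the sketch is the division of labor: each step carries its own model, budget, and output type, while retries, escalation, and branching stay in plain Python and the session history carries context from one step to the next.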