Show HN: Cua Driver – background multi-cursor via macOS SkyLight.framework

https://github.com/trycua/cua
2•frabonacci•1h ago

Comments

frabonacci•1h ago
Hi HN, Francesco from Cua here. I hacked this together over a weekend after getting curious about whether macOS could support real background computer-use outside a single vendor's agent product.

The first thing we are using it for is recording product demos. We used to use Screen Studio; now we ask Claude Code + cua-driver to drive the app while cua-driver recording start captures the trajectory, screenshots, actions, and click markers. We canceled our Screen Studio subscription, which started as a joke and then became true.

The problem: most GUI agents still assume the desktop has one shared cursor, one focused app, and one human who is okay being interrupted. That makes local desktop agents awkward. The agent can do the task, but it steals your screen while doing it.

cua-driver is our attempt to make background computer-use a commodity primitive for macOS: let an agent drive a real Mac app while your cursor, focus, and Space stay where they are. The default interface is a CLI, so it is easy to script, easy for coding agents to call from a shell, and still compatible with MCP clients when you want that.

You can try it on macOS 14+:

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/trycua/cua/main/libs/cua-d...)"

CLI example:

cua-driver serve &

cua-driver recording start ~/cua-trajectories/demo1

cua-driver launch_app '{"bundle_id":"com.apple.calculator"}'

cua-driver list_windows '{"pid":12345}'

cua-driver get_window_state '{"pid":12345,"window_id":67890}'

cua-driver click '{"pid":12345,"window_id":67890,"element_index":14}'

cua-driver recording stop
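
For scripting, the tool calls above can be wrapped in a small helper. This is a minimal Python sketch that assumes only what the example shows: each tool call has the form `cua-driver <tool> '<json>'`. The helper names `build_command` and `run_tool` are hypothetical, the sketch does not assume anything about what the CLI prints, and it does not cover `recording start` / `recording stop`, which take plain arguments rather than JSON.

```python
import json
import subprocess


def build_command(tool, args=None):
    """Build an argv list of the form: cua-driver <tool> '<json-args>'."""
    argv = ["cua-driver", tool]
    if args is not None:
        # Compact separators mirror the JSON style used in the CLI example.
        argv.append(json.dumps(args, separators=(",", ":")))
    return argv


def run_tool(tool, args=None):
    """Run one cua-driver call and return its raw stdout (format not assumed)."""
    result = subprocess.run(
        build_command(tool, args),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

Passing an argv list to `subprocess.run` avoids shell quoting problems with the JSON payload. The `pid` and `window_id` values in the example are placeholders; in a real script they would come from whatever `list_windows` and `get_window_state` return.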

The recording command writes turn-NNNNN/ folders with the post-action app state, screenshot, action JSON, and a click.png marker overlay for click-family actions. You can replay a saved run with cua-driver replay_trajectory '{"dir":"~/cua-trajectories/demo1"}', which is useful for regression captures even when you are not trying to make a polished marketing video.
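
To inspect a recording from a script, something like the following could walk the output directory. It is a sketch under stated assumptions: `NNNNN` is read as exactly five digits, only `click.png` is checked because it is the only file name the description gives, and `list_turns` is a hypothetical helper, not part of cua-driver.

```python
import re
from pathlib import Path

# Assumption: "turn-NNNNN" means a five-digit, zero-padded turn number.
TURN_DIR = re.compile(r"^turn-(\d{5})$")


def list_turns(trajectory_dir):
    """Return (turn_number, path, has_click_marker) tuples in turn order."""
    root = Path(trajectory_dir).expanduser()
    turns = []
    for entry in root.iterdir():
        match = TURN_DIR.match(entry.name)
        if entry.is_dir() and match:
            # click.png is only written for click-family actions.
            has_marker = (entry / "click.png").is_file()
            turns.append((int(match.group(1)), entry, has_marker))
    return sorted(turns, key=lambda turn: turn[0])
```

Calling `list_turns("~/cua-trajectories/demo1")` would then give the turns in order, with a flag showing which ones were click-family actions.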

What made this harder than expected:

- CGEventPost warps the cursor (it goes through the HID stream, same one your physical mouse uses)

- CGEvent.postToPid doesn't warp the cursor but Chromium silently drops the event at the renderer IPC boundary

- Activating the target first raises the window AND drags you across Spaces on multi-monitor setups

- Electron apps stop keeping useful AX trees alive when their windows are occluded, unless you register the observer through a private remote-aware SPI

The unlock was a private Apple framework called SkyLight. SLEventPostToPid is a sibling of the public per-pid call, but it travels through a WindowServer channel Chromium accepts as trusted. Pair it with yabai's focus-without-raise pattern (two SLPSPostEventRecordTo calls, deliberately skipping SLPSSetFrontProcessWithOptions) plus an off-screen primer click at (-1, -1) to tick Chromium's user-activation gate, and the click lands without the window ever raising.

The thing we learned while building it: the primary addressing mode should not be pixels. cua-driver exposes ax, vision, and som (set-of-marks) modes, but element-indexed AX actions are the happy path. Pixels are the fallback for canvas/WebGL/video surfaces. That makes agents much less brittle because they can click "the Send button" instead of guessing coordinates, while still having a screenshot when the AX tree is ambiguous.
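
The "element index first, pixels as fallback" rule can be sketched as a small argument builder for the `click` call. The `pid`, `window_id`, and `element_index` fields come from the CLI example; the `x` and `y` fields for a pixel click are an assumption made for illustration and are not confirmed by anything in this post. `click_args` is a hypothetical helper.

```python
def click_args(pid, window_id, element_index=None, point=None):
    """Prefer an AX element index; fall back to a pixel point if none is known."""
    base = {"pid": pid, "window_id": window_id}
    if element_index is not None:
        # Happy path: address the element by its index in the AX tree.
        return {**base, "element_index": element_index}
    if point is not None:
        # Fallback for canvas/WebGL/video surfaces.
        # "x" and "y" are hypothetical field names, not taken from the CLI.
        x, y = point
        return {**base, "x": x, "y": y}
    raise ValueError("need an element_index or a pixel point")
```

When both an index and a point are supplied, the index wins, which matches the idea that pixels are only used when the AX tree cannot identify the target.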

Other things we have used it for:

- A dev-loop QA agent that reproduces a visual bug, edits code, rebuilds, and verifies the UI while my editor stays frontmost

- A personal-assistant style flow that sends a Messages reply without switching Spaces

- Pulling visual context from Chrome/Figma/Preview/YouTube windows I am not looking at

Long technical writeup: https://github.com/trycua/cua/blob/main/blog/inside-macos-wi...

I would especially like feedback from people building Mac automation, agent harnesses, MCP clients, or accessibility tooling. If you try it and it breaks on an app you care about, that is useful data.
