Hi HN,
I’ve been experimenting with a different approach to computer-using AI agents.
Most current AI agents control computers using:
• cloud APIs with stored credentials
• browser automation
• screenshot + vision + mouse control
I tried something else.
Instead of embedding the AI inside the computer, I use the official mobile LLM apps (ChatGPT / Claude) as the intelligence source, and built an external execution gateway that translates model intent into deterministic OS actions.
The model never gets system privileges, and the computer never exposes credentials to the model.
Architecture:
phone LLM app → data link → action gateway → predefined action skills → desktop OS
The gateway only executes whitelisted primitives:
• keyboard sequences
• window operations
• command calls
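To make the whitelist idea concrete, here's a rough sketch of the check the gateway could perform before anything touches the OS. This is illustrative only, not the actual gateway code; the names are made up.

```python
# Illustrative only: not the real gateway. Primitive names are made up.
ALLOWED_PRIMITIVES = {
    "keyboard_sequence",  # typed key events
    "window_operation",   # focus / move / close a window
    "command_call",       # run a pre-approved command
}

def execute(primitive: str, args: dict) -> str:
    # Anything outside the whitelist is refused before it reaches the OS.
    if primitive not in ALLOWED_PRIMITIVES:
        raise PermissionError(f"primitive not whitelisted: {primitive}")
    # ...dispatch to the OS-specific implementation here...
    return f"executed {primitive}"
```

The point is that the refusal happens in one place, before dispatch, so nothing outside the fixed set can ever run.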
The key idea is separating cognition and execution.
The model outputs decisions, not motor control.
The gateway performs verified actions.
This turns computer control from a continuous UI manipulation problem into a discrete decision problem, which makes it more predictable and auditable.
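As an example of what "discrete decision problem" means in practice: each model output can be treated as a small structured message that the gateway validates and logs before anything runs. The JSON shape and skill names below are hypothetical, just to show the pattern.

```python
import json

# Hypothetical decision format; the real skill names may differ.
SKILL_NAMES = {"open_app", "switch_window", "run_command", "structured_input"}
audit_log = []  # every accepted decision is recorded, so runs are auditable

def handle_decision(raw: str) -> bool:
    """Accept a discrete, well-formed decision; reject everything else."""
    try:
        decision = json.loads(raw)
    except json.JSONDecodeError:
        return False  # free-form text is not a decision
    if not isinstance(decision, dict) or decision.get("skill") not in SKILL_NAMES:
        return False  # unknown skills are never executed
    audit_log.append(decision)
    return True
```

Because accepted decisions come from a finite set, the audit log is a complete, checkable record of everything the agent did.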
Early prototype — I’d really appreciate feedback, especially from people working on agent safety or permission models.
Comments
Ruikhu•2h ago
Hi — author here.
One clarification:
The goal is not to let an AI freely control a computer.
I built a fixed local action skill library.
Each skill is a deterministic OS operation (open app, switch window, run command, structured input).
The model does not generate UI steps or mouse actions.
It only selects a skill.
The gateway executes it.
So the LLM is making decisions, not performing motor control.
The computer isn’t remotely driven by the model; the model chooses from a constrained set of allowed actions.
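Roughly, each skill is a small fixed function and the model can only name one of them. A sketch (illustrative only; the commands assume a Linux desktop with wmctrl installed, and the app allow-list is made up):

```python
import subprocess

# Illustrative skill library: each entry is a deterministic OS operation.

def open_app(name: str) -> None:
    # Only launch apps from a fixed allow-list, never arbitrary paths.
    allowed = {"editor": ["gedit"], "browser": ["firefox"]}
    subprocess.Popen(allowed[name])  # unknown names fail; nothing else launches

def switch_window(title: str) -> None:
    subprocess.run(["wmctrl", "-a", title], check=True)

SKILLS = {"open_app": open_app, "switch_window": switch_window}

def run_skill(skill: str, **kwargs) -> None:
    # The model only names a skill; the gateway performs the verified action.
    SKILLS[skill](**kwargs)
```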
This is mainly an experiment in making computer-using agents more predictable and auditable.
I’d especially value thoughts from people working on agent safety.
Ruikhu•1h ago
Another clarification since a few people messaged me privately:
This is not just a conceptual architecture — we actually tested it using the official Claude mobile app controlling a real desktop computer.
The phone runs the model inside the official app.
The app produces instructions in natural language.
Our gateway parses intent and maps it to a verified local action skill (keyboard/window/command primitives).
So the model is not embedded in the OS and not calling an API.
It is literally the mobile LLM app interacting with a real operating system through a constrained execution layer.
We were interested in whether an official consumer LLM app (without system privileges) could still reliably operate a computer when paired with a deterministic action layer.
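The intent-parsing step can be pictured as a small mapper from the app's natural-language text to a (skill, argument) pair. This toy version uses regexes and made-up patterns, not the real gateway logic; anything unrecognized maps to nothing, so the default is inaction.

```python
import re

# Hypothetical patterns; the real parser is presumably stricter.
PATTERNS = [
    (re.compile(r"open (?:the )?(\w+)", re.I), "open_app"),
    (re.compile(r"switch to (?:the )?(.+?) window", re.I), "switch_window"),
]

def parse_intent(text: str):
    for pattern, skill in PATTERNS:
        m = pattern.search(text)
        if m:
            return skill, m.group(1)
    return None  # unrecognized intent: the gateway does nothing
```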