We’ve been thinking about a core limitation in current mobile AI assistants:
Most systems (e.g., Apple Intelligence, Google Assistant–style integrations) rely on predefined schemas and coordinated APIs. Apps must explicitly implement the assistant’s specification. This limits extensibility and makes the ecosystem tightly controlled.
On the other hand, GUI-based agents (e.g., AppAgent, AutoDroid, droidrun) rely on screenshots and accessibility trees, which gives them broad power but weak capability boundaries.
So we built Mobile-MCP, an Android-native realization of the Model Context Protocol (MCP) using the Intent framework.
The key idea:
- Apps declare MCP-style capabilities (with natural-language descriptions) in their manifest.
- An LLM-based assistant can autonomously discover all exposed capabilities on-device via the PackageManager.
- The LLM selects which capability to call and generates parameters from its natural-language description.
- Invocation happens through standard Android service binding / Intents.
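To make the first step concrete, here's a rough sketch of what a manifest declaration could look like. The action name, meta-data keys, and service name below are illustrative placeholders, not the exact identifiers from our spec (see the spec link for the real format):

```xml
<!-- Hypothetical sketch: identifiers are illustrative, not the spec's exact names. -->
<service
    android:name=".NoteSearchToolService"
    android:exported="true">
    <intent-filter>
        <action android:name="org.mobilemcp.action.CAPABILITY" />
    </intent-filter>
    <!-- Natural-language description the LLM reasons over at runtime -->
    <meta-data
        android:name="org.mobilemcp.description"
        android:value="Searches the user's notes by keyword and returns matching titles." />
    <meta-data
        android:name="org.mobilemcp.params"
        android:value="query: string (keywords to search for)" />
</service>
```

Because the description is plain natural language rather than a fixed schema, the assistant doesn't need an action domain pre-registered for "note search" — it just reads the description and decides.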
Unlike Apple/Android-style coordinated integrations:
- No predefined action domains.
- No centralized schema per assistant.
- No per-assistant custom integration required.
- Tools can be dynamically added and evolve independently.
The assistant doesn’t need prior knowledge of specific apps — it discovers and reasons over capabilities at runtime.
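A minimal sketch of what runtime discovery could look like on the assistant side, assuming the hypothetical action string and meta-data key above (the real identifiers live in the spec):

```kotlin
import android.content.ComponentName
import android.content.Intent
import android.content.pm.PackageManager

// Hypothetical sketch: "org.mobilemcp.action.CAPABILITY" and
// "org.mobilemcp.description" are illustrative placeholder identifiers.
fun discoverCapabilities(pm: PackageManager): List<Pair<ComponentName, String>> {
    val probe = Intent("org.mobilemcp.action.CAPABILITY")
    return pm.queryIntentServices(probe, PackageManager.GET_META_DATA)
        .mapNotNull { resolved ->
            val si = resolved.serviceInfo ?: return@mapNotNull null
            val description = si.metaData
                ?.getString("org.mobilemcp.description") ?: return@mapNotNull null
            // Each (component, natural-language description) pair is handed to
            // the LLM, which selects a capability and generates parameters;
            // invocation then goes through standard service binding.
            ComponentName(si.packageName, si.name) to description
        }
}
```

Nothing here is assistant-specific: any LLM agent with the right permission can run the same query and see the same capability set.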
We’ve built a working prototype and released the spec and a demo:
GitHub: https://github.com/system-pclub/mobile-mcp
Spec: https://github.com/system-pclub/mobile-mcp/blob/main/spec/mobile-mcp_spec_v1.md
Demo: https://www.youtube.com/watch?v=Bc2LG3sR1NY&feature=youtu.be
Paper: https://github.com/system-pclub/mobile-mcp/blob/main/paper/mobile_mcp.pdf
Curious what people think:
Is OS-native capability broadcasting + LLM reasoning a more scalable path than fixed assistant schemas or GUI automation?
Would love feedback from folks working on mobile agents, security, MCP tooling, or Android system design.