Hi HN, I built *native-devtools-mcp*, a Model Context Protocol (MCP) server for interacting with the UIs of native desktop applications. Right now it supports macOS and Windows, but I intend to add more platforms in the future.
Motivation: Most MCP servers today target specific environments (the Chrome DevTools MCP server for browser automation is a good example), but there’s no general MCP bridge for native desktop GUIs. native-devtools-mcp gives AI agents the ability to:
- capture screenshots and extract text (OCR) from the screen (a sample MCP request is sketched right after this list)
- simulate user input (mouse clicks, typing, scrolling) with high precision by using the OS's local OCR
- manage windows and focus
- optionally connect to deeper UI trees for instrumented apps
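To make that concrete, here is roughly what an agent-issued MCP tools/call request looks like on the wire. The tool name and arguments below are illustrative placeholders, not the server's actual schema:

    {
      "jsonrpc": "2.0",
      "id": 1,
      "method": "tools/call",
      "params": {
        "name": "take_screenshot",
        "arguments": { "target": "frontmost_window" }
      }
    }

The response comes back as a standard MCP tool result (e.g. image and text content blocks), which the agent then uses to decide where to click or type next.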
It runs locally, does not upload any data externally (except for the LLM integration), and supports both macOS and Windows for now. The goal is to enable AI-driven workflows for GUI testing, automation, and desktop tool interaction.
Tech stack/highlights:
- MCP JSON-RPC interface for tool clients (a minimal client sketch follows this list)
- Visual feedback (images + OCR) plus input simulation
- Dual interaction modes: a universal visual mode relying on OCR/screenshots, plus a structural debug-kit mode where available (macOS)
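If you want to drive it outside of Claude, any MCP client library works. Here is a minimal sketch using the official TypeScript SDK over stdio; the launch command, package name, and tool name are assumptions for illustration, so check the README for the real ones:

    import { Client } from "@modelcontextprotocol/sdk/client/index.js";
    import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

    // Spawn the server as a child process and talk to it over stdio.
    // "native-devtools-mcp" is a placeholder command, not necessarily the real one.
    const transport = new StdioClientTransport({
      command: "npx",
      args: ["native-devtools-mcp"],
    });

    const client = new Client({ name: "example-client", version: "0.1.0" });
    await client.connect(transport);

    // Discover what the server actually exposes...
    const { tools } = await client.listTools();
    console.log(tools.map((t) => t.name));

    // ...and call one tool (name and arguments are illustrative).
    const result = await client.callTool({
      name: "take_screenshot",
      arguments: {},
    });
    console.log(result.content);

This listTools/callTool round trip is the same thing Claude Code and Claude Desktop do under the hood once the server is registered.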
Limitations / roadmap:
- Early stage; improvements to accuracy and reliability planned
- Expanding deeper (structural) support to more app platforms (Android is next!)
- Integration with more AI tools: right now it's tested with Claude Code and Claude Desktop (and Cowork); it should work with other AI platforms too, but I haven't had time to test them yet (a sample Claude Desktop config is below)
- Better documentation and tooling around agent integration
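If you want to try it with Claude Desktop today, MCP servers are registered in claude_desktop_config.json under the mcpServers key. Here is a sketch of what the entry might look like; the command and args are assumptions, so substitute whatever launch command the README documents:

    {
      "mcpServers": {
        "native-devtools": {
          "command": "npx",
          "args": ["native-devtools-mcp"]
        }
      }
    }

Claude Code can register the same server from the terminal via its "claude mcp add" command.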
Feedback I’m looking for:
- Practical use cases where this changed your automation or testing workflow
- Ideas to make MCP server integration with existing AI agent stacks easier
Happy to answer questions.