So I built OmniGlass.
The UX is simple: You draw a box on your screen, local OCR extracts the text, and an LLM classifies what you're looking at. But instead of generating a chat response, it gives you an action menu.
The core difference from Claude Desktop isn't the AI; it's what happens after the AI thinks. Claude reads your screen and writes you a paragraph. OmniGlass reads your screen and, once you confirm, runs the command.
What it does today:
Snip a traceback → Generates the fix command, you confirm, it runs.
Snip a data table → Opens a native save dialog and spits out a clean CSV.
Snip a Slack bug report → Drafts a GitHub issue with all the context filled in.
Menu bar input → Type plain English, and it triggers the appropriate command.
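To make the snip-to-menu flow concrete, here is a minimal TypeScript sketch of how a classification could map to an action menu. The type names, content kinds, and labels are hypothetical and not taken from the OmniGlass codebase. The only things borrowed from the post are that the result is a menu rather than a chat reply, and that anything running a shell command needs confirmation.

```typescript
// Hypothetical content kinds an LLM classifier might return for a snip.
type SnipKind = "traceback" | "table" | "bug_report" | "unknown";

interface Action {
  label: string;
  // True when the action would execute a shell command and must be confirmed first.
  requiresConfirmation: boolean;
}

// Map a classification to the menu shown to the user (illustrative labels only).
function actionsFor(kind: SnipKind): Action[] {
  switch (kind) {
    case "traceback":
      return [{ label: "Run suggested fix command", requiresConfirmation: true }];
    case "table":
      return [{ label: "Export as CSV", requiresConfirmation: false }];
    case "bug_report":
      return [{ label: "Draft GitHub issue", requiresConfirmation: false }];
    default:
      // Fallback when the classifier is unsure: offer something harmless.
      return [{ label: "Copy extracted text", requiresConfirmation: false }];
  }
}
```

Keeping the confirmation flag on the action itself, rather than on the plugin, means the UI can gate each menu entry individually.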
The security elephant in the room (and why I built this): Almost nobody is talking about the security risks of MCP plugins yet. Claude Desktop runs them with your full user permissions. A rogue plugin, or a clever prompt injection, can read your SSH keys, scrape your .env files, and ship them off.
To fix this, OmniGlass sandboxes every plugin at the macOS kernel level using sandbox-exec. Your /Users/ directory is completely walled off. Environment variables are aggressively filtered. Shell commands strictly require your manual confirmation before executing. I wanted to be able to run community plugins without sweating about what they can access.
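For anyone who wants to picture the moving parts, here is a rough TypeScript sketch of the two ideas above: an allowlist filter for environment variables and a deny-by-default profile passed to sandbox-exec with its -p flag. This is an assumption about how such a launcher could look, not the actual OmniGlass implementation. The allowlist contents, the function names, and the specific profile rules are all hypothetical, and a real profile would need more allow rules before a plugin runtime could even start.

```typescript
// Assumption: an allowlist approach. The post only says variables are "aggressively filtered".
const ALLOWED_ENV: readonly string[] = ["PATH", "LANG", "TMPDIR"];

// Keep only allowlisted variables, so secrets such as API tokens never reach the plugin.
function filterEnv(env: Record<string, string | undefined>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const key of Object.keys(env)) {
    const value = env[key];
    if (value !== undefined && ALLOWED_ENV.indexOf(key) !== -1) {
      out[key] = value;
    }
  }
  return out;
}

// Build a sandbox profile string. Everything is denied by default, and /Users
// is never allowed, so it stays unreadable. The rule set is illustrative only.
function buildProfile(pluginDir: string): string {
  return [
    "(version 1)",
    "(deny default)",
    '(allow file-read* (subpath "/usr/lib"))',
    `(allow file-read* (subpath ${JSON.stringify(pluginDir)}))`,
  ].join("\n");
}

// Assemble the argv for launching a plugin under sandbox-exec with an inline profile.
function sandboxArgv(profile: string, command: string, args: string[]): string[] {
  return ["sandbox-exec", "-p", profile, command, ...args];
}
```

A launcher would then spawn sandboxArgv(buildProfile(dir), cmd, args) with filterEnv(process.env) as the child's environment, where dir is a plugin install directory outside /Users.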
The Stack:
Frontend/Backend: Tauri (Rust + TypeScript)
Vision: Apple Vision OCR (local)
Plugin System: MCP over stdio
Models: Works with Claude Haiku, Gemini Flash, or fully local via llama.cpp using Qwen-2.5 (takes ~6s end-to-end, nothing leaves your machine).
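Since "MCP over stdio" may be unfamiliar: MCP messages are JSON-RPC 2.0 objects, and the stdio transport sends each one as a single newline-delimited line. Below is a small TypeScript sketch of encoding a tools/call request and splitting a plugin's stdout back into messages. The helper names and the create_issue tool name in the usage note are hypothetical, and this is a simplified illustration rather than the code OmniGlass uses.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

// Encode a tools/call request as one newline-terminated line for the plugin's stdin.
function encodeToolCall(id: number, tool: string, args: Record<string, unknown>): string {
  const req: JsonRpcRequest = {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: tool, arguments: args },
  };
  // JSON.stringify never emits raw newlines, so the message stays on a single line.
  return JSON.stringify(req) + "\n";
}

// Split buffered stdout into complete JSON messages, returning any trailing
// partial line so the caller can prepend it to the next chunk.
function decodeLines(buffer: string): { messages: unknown[]; rest: string } {
  const parts = buffer.split("\n");
  const rest = parts.pop() ?? "";
  const messages = parts
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as unknown);
  return { messages, rest };
}
```

For example, encodeToolCall(1, "create_issue", { title: "Crash on save" }) produces one line that can be written straight to the plugin process's stdin.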
Current Status: I just shipped the second working plugin (a Slack Webhook) to run alongside the GitHub Issues plugin. That's two real-world plugins proving the architecture actually works, not just a boilerplate template and a promise. Both are under 250 lines of code.
Where I'd love your help:
Break the sandbox. Seriously. If you can figure out a way to read ~/.ssh/id_rsa from a plugin, that is a critical bug and I want to know about it.
Build a plugin. There are 8 open issues in the repo right now with full MCP schemas, manifests, and implementation hints. Most take less than 100 lines.
Port to Windows/Linux. The Windows build compiles in CI but hasn't been tested on real metal. Linux needs Tesseract + Bubblewrap to replace the Apple OCR and sandbox.
Requires macOS 12+ right now. Fully open source (MIT).
Would love to hear your thoughts or answer any questions about the sandboxing setup!