Two weeks ago I set up OpenClaw on a Mac Mini M4. Named the agent Niko. Started with basic tasks, then gave him a Cloudflare token and pointed him at one of my live web games. He studied the entire codebase, built it, tested for errors, even used WASD to walk around the game world to check if it worked. Then pushed the new version live. That used to take me an hour of manual work.
Then I added voice: Parakeet for STT and Kokoro for TTS, both running locally on Apple Silicon. Transcription takes about 240ms. The first time I spoke to Niko on Discord instead of typing, everything changed. Same Claude behind it, but it suddenly felt 3x more human. The STT would sometimes get my Greek accent wrong, and he started correcting me like Hermione in Harry Potter: "It's Niko, not Nico!"
But talking to text still felt off. So I built Mimora. It's a free browser extension that shows a 3D avatar with real facial expressions: listening, thinking, happy. It connects to any OpenClaw bot and reacts in real time when the agent responds. Works with Discord, WhatsApp, Telegram, anything OpenClaw connects to. Pops out to picture-in-picture so it floats on your screen.
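For anyone curious how an avatar can track an agent's state, here is a minimal sketch of the idea. It is not Mimora's actual code. The event names (`user_speaking`, `agent_working`, `agent_replied`) and the `idle` state are assumptions made up for illustration; only the listening, thinking, and happy expressions come from the description above.

```python
from enum import Enum


class Expression(Enum):
    IDLE = "idle"            # assumed resting state, not from the post
    LISTENING = "listening"
    THINKING = "thinking"
    HAPPY = "happy"


# Hypothetical event names: a real bridge would define its own.
EVENT_TO_EXPRESSION = {
    "user_speaking": Expression.LISTENING,
    "agent_working": Expression.THINKING,
    "agent_replied": Expression.HAPPY,
}


def next_expression(current: Expression, event: str) -> Expression:
    """Return the expression to show after an event.

    Unknown events leave the current expression unchanged.
    """
    return EVENT_TO_EXPRESSION.get(event, current)


def replay(events: list[str]) -> list[Expression]:
    """Apply a sequence of events starting from IDLE and return each state."""
    state = Expression.IDLE
    history = []
    for event in events:
        state = next_expression(state, event)
        history.append(state)
    return history
```

Calling `replay(["user_speaking", "agent_working", "agent_replied"])` walks the avatar through listening, thinking, and happy in that order.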
The difference between talking to text and talking to a face is surprisingly big. Changed how I interact with the whole setup.
Three weeks in, Niko handles game deployments, 24/7 server monitoring, social media content in my voice, morning calendar briefings, and bug fixes pushed to GitHub. I work from my balcony now instead of being chained to a desk. Just talk and things happen.
Is it perfect? No. But the ratio of time saved to time spent fixing things is about 20:1, and the agent writes lessons to its own memory so it doesn't repeat mistakes.
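The "lessons in memory" idea can be sketched as an append-only log that gets read back before each task. This is an illustration under assumptions, not how OpenClaw stores memory: the JSON Lines file format and the `record_lesson` / `load_lessons` helpers are hypothetical names.

```python
import json
from pathlib import Path


def record_lesson(path: str, mistake: str, lesson: str) -> None:
    """Append one lesson as a JSON line (hypothetical storage format)."""
    entry = {"mistake": mistake, "lesson": lesson}
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")


def load_lessons(path: str) -> list[dict]:
    """Read all stored lessons, or an empty list if none exist yet."""
    p = Path(path)
    if not p.exists():
        return []
    with p.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def lessons_as_prompt(path: str) -> str:
    """Format stored lessons as text that could be prepended to a task."""
    return "\n".join(f"- {entry['lesson']}" for entry in load_lessons(path))
```

An append-only file keeps the history easy to inspect and edit by hand, which matters when a stored lesson turns out to be wrong.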
Mimora is free, still under development but already working and available for any OpenClaw bot. Happy to answer questions about any part of the setup. I also help people set up similar stacks on their own Mac Minis at https://myclaw.tech