So I built a macOS menu bar app that translates right where you're working.
Two modes:
- Select text in any app → Cmd+Option+Z → instant translation
- Cmd+Shift+T → drag over any area → OCR + translate (images, PDFs, subtitles)
Everything runs on-device via Apple Vision + Apple Translation. No servers, no tracking. Free forever.
20 languages · offline capable · GPL-3.0
vunderba•3h ago
I created something like this over a decade ago for Windows that would let you hit a globally registered shortcut to hover a magnifying glass over text in a windowed/fullscreen game - I used to use it while I was studying Chinese with emulated SNES RPGs.
Back then the best we could do was Tesseract OCR feeding into the open CC-CEDICT dictionary. It was primitive but sufficed!
hcmhcs0•3h ago
I went with Apple's Vision and Translation frameworks since they were the easiest path for me, but the downside is it requires macOS 15+. I'm thinking about adding Tesseract as an alternative OCR engine to support older versions — sounds like it could work well enough!
vunderba•3h ago
What I ended up doing was generating around a dozen versions of a screenshot in realtime (all with different combinations of thresholding, segmentation parameters, resolution scaling, and denoising) behind the scenes. Then it would fire Tesseract off on all of them in parallel threads and let them “vote” on the result.
After I set that up, the accuracy improved significantly.
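A minimal sketch of that voting idea, not the original code: the parameter grid, the function names, and the `ocr` callable are all hypothetical, and the real preprocessing values aren't given in this thread. It assumes `ocr(image, params)` applies one preprocessing variant and returns the recognized text.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import product

# Hypothetical parameter grid; the values actually used are not stated in the thread.
THRESHOLDS = (96, 128, 160)
SCALES = (1.0, 2.0)
DENOISE = (False, True)


def variant_params():
    """Return every combination of preprocessing settings (3 * 2 * 2 = 12)."""
    return [
        {"threshold": t, "scale": s, "denoise": d}
        for t, s, d in product(THRESHOLDS, SCALES, DENOISE)
    ]


def normalize(text):
    """Collapse whitespace so trivially different readings count as one vote."""
    return " ".join(text.split())


def vote(readings):
    """Return the most common non-empty reading, or None if all are empty."""
    counts = Counter(r for r in map(normalize, readings) if r)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def ocr_with_voting(image, ocr, max_workers=4):
    """Run ocr(image, params) on each variant in parallel threads, then vote."""
    params = variant_params()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        readings = list(pool.map(lambda p: ocr(image, p), params))
    return vote(readings)
```

In practice `ocr` would wrap whatever engine is in use (for example, preprocessing the screenshot and then calling Tesseract through a binding such as pytesseract); that wiring is left out here because it depends on the setup.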
If you're looking for an alternative to Tesseract, I'd actually recommend Surya. I've had a lot of success with it out of the box doing OCR on comics.
https://github.com/datalab-to/surya
hcmhcs0•2h ago
Thanks for the Surya recommendation, I hadn't come across it before. Will definitely check it out!