Show HN: Localvoxtral – Local real-time dictation on macOS with streaming STT

https://github.com/T0mSIlver/localvoxtral
1•T0mSIlver•2h ago
I built a native macOS menu bar app for real-time dictation that can run fully on-device.

Most dictation tools, even local ones, use Whisper or similar offline models: you record, then wait for the transcript. Localvoxtral uses Mistral's Voxtral Realtime, one of the first open-source speech models with a natively streaming architecture. Words appear as you speak, not after you stop. It feels closer to someone typing along as you talk.

Press a shortcut, speak, and text gets typed directly into whatever app you're in. No cloud, no subscription, no data leaving your machine.

Two backend options:

voxmlx on Apple Silicon: I forked voxmlx to add a WebSocket server and memory optimizations. Runs a 4-bit quantized model on an M1 Pro. Audio and inference stay fully on-device.

vLLM on NVIDIA GPU: tested on an RTX 3090, noticeably faster.

The app is native Swift (~97%), lives in the menu bar, and stays out of your way. Configurable shortcut, mic selection, auto-paste.

GitHub: https://github.com/T0mSIlver/localvoxtral

Pre-built DMG available in Releases.

Comments

T0mSIlver•2h ago
Some technical context and where this is headed.

Why streaming matters for dictation. Whisper and most open-source STT models use bidirectional attention, meaning they need the full audio clip before they can transcribe anything. You get your text after you stop talking, usually with a noticeable delay. Voxtral Realtime takes a different approach: it has a causal audio encoder that processes audio left-to-right as it arrives. At 480ms delay it matches offline models on accuracy (FLEURS benchmark), but you see text appearing while you're still mid-sentence. For dictation this changes a lot. You can catch mistakes in real time, and the feedback loop feels natural instead of disconnected.
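To make the bidirectional vs. causal distinction concrete, here is a toy sketch of the two attention masks. It is only an illustration of the general idea, not Voxtral's or Whisper's actual encoder code; the function names and the five-frame example are made up for this comment.

```python
def attention_mask(num_frames: int, causal: bool) -> list[list[bool]]:
    """mask[t][s] is True when output frame t may attend to input frame s."""
    return [[(s <= t) or not causal for s in range(num_frames)]
            for t in range(num_frames)]


def frames_needed(mask: list[list[bool]], t: int) -> int:
    """How many input frames must have arrived before frame t can be encoded."""
    return max(s for s, allowed in enumerate(mask[t]) if allowed) + 1


causal = attention_mask(5, causal=True)
bidirectional = attention_mask(5, causal=False)

# Causal: the first frame can be encoded after 1 frame of audio.
# Bidirectional: the first frame already depends on all 5 frames.
print(frames_needed(causal, 0), frames_needed(bidirectional, 0))  # prints: 1 5
```

With the causal mask, each frame can be encoded as soon as it arrives, which is what lets text show up mid-sentence. With the bidirectional mask, even the first frame has to wait for the end of the clip.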

The app connects to backends via the OpenAI Realtime API WebSocket protocol. It captures audio from your mic, streams it over the WebSocket, and receives partial transcripts that get inserted into your active text field live. Any OpenAI Realtime-compatible server works.
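A rough sketch of what goes over the wire, in Python rather than the app's Swift and without opening a real socket. Assumptions, not taken from the app's code: 16 kHz mono 16-bit PCM, 100 ms chunks, and the event names from OpenAI's published Realtime API (`input_audio_buffer.append` for audio, `conversation.item.input_audio_transcription.delta` for partial text). A given backend may use different event names or audio settings, so check its docs.

```python
import base64
import json

SAMPLE_RATE_HZ = 16_000  # assumption: 16 kHz mono input
BYTES_PER_SAMPLE = 2     # assumption: 16-bit PCM
# Assumption: delta event name as in OpenAI's Realtime API; servers may differ.
DELTA_EVENT = "conversation.item.input_audio_transcription.delta"


def chunk_pcm(pcm: bytes, chunk_ms: int) -> list[bytes]:
    """Split raw PCM into fixed-duration chunks (the last one may be shorter)."""
    step = SAMPLE_RATE_HZ * chunk_ms // 1000 * BYTES_PER_SAMPLE
    return [pcm[i:i + step] for i in range(0, len(pcm), step)]


def append_event(chunk: bytes) -> str:
    """Client -> server: one audio chunk as a base64 JSON event."""
    return json.dumps({
        "type": "input_audio_buffer.append",
        "audio": base64.b64encode(chunk).decode("ascii"),
    })


def apply_server_event(text_so_far: str, raw_event: str) -> str:
    """Server -> client: append partial transcript text, ignore other events."""
    event = json.loads(raw_event)
    if event.get("type") == DELTA_EVENT:
        return text_so_far + event.get("delta", "")
    return text_so_far


# One second of silence, split into 100 ms chunks ready to send.
one_second = bytes(SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)
outgoing = [append_event(c) for c in chunk_pcm(one_second, chunk_ms=100)]

# Simulated server replies; only the delta events change the transcript.
incoming = [
    json.dumps({"type": DELTA_EVENT, "delta": "Hello"}),
    json.dumps({"type": "session.updated"}),
    json.dumps({"type": DELTA_EVENT, "delta": " world"}),
]
transcript = ""
for raw in incoming:
    transcript = apply_server_event(transcript, raw)
print(len(outgoing), repr(transcript))
```

In a real client each string in `outgoing` would be sent as a WebSocket text frame while the mic is still recording, and each delta would be typed into the focused text field as it arrives.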

The voxmlx fork. The original voxmlx by Awni Hannun does local Voxtral inference on Apple Silicon via MLX, but it was CLI-only. I added a WebSocket server that speaks the OpenAI Realtime protocol so localvoxtral (or any compatible client) can connect to it. I also added memory management to avoid OOM on longer sessions. Fork is here: https://github.com/T0mSIlver/voxmlx. I'd like to get the server piece upstreamed eventually.

Latency. On M1 Pro with a 4-bit quantized model, first words appear within roughly 200 to 400ms. On RTX 3090 via vLLM it's faster. Both feel responsive enough for natural dictation.

What's next. Right now you have to start the server yourself before using the app. I want to add app-managed local serving (start/stop/model download) so it's truly one-click. If anyone has experience bundling Python/MLX processes into macOS apps cleanly, I'd love to hear your approach.

Happy to answer questions.
