Show HN: Moonshine Open-Weights STT models – higher accuracy than WhisperLargev3

https://github.com/moonshine-ai/moonshine

61•petewarden•2h ago

I wanted to share our new speech-to-text models, and the library to use them effectively. We're a small startup (six people, sub-$100k monthly GPU budget), so I'm proud of the work the team has done to create streaming STT models with lower word-error rates than OpenAI's largest Whisper model. Admittedly, Large v3 is a couple of years old, but we're near the top of the HF OpenASR leaderboard, even up against Nvidia's Parakeet family. Anyway, I'd love to get feedback on the models and software, and hear about what people might build with them.
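For anyone unfamiliar with the metric: word-error rate (WER) is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. Below is a minimal sketch of that calculation. It is not the leaderboard's evaluation code; the function name `word_error_rate` is made up for illustration, and the only normalization assumed here is lowercasing and whitespace splitting, whereas real evaluations usually apply heavier text normalization first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumption: lowercase + whitespace split is the only normalization.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One dropped word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Note that WER can exceed 1.0 when the hypothesis contains many inserted words, so it is an error rate rather than a percentage of words missed.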

Comments

cyanydeez•1h ago

No LICENSE no go

bangaladore•1h ago

There is a license blurb in the readme.

> This code, apart from the source in core/third-party, is licensed under the MIT License, see LICENSE in this repository.

> The English-language models are also released under the MIT License. Models for other languages are released under the Moonshine Community License, which is a non-commercial license.

> The code in core/third-party is licensed according to the terms of the open source projects it originates from, with details in a LICENSE file in each subfolder.

altruios•1h ago

Reading through readme.md: "License: This code, apart from the source in core/third-party, is licensed under the MIT License, see LICENSE in this repository.

The English-language models are also released under the MIT License. Models for other languages are released under the Moonshine Community License, which is a non-commercial license.

The code in core/third-party is licensed according to the terms of the open source projects it originates from, with details in a LICENSE file in each subfolder."

lostmsu•1h ago

How does it compare to Microsoft VibeVoice ASR https://news.ycombinator.com/item?id=46732776 ?

armcat•47m ago

This is awesome, well done guys, I’m gonna try it as my ASR component on the local voice assistant I’ve been building https://github.com/acatovic/ova. The tiny streaming latencies you show look insane

ac29•45m ago

No idea why 'sudo pip install --break-system-packages moonshine-voice' is the recommended way to install on raspi?

The authors do acknowledge this, though, and give a slightly overcomplicated way to do it with uv in an example project (FYI, you don't need to source anything if you use uv run).

g-mork•41m ago

How does this compare to Parakeet, which runs wonderfully on CPU?

pzo•30m ago

Haven't tested it yet, but I'm wondering how it will behave with a lot of IT jargon and tech acronyms. For that reason I mostly had to run an LLM after STT, but that was slowing down Parakeet inference. Otherwise it sometimes had trouble correctly detecting terms such as CoreML, int8, fp16, half float, ARKit, AVFoundation, ONNX, etc.
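As a cheaper alternative to an LLM pass, a fixed lookup table of known mis-transcriptions can be applied to the transcript. This is a rough sketch of that different, simpler technique, not anything Moonshine or Parakeet provides. The mapping entries and the names `JARGON` and `fix_jargon` are hypothetical, and the left-hand phrases are guesses at what an STT model might emit, so a real table would need to be built from observed errors.

```python
import re

# Hypothetical table: lowercase mis-transcription -> canonical spelling.
JARGON = {
    "core ml": "CoreML",
    "int eight": "int8",
    "fp sixteen": "fp16",
    "a r kit": "ARKit",
    "av foundation": "AVFoundation",
    "onyx": "ONNX",
}

# Longest phrases first, so a longer entry wins over a shorter one
# that happens to be its prefix.
_ALTERNATIVES = sorted(JARGON, key=len, reverse=True)
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in _ALTERNATIVES) + r")\b",
    re.IGNORECASE,
)


def fix_jargon(transcript: str) -> str:
    """Replace known mis-transcriptions with their canonical spelling."""
    return _PATTERN.sub(lambda m: JARGON[m.group(1).lower()], transcript)


if __name__ == "__main__":
    print(fix_jargon("export the model to onyx with int eight weights"))
```

A table like this only catches errors it already knows about, so it complements rather than replaces an LLM pass, but it adds almost no latency.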

sroussey•24m ago

Are ONNX models for the browser possible?

asqueella•20m ago

For those wondering about the language support, currently English, Arabic, Japanese, Korean, Mandarin, Spanish, Ukrainian, Vietnamese are available (most in Base size = 58M params)

Karrot_Kream•15m ago

According to the OpenASR Leaderboard [1], looks like Parakeet V2/V3 and Canary-Qwen (a Qwen finetune) handily beat Moonshine. All 3 models are open, but Parakeet is the smallest of the 3. I use Parakeet V3 with Handy and it works great locally for me.

[1]: https://huggingface.co/spaces/hf-audio/open_asr_leaderboard
