They claim to undercut competitors of similar quality by half for both models, yet they released both as Apache 2.0 instead of following the "smaller open, larger closed" strategy used for their last releases.
What's different here?
Havoc•10h ago
Probably not looking to compete directly in the transcription space.
wmf•10h ago
They're working on a bunch of features so maybe those will be closed. I guess they're feeling generous on the base model.
halJordan•10h ago
They didn't release Voxtral Large, so your question doesn't really make sense.
danelski•2h ago
It's about what their top offering is at the moment, not having Large in name. Mistral Medium 3 is notably not Mistral Large 3, but it was released as API-only.
No. I found models doing that unreliable when there are many speakers.
4b11b4•5h ago
This is your service?
lostmsu•13h ago
Does it support real-time transcription? What is the approximate latency?
ipsum2•10h ago
24B is crazy expensive for speech transcription. Conspicuously, there's no comparison with Parakeet, a 600M-param model that's currently dominating leaderboards (but only for English).
azinman2•3h ago
But it also includes world knowledge, can do tool calls, etc. It’s an omnimodel
sheerun•6h ago
In the demo they mention that the Polish pronunciation is pretty bad, spoken as if it were the second language of a native English speaker. I wonder if it's the same for other languages. On the other hand, the whispering English is hilariously good, especially the different emotions.
Raed667•3h ago
It is insane how good the "French man speaking English" demo is. It captures a lot of subtleties
kamranjon•5h ago
I’m pretty excited to play around with this. I’ve worked with Whisper quite a bit, and it’s awesome to have another model in the same class, and from Mistral, who tend to be very open. I’m sure Unsloth is already working on some GGUF quants. I will probably spin it up tomorrow and try it on some audio.