Are there any open-weight models that do? Not talking about speech-to-text -> LLM -> text-to-speech, btw; I mean a real voice <-> language model.
edit:
It does support real-time conversation! Has anybody here gotten that to work on local hardware? I'm particularly curious whether anybody has run it with a non-NVIDIA setup.
Especially in the fruit-pricing portion of the video for this model. It sounds completely normal, but I can immediately tell it is AI. Maybe it's the intonation, or the overly stable rate of speech?
Maybe that's a good thing?
I think ChatGPT has the most lifelike speech with its voice models. They seem to have invested heavily in that area while other labs focused elsewhere.
On the video itself: interesting, but "ideal" was pronounced wrong in German. For a promotional video, they should have checked that with native speakers. On the other hand, it's at least honest.
Not their fault that frontier labs are letting their speech-to-speech offerings languish.
You can expect this model to have similar performance to the non-omni version. [2]
There aren't many open-weight omni models, so I consider this a big deal. I would use this model to replace the keyboard and monitor in an application while doing the heavy lifting with other tech behind the scenes. There is also a reasoning version, which might be a bit amusing in an interactive voice chat if it pronounces its thinking tokens aloud while working through to a final answer.
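The thinking-token concern is easy to guard against in the application layer: strip the reasoning span before handing text to TTS. A minimal sketch, assuming the model wraps its reasoning in `<think>…</think>` delimiters (common for Qwen reasoning models, but verify against the actual tokenizer output):

```python
import re

# Matches a complete <think>...</think> span, including newlines inside it.
THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)

def strip_thinking(text: str) -> str:
    """Remove reasoning spans so the TTS stage only speaks the final answer."""
    return THINK_RE.sub("", text).strip()
```

For true streaming you would need a small stateful filter that buffers tokens until the closing delimiter arrives, but the same idea applies.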
1. https://huggingface.co/Qwen/Qwen2.5-Omni-7B
2. https://artificialanalysis.ai/models/qwen3-30b-a3b-instruct
I'm curious how anyone has solved this.
Qwen usually provides example code in Python that requires CUDA and a non-quantized model. I wonder whether there is, by now, a good open-source project supporting this use case.
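Absent an official non-CUDA path, the usual workaround is a device fallback when loading the model. A hedged sketch of just the selection logic; the availability flags stand in for PyTorch's `torch.cuda.is_available()` and `torch.backends.mps.is_available()`, abstracted here so the function stays self-contained:

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Pick the best available backend: CUDA, then Apple MPS, else CPU.
    In a real script the flags come from torch.cuda.is_available()
    and torch.backends.mps.is_available()."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"
```

Whether the omni model's audio pipeline actually runs on MPS or CPU (and at what quantization) is a separate question; this only avoids the hard CUDA dependency in the example scripts.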
iFire•1h ago
Weird; as someone who doesn't have a database of the web, I wouldn't be able to calculate either result.
kaoD•42m ago
And that's how I know you're not an LLM!
parineum•1h ago
OP provided a web link with the answer; aren't these models supposed to be trained on all of that data?
esafak•1h ago
The model has a certain capacity (quite limited in this case), so there is an opportunity cost in learning one thing over another. That's why it is important to train on quality data: things you can build on top of.