Show HN: Make your own voice AI in two clicks

27•unmute-sh•9mo ago

Upload a voice and write a personality prompt, or try the pre-made characters.

Built by augmenting Gemma 3 12B with our new text-to-speech and speech-to-text models, both of which we will release as open-source soon. Stay tuned.

Comments

lightbulbish•8mo ago

I thought this was fantastic! I'm surprised more people aren't commenting on this. Is there a reason I am not aware of?

To the author: what happens to my voice after I upload it? What is your plan moving forward? I am too far out in left field to understand how to build a business and monetize an open source product like this, even though I found it fun to play around with.

unmute-sh•8mo ago

Thanks! There is a model that turns the voice into an embedding that is used to determine the voice. Unlike the STT and TTS, we won't be releasing the weights of this voice cloning model, but we will provide it over an API so that we can do verification and prevent abuse.

edit: Ah yes, and we do not store the voice sample on our server. The voice embedding is cached for 24 hours.
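To make the 24-hour caching concrete, here is a minimal sketch of an in-memory cache that keeps a voice embedding for a fixed time and then forgets it. This is an illustration only, not Unmute's actual code: the class name `EmbeddingCache`, the session key, and the in-memory dictionary are all assumptions. Only the 24-hour lifetime comes from the comment above.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple

TTL_SECONDS = 24 * 60 * 60  # 24 hours, the lifetime mentioned in the comment


class EmbeddingCache:
    """Hypothetical cache: stores embeddings only, never the raw voice sample."""

    def __init__(self, ttl: float = TTL_SECONDS,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._items: Dict[str, Tuple[float, List[float]]] = {}

    def put(self, key: str, embedding: List[float]) -> None:
        # Record the expiry time alongside the embedding.
        self._items[key] = (self._clock() + self._ttl, embedding)

    def get(self, key: str) -> Optional[List[float]]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, embedding = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the embedding is no longer retained.
            del self._items[key]
            return None
        return embedding
```

A caller would `put` the embedding once after the upload is processed and `get` it on each later request; after 24 hours the lookup returns `None` and the user would have to upload again.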

ton4eg•8mo ago

Way more entertaining than I would expect! What TTS and ASR models do you use? What sort of latency do you get?

unmute-sh•8mo ago

Thank you! The TTS and ASR are our own unreleased models, but we'll open-source them soon :)

The latency is about 500ms once we detect that it's the bot's turn to speak (roughly 200ms for the LLM's time-to-first-token and 300ms for the TTS audio to start), plus a variable time for the semantic pause detection (VAD).

If it's clear that you're done talking, like when you ask a question, the model will reply very fast. If you stop mid-sentence as if you have more to say, it will wait longer to avoid interrupting you.
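The latency budget described above can be written out as simple arithmetic. The sketch below is an illustration, not the real pipeline: the 200ms and 300ms figures come from the comment, while the function names, the end-of-turn probability input, the linear mapping, and the 2000ms maximum wait are all assumptions made up for the example.

```python
LLM_TTFT_MS = 200   # LLM time-to-first-token, from the comment
TTS_START_MS = 300  # time until TTS audio starts, from the comment


def wait_before_reply_ms(end_of_turn_prob: float,
                         min_wait_ms: int = 0,
                         max_wait_ms: int = 2000) -> int:
    """Hypothetical rule: the less sure we are the user is done, the longer we wait."""
    if not 0.0 <= end_of_turn_prob <= 1.0:
        raise ValueError("end_of_turn_prob must be between 0 and 1")
    # Linear interpolation: probability 1.0 gives min_wait_ms, 0.0 gives max_wait_ms.
    return round(min_wait_ms + (1.0 - end_of_turn_prob) * (max_wait_ms - min_wait_ms))


def response_latency_ms(pause_wait_ms: int) -> int:
    """Total delay from end of speech until the bot's audio starts."""
    if pause_wait_ms < 0:
        raise ValueError("pause_wait_ms must be non-negative")
    return pause_wait_ms + LLM_TTFT_MS + TTS_START_MS


# A clear question (high end-of-turn confidence) gets the fixed 500ms cost only.
print(response_latency_ms(wait_before_reply_ms(1.0)))
# A mid-sentence stop (low confidence) adds a longer wait on top.
print(response_latency_ms(wait_before_reply_ms(0.25)))
```

The point of the split is that the 500ms part is fixed by the models, while the wait before it depends on how finished the user's sentence sounds.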

karim79•8mo ago

Incredible work. Short, sweet and simple. I hadn't expected to enjoy this as much as I did. I can't wait to see where it goes.

android521•8mo ago

Can't wait for the open source release.

xingwu•8mo ago

Simple, functional, perfect.

marnesh•8mo ago

Wow, this is actually pretty amazing. It is so natural.

marnesh•8mo ago

This is really amazing. It is so natural.
