Show HN: Sparrow-1 – Audio-native model for human-level turn-taking without ASR

https://www.tavus.io/post/sparrow-1-human-level-conversational-timing-in-real-time-voice
38•code_brian•12h ago
For the past year I've been working to rethink how AI manages timing in conversation at Tavus. I've spent a lot of time listening to conversations. Today we're announcing the release of Sparrow-1, the most advanced conversational flow model in the world.

Some technical details:

- Predicts conversational floor ownership, not speech endpoints

- Audio-native streaming model, no ASR dependency

- Human-timed responses without silence-based delays

- Zero interruptions at sub-100ms median latency

- In benchmarks, Sparrow-1 beats all existing models on real-world turn-taking baselines

I wrote more about the work here: https://www.tavus.io/post/sparrow-1-human-level-conversation...
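
For contrast with the "without silence-based delays" point in the list above, here is a minimal sketch of a conventional silence-threshold endpointer. It is a generic illustration, not Tavus's code and not how Sparrow-1 works. The function name, the 20 ms frame hop, and the 600 ms threshold are all assumptions chosen for the example.

```python
from typing import Optional, Sequence

FRAME_MS = 20  # assumed frame hop in milliseconds; not taken from the post


def silence_endpoint(voiced: Sequence[bool], silence_ms: int = 600) -> Optional[int]:
    """Return the time (ms) at which a silence-threshold endpointer would
    declare the turn finished, or None if it never does.

    `voiced` holds one voice-activity flag per frame (hypothetical input).
    """
    needed = silence_ms // FRAME_MS  # consecutive silent frames required
    heard_speech = False
    silent_run = 0
    for i, is_voiced in enumerate(voiced):
        if is_voiced:
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            # Only count silence that follows speech; leading silence is ignored.
            silent_run += 1
            if silent_run >= needed:
                return (i + 1) * FRAME_MS
    return None


# Speaker talks for 1000 ms and then stops: the endpointer cannot fire
# until a full 600 ms of silence has elapsed.
finished_turn = [True] * 50 + [False] * 40
print(silence_endpoint(finished_turn))  # 1600

# Speaker pauses for 700 ms mid-thought and resumes at 1200 ms:
# the endpointer fires at 1100 ms, before the speaker continues.
mid_turn_pause = [True] * 25 + [False] * 35 + [True] * 25
print(silence_endpoint(mid_turn_pause))  # 1100
```

The two cases show the trade-off a fixed threshold forces: a long threshold adds that much delay to every response, while a short one cuts in on ordinary pauses.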

Comments

orliesaurus•1h ago
Literally no way to sign up to try it. I put in my email and password and it put me on a waitlist, despite the video saying I could try the model today. What makes me mad about these kinds of releases is that the marketing and the product don't talk to each other.
nubg•1h ago
Any examples available? Sounds amazing.
nextaccountic•26m ago
> Non-verbal cues are invisible to text: Transcription-based models discard sighs, throat-clearing, hesitation sounds, and other non-verbal vocalizations that carry critical conversational-flow information. Sparrow-1 hears what ASR ignores.

Could Sparrow instead be used to produce high-quality transcription that incorporates non-verbal cues?

Or even use Sparrow AND another existing transcription/ASR system to augment the transcription with non-verbal cues?

randyburden•22m ago
Awesome. We've been using Sparrow-0 in our platform since launch, and I'm excited to move to Sparrow-1 over the next few days. Our training and interview pre-screening products rely heavily on Tavus's AI avatars, and this upgrade (based on the video in your blog post) looks like it addresses some real pain points we've run into. Really nice work.
dfajgljsldkjag•18m ago
I am always skeptical of benchmarks that show perfect scores, especially when they come from the company selling the product. It feels like everyone claims to have solved conversational timing these days. I guess we will see if it is actually any good.
fudged71•9m ago
Different industry, but our marketing guy once said "You know what this [perfect] metric means? We can never use it in marketing because it's not believable"
cuuupid•13m ago
The first time I met Tavus, their engineers (incl Brian!) were perfectly willing to sit down and build their own better Infiniband to get more juice out of H100s. There is pretty much nobody working on latency and realtime at the level they are. Sparrow-1 would be a defining achievement for most startups but will just be one of dozens for Tavus :)
ttul•2m ago
I tried talking to Claude today. What a nightmare. It constantly interrupts you. I don’t mind if Claude wants to spend ten seconds thinking about its reply, but at least let ME finish my thought. Without decent turn-taking, the AI seems impolite and it’s just an icky experience. I hope tech like this gets widely distributed soon because there are so many situations in which I would love to talk with a model. If only it worked.

The URL shortener that makes your links look as suspicious as possible

https://creepylink.com/
179•dreadsword•3h ago•33 comments

Claude Cowork exfiltrates files

https://www.promptarmor.com/resources/claude-cowork-exfiltrates-files
581•takira•10h ago•246 comments

Furiosa: 3.5x efficiency over H100s

https://furiosa.ai/blog/introducing-rngd-server-efficient-ai-inference-at-data-center-scale
130•written-beyond•5h ago•67 comments

Ask HN: What did you find out or explore today?

51•blahaj•12h ago•36 comments

Scaling long-running autonomous coding

https://cursor.com/blog/scaling-agents
169•samwillis•8h ago•82 comments

Ask HN: Share your personal website

521•susam•13h ago•1511 comments

Project SkyWatch (a.k.a. Wescam at Home)

https://ianservin.com/2026/01/13/project-skywatch-aka-wescam-at-home/
13•jjwiseman•13h ago•3 comments

New Safari developer tools provide insight into CSS Grid Lanes

https://webkit.org/blog/17746/new-safari-developer-tools-provide-insight-into-css-grid-lanes/
20•feross•5h ago•4 comments

Ask HN: How are you doing RAG locally?

69•tmaly•15h ago•21 comments

The State of OpenSSL for pyca/cryptography

https://cryptography.io/en/latest/statements/state-of-openssl/
117•SGran•8h ago•19 comments

Bubblewrap: A nimble way to prevent agents from accessing your .env files

https://patrickmccanna.net/a-better-way-to-limit-claude-code-and-other-coding-agents-access-to-se...
61•0o_MrPatrick_o0•4h ago•49 comments

Ask HN: Weird archive.today behavior?

71•rabinovich•8h ago•18 comments

Ask HN: What is the best way to provide continuous context to models?

33•nemath•5h ago•15 comments

Show HN: Ever wanted to look at yourself in Braille?

https://github.com/NishantJoshi00/dith
20•cat-whisperer•5d ago•10 comments

Show HN: WebTiles – create a tiny 250x250 website with neighbors around you

https://webtiles.kicya.net/
152•dimden•5d ago•23 comments

Show HN: Webctl – Browser automation for agents based on CLI instead of MCP

https://github.com/cosinusalpha/webctl
79•cosinusalpha•15h ago•26 comments

SparkFun Officially Dropping AdaFruit due to CoC Violation

https://www.sparkfun.com/official-response
430•yaleman•15h ago•431 comments

Handy – free open source speech-to-text app

https://github.com/cjpais/Handy
3•tin7in•1h ago•0 comments

Sun Position Calculator

https://drajmarsh.bitbucket.io/earthsun.html
90•sanbor•9h ago•19 comments

Find a pub that needs you

https://www.ismypubfucked.com/
250•thinkingemote•14h ago•197 comments

ChromaDB Explorer

https://www.chroma-explorer.com/
48•arsentjev•8h ago•3 comments

Generate QR Codes with Pure SQL in PostgreSQL

https://tanelpoder.com/posts/generate-qr-code-with-pure-sql-in-postgres/
69•tanelpoder•4d ago•6 comments

Crafting Interpreters

https://craftinginterpreters.com/
60•tosh•8h ago•8 comments

How can I build a simple pulse generator to demonstrate transmission lines

https://electronics.stackexchange.com/questions/764155/how-can-i-build-a-simple-pulse-generator-t...
31•alphabetter•5d ago•7 comments

Roam 50GB is now Roam 100GB

https://starlink.com/support/article/58c9c8b7-474e-246f-7e3c-06db3221d34d
268•bahmboo•14h ago•315 comments

Is Rust faster than C?

https://steveklabnik.com/writing/is-rust-faster-than-c/
255•vincentchau•4d ago•283 comments

Ford F-150 Lightning outsold the Cybertruck and was then canceled for poor sales

https://electrek.co/2026/01/13/ford-f150-lightning-outsold-tesla-cybertruck-canceled-not-selling-...
552•MBCook•13h ago•720 comments

Rubik's Cube in Prolog – Order

https://medium.com/@kenichisasagawa/i-am-preparing-material-for-a-prolog-book-af7580acfee7
29•myth_drannon•4d ago•8 comments

Native ZFS VDEV for Object Storage (OpenZFS Summit)

https://www.zettalane.com/blog/openzfs-summit-2025-mayanas-objbacker.html
101•suprasam•11h ago•29 comments