frontpage.
Show HN: Multimodal perception system for real-time conversation

https://raven.tavuslabs.org
19•mert_gerdan•2h ago
I work on real-time voice/video AI at Tavus, and for the past few years I’ve mostly focused on how machines respond in a conversation.

One thing that’s always bothered me is that almost all conversational systems still reduce everything to transcripts and throw away a ton of signals that should be used downstream. Some existing emotion understanding models try to classify what they observe into a small set of arbitrary boxes, but they aren’t fast or rich enough to do this with conviction in real time.

So I built a multimodal perception system that encodes visual and audio conversational signals and translates them into natural language by aligning a small LLM on these signals. The agent can "see" and "hear" you, and you can interface with it via an OpenAI-compatible tool schema in a live conversation.

It outputs short natural language descriptions of what’s going on in the interaction - things like uncertainty building, sarcasm, disengagement, or even a shift in attention within a single turn of a conversation.
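To make the "OpenAI compatible tool schema" idea concrete, here is a minimal sketch of how a perception description could be exposed as a tool and handed back to a chat model. It only follows the general shape of OpenAI-style function tools and tool-result messages. The tool name `get_perception_summary`, its parameters, and the helper `make_tool_result` are hypothetical and invented for illustration. They are not the actual Raven interface.

```python
import json

# Hypothetical tool definition in the OpenAI function-calling format.
# The name, description, and parameters are assumptions for illustration,
# not the schema the post's system actually ships.
PERCEPTION_TOOL = {
    "type": "function",
    "function": {
        "name": "get_perception_summary",
        "description": "Return a short natural language description of "
                       "what is happening in the live conversation.",
        "parameters": {
            "type": "object",
            "properties": {
                "modality": {
                    "type": "string",
                    "enum": ["visual", "audio", "both"],
                },
                "window_seconds": {"type": "number", "minimum": 0},
            },
            "required": ["modality"],
        },
    },
}


def make_tool_result(tool_call_id: str, description: str) -> dict:
    """Wrap a perception description as a tool-result chat message."""
    return {
        "role": "tool",
        "tool_call_id": tool_call_id,
        "content": json.dumps({"description": description}),
    }


if __name__ == "__main__":
    # Example: the kind of short description the post mentions.
    message = make_tool_result("call_1", "uncertainty building")
    print(json.dumps(message, indent=2))
```

Keeping the result as a plain text description means any agent that already understands tool calls can consume it without knowing anything about the underlying audio or video encoders.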

Some quick specs:

- Runs in real-time per conversation

- Processing at ~15fps video + overlapping audio alongside the conversation

- Handles nuanced emotions, whispers vs shouts

- Trained on synthetic + internal convo data
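As a rough illustration of the "~15fps video + overlapping audio" spec, the sketch below computes video frame timestamps and overlapping audio analysis windows over the same timeline. It assumes that "overlapping audio" means sliding windows that overlap in time, which is only one possible reading of the post. The 1-second window and 0.5-second hop are assumed values, and the function names are hypothetical.

```python
def frame_times(duration_s: float, fps: float = 15.0) -> list[float]:
    """Timestamps (in seconds) of video frames sampled at a fixed rate."""
    n_frames = int(duration_s * fps)
    return [i / fps for i in range(n_frames)]


def audio_windows(duration_s: float,
                  window_s: float = 1.0,
                  hop_s: float = 0.5) -> list[tuple[float, float]]:
    """Sliding (start, end) windows; they overlap whenever hop_s < window_s.

    window_s and hop_s are assumed values, not figures from the post.
    """
    if duration_s < window_s:
        return []
    # Small epsilon guards against floating-point rounding in the division.
    n_windows = int((duration_s - window_s) / hop_s + 1e-9) + 1
    return [(i * hop_s, i * hop_s + window_s) for i in range(n_windows)]


def frames_in_window(times: list[float],
                     window: tuple[float, float]) -> list[float]:
    """Frames whose timestamp falls inside a half-open window [start, end)."""
    start, end = window
    return [t for t in times if start <= t < end]


if __name__ == "__main__":
    times = frame_times(2.0)
    for window in audio_windows(2.0):
        print(window, len(frames_in_window(times, window)), "frames")
```

Pairing each audio window with the frames that fall inside it is one simple way to hand time-aligned visual and audio context to a downstream model.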

Happy to answer questions or go deeper on architecture/tradeoffs.

More details here: https://www.tavus.io/post/raven-1-bringing-emotional-intelli...

Comments

jesserowe•2h ago
the demo is wild... kudos

The Singularity will occur on a Tuesday

https://campedersen.com/singularity
535•ecto•4h ago•303 comments

Ex-GitHub CEO launches a new developer platform for AI agents

https://entire.io/blog/hello-entire-world/
188•meetpateltech•5h ago•145 comments

Mathematicians disagree on the essential structure of the complex numbers

https://www.infinitelymore.xyz/p/complex-numbers-essential-structure
106•FillMaths•5h ago•122 comments

Simplifying Vulkan one subsystem at a time

https://www.khronos.org/blog/simplifying-vulkan-one-subsystem-at-a-time
178•amazari•8h ago•99 comments

My eighth year as a bootstrapped founder

https://mtlynch.io/bootstrapped-founder-year-8/
42•mtlynch•2d ago•14 comments

Clean-room implementation of Half-Life 2 on the Quake 1 engine

https://code.idtech.space/fn/hl2
282•klaussilveira•10h ago•55 comments

Show HN: Rowboat – AI coworker that turns your work into a knowledge graph (OSS)

https://github.com/rowboatlabs/rowboat
80•segmenta•4h ago•22 comments

Competition is not market validation

https://www.ablg.io/blog/competition-is-not-validation
29•tonioab•5h ago•5 comments

Show HN: Clawe – open-source Trello for agent teams

https://github.com/getclawe/clawe
40•Jonathanfishner•1h ago•28 comments

Show HN: Showboat and Rodney, so agents can demo what they've built

https://simonwillison.net/2026/Feb/10/showboat-and-rodney/
74•simonw•3h ago•39 comments

Qwen-Image-2.0: Professional infographics, exquisite photorealism

https://qwen.ai/blog?id=qwen-image-2.0
332•meetpateltech•12h ago•151 comments

The Evolution of Bengt Betjänt

https://andonlabs.com/blog/evolution-of-bengt
25•lukaspetersson•18h ago•2 comments

Launch HN: Livedocs (YC W22) – An AI-native notebook for data analysis

https://livedocs.com
36•arsalanb•3h ago•14 comments

The Little Learner: A Straight Line to Deep Learning

https://mitpress.mit.edu/9780262546379/the-little-learner/
8•AlexeyBrin•2d ago•0 comments

China's Data Center Boom: A View from Zhangjiakou (2025)

https://sinocities.substack.com/p/chinas-data-center-boom-a-view-from
18•fzliu•2h ago•7 comments

Markdown CLI viewer with VI keybindings

https://github.com/taf2/mdvi
35•taf2•3h ago•12 comments

Google handed ICE student journalist's bank and credit card numbers

https://theintercept.com/2026/02/10/google-ice-subpoena-student-journalist/
429•lehi•3h ago•155 comments

The switch to Linux and the beginning of my self-hosting journey

https://hazemkrimi.tech/blog/linux-self-hosting-journey/
87•kingcrimson1000•3h ago•61 comments

A brief history of oral peptides

https://seangeiger.substack.com/p/a-brief-history-of-oral-peptides
36•odedfalik•1d ago•10 comments

Show HN: Multimodal perception system for real-time conversation

https://raven.tavuslabs.org
19•mert_gerdan•2h ago•1 comment

Show HN: Stripe-no-webhooks – Sync your Stripe data to your Postgres DB

https://github.com/pretzelai/stripe-no-webhooks
23•prasoonds•4h ago•12 comments

Oxide raises $200M Series C

https://oxide.computer/blog/our-200m-series-c
448•igrunert•7h ago•227 comments

How did Windows 95 get permission to put the Weezer video Buddy Holly on the CD?

https://devblogs.microsoft.com/oldnewthing/20260210-00/?p=112052
10•ingve•2h ago•2 comments

Show HN: I built a macOS tool for network engineers – it's called NetViews

https://www.netviews.app
137•n1sni•16h ago•41 comments

Toyotas and Terrorists: "Why are ISIS's trucks better than ours?"

https://www.airuniversity.af.edu/Wild-Blue-Yonder/Articles/Article-Display/Article/3600155/toyota...
73•marysminefnuf•1h ago•80 comments

Show HN: I made paperboat.website, a platform for friends and creativity

https://paperboat.website/home/
44•yethiel•4h ago•24 comments

I started programming when I was 7. I'm 50 now and the thing I loved has changed

https://www.jamesdrandall.com/posts/the_thing_i_loved_has_changed/
468•jamesrandall•6h ago•414 comments

Europe's $24T Breakup with Visa and Mastercard Has Begun

https://europeanbusinessmagazine.com/business/europes-24-trillion-breakup-with-visa-and-mastercar...
489•NewCzech•9h ago•432 comments

Parse, Don't Validate (2019)

https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/
203•shirian•6h ago•123 comments

Redefining Go Functions

https://pboyd.io/posts/redefining-go-functions/
73•todsacerdoti•7h ago•21 comments