Open-source framework for real-time AI voice

https://github.com/videosdk-live/agents
27•sagarkava•6mo ago

Comments

sagarkava•6mo ago
Hey

I’m Sagar, co-founder of VideoSDK.

I'm beyond excited to share what we've been building: VideoSDK Real-Time AI Agents. Today, voice is becoming the new UI.

We expect agents to feel human, to understand us, respond instantly, and work seamlessly across web, mobile, and even telephony. But to achieve this, developers have to stitch together STT, LLM, and TTS, glued with HTTP endpoints and a prayer.

This most often results in agents that sound robotic, hallucinate, and fail in production environments without observability. So we built something to solve that.

Now, we are open sourcing it!

Here’s what it offers:

- Global WebRTC infra with <80ms latency
- Native turn detection, VAD, and noise suppression
- Modular pipelines for STT, LLM, TTS, avatars, and real-time model switching
- Built-in RAG + memory for grounding and hallucination resistance
- SDKs for web, mobile, Unity, IoT, and telephony, with no glue code needed
- Agent Cloud to scale infinitely with one-click deployments, or self-host with full control

Think of it like moving from a walkie-talkie to a modern network tower that handles thousands of calls.

VideoSDK gives you the infrastructure to build voice agents that actually work in the real world, at scale.
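To make the "modular pipeline" idea concrete, here is a minimal sketch of an STT → LLM → TTS turn with swappable stages. This is not the VideoSDK API: every class and method name below (`VoicePipeline`, `handle_turn`, the toy stages) is hypothetical, and the stages are stand-ins that run without any external service.

```python
from dataclasses import dataclass
from typing import Protocol


class STT(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LLM(Protocol):
    def reply(self, text: str) -> str: ...


class TTS(Protocol):
    def synthesize(self, text: str) -> bytes: ...


# Toy stand-ins (assumptions for illustration only, not real providers).
class EchoSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class UpperLLM:
    def reply(self, text: str) -> str:
        return text.upper()


class ReverseLLM:
    def reply(self, text: str) -> str:
        return text[::-1]


class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


@dataclass
class VoicePipeline:
    """Hypothetical container that runs one conversational turn."""
    stt: STT
    llm: LLM
    tts: TTS

    def handle_turn(self, audio: bytes) -> bytes:
        text = self.stt.transcribe(audio)   # speech -> text
        answer = self.llm.reply(text)       # text -> response text
        return self.tts.synthesize(answer)  # response text -> audio


pipeline = VoicePipeline(EchoSTT(), UpperLLM(), BytesTTS())
first = pipeline.handle_turn(b"hello")

# Swap a single stage at runtime; the other stages are untouched.
pipeline.llm = ReverseLLM()
second = pipeline.handle_turn(b"hello")
```

Because each stage only has to satisfy a small interface, replacing one provider does not require changes to the others, which is the property the list above describes.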

I'd love your thoughts and questions! Happy to dive deep into architecture, use cases, or crazy edge cases you've been struggling with.

esafak•6mo ago
Do you watermark the output to enable fraud detection?
bigcat12345678•6mo ago
Good! Is there a way to prompt the TTS output tone like ElevenLabs? https://elevenlabs.io/docs/best-practices/prompting/eleven-v...

We are building AI companions, so tone prompting would be great.

bigcat12345678•6mo ago
Got to the HN front page and ignores comments on the post...
httpsterio•6mo ago
and made three accounts to add more praise lol. This should be removed.
sagarkava•6mo ago
Hey bigcat12345678, great question!

Yes, with VideoSDK's Real-Time AI Agents, you can control the TTS output tone, either via prompt engineering (if your TTS provider supports it, like ElevenLabs) or by integrating custom models that support tonal control directly. Our modular pipeline architecture makes it easy to plug in providers like ElevenLabs and pass tone/style prompts dynamically per utterance.

We actually support ElevenLabs out of the box. You can check out the integration details here: https://docs.videosdk.live/ai_agents/plugins/tts/eleven-labs

So if you're building AI companions and want them to sound calm, excited, empathetic, etc., you can absolutely prompt for those tones in real time, or even switch voices or tones mid-conversation based on context or user emotion.
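As a rough illustration of passing a tone per utterance, the sketch below picks a tone from a detected user emotion and prefixes it to the text as a style tag. Everything here is an assumption for illustration: the emotion labels, the `TONE_FOR_EMOTION` mapping, the function names, and the bracketed tag format are hypothetical, and real TTS providers each define their own prompting conventions.

```python
from dataclasses import dataclass

# Hypothetical mapping from a detected user emotion to a tone hint.
TONE_FOR_EMOTION = {
    "frustrated": "calm",
    "sad": "empathetic",
    "happy": "excited",
}


@dataclass(frozen=True)
class Utterance:
    text: str
    tone: str


def plan_utterance(text: str, user_emotion: str, default_tone: str = "neutral") -> Utterance:
    """Choose a tone for this utterance, falling back to a default."""
    tone = TONE_FOR_EMOTION.get(user_emotion, default_tone)
    return Utterance(text=text, tone=tone)


def to_tts_prompt(utterance: Utterance) -> str:
    """Prefix the text with a bracketed style tag (illustrative format only)."""
    return f"[{utterance.tone}] {utterance.text}"


prompt = to_tts_prompt(plan_utterance("I can help with that.", "frustrated"))
```

The tone is decided per call, so it can change mid-conversation as the detected context changes.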

Let us know what you're building. Happy to dive deeper into tone control setups or help debug a specific flow!

chopete3•6mo ago
Is this running in production at any site/company?
sagarkava•6mo ago
Yes, VideoSDK Real-Time AI Agents are already running in production with several partners across different domains — from healthcare assistants to customer support agents and AI companions. These deployments are handling real user interactions at scale, across web, mobile, and even telephony.

If you're curious about specific use cases or want to explore how it can fit into your product, happy to share more details or walk through an example.

vivzkestrel•6mo ago
how does it compare to chatterbox TTS? https://github.com/resemble-ai/chatterbox/
sagarkava•6mo ago
Chatterbox is great for local/private TTS with Resemble AI.

Our voice agent SDK is broader: it's full real-time voice infra with STT, LLM, TTS, memory, and RAG built in. You can plug in Resemble, ElevenLabs, etc., and deploy across web, mobile, and telephony with <80ms latency.

monadoid•6mo ago
Why would I use this vs @openai/openai-agents-python (or openai-agents-ts) - the new realtime agents SDKs?

There are so many AI frameworks out there that live & die so quickly that I am generally hard pressed to use any of these unless there is some killer feature I absolutely need.

avsdk•6mo ago
We're not a model ourselves—we provide the infrastructure that enables you to deploy and use any model of your choice, while simplifying communication through AI agents.
sagarkava•6mo ago
Totally fair. The space moves fast, and it's smart to be skeptical. Here's how VideoSDK Real-Time AI Agents stand out from OpenAI agents SDKs and others:

1. Voice infra included: OpenAI agents handle logic and memory, but they don’t include real-time audio infra.

VideoSDK gives you:

- <80ms global WebRTC latency

- Built-in turn-taking, VAD, and noise suppression

- Real-time voice across web, mobile, IoT, and telephony

2. Fully modular pipeline: No vendor lock-in. Swap STT, LLM, TTS, and avatars. Change models live per user or use case. Want ElevenLabs for tone and OpenAI for reasoning? Easy.

3. Native RAG + memory: Integrated long-term memory and retrieval help reduce hallucinations and keep conversations grounded.

4. Scale-ready: Deploy globally with one click using Agent Cloud or self-host with full control. Built for production use.
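For readers unfamiliar with the turn-taking and VAD item in point 1, here is a deliberately simple sketch of the general idea: an energy-threshold voice activity check plus a silence counter that decides when the speaker's turn has ended. This is a textbook-style stand-in, not VideoSDK's detector; the threshold, frame layout, and function names are all assumptions.

```python
from typing import List, Optional, Sequence


def frame_is_speech(frame: Sequence[int], threshold: float = 500.0) -> bool:
    """Treat a frame as speech if its mean absolute amplitude exceeds the threshold."""
    if not frame:
        return False
    return sum(abs(sample) for sample in frame) / len(frame) > threshold


def end_of_turn_index(
    frames: List[Sequence[int]],
    threshold: float = 500.0,
    silence_frames: int = 3,
) -> Optional[int]:
    """Return the index of the frame that completes `silence_frames`
    consecutive silent frames after speech was heard, or None."""
    heard_speech = False
    silent_run = 0
    for index, frame in enumerate(frames):
        if frame_is_speech(frame, threshold):
            heard_speech = True
            silent_run = 0          # speech resets the silence counter
        elif heard_speech:
            silent_run += 1
            if silent_run >= silence_frames:
                return index
    return None


loud = [1000] * 4   # hypothetical PCM samples for a voiced frame
quiet = [10] * 4    # hypothetical PCM samples for a silent frame
turn_end = end_of_turn_index([quiet, loud, loud, quiet, quiet, quiet, quiet])
```

Production systems typically use trained models rather than a fixed energy threshold, since background noise and pauses within a sentence make this naive rule unreliable.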

If you're building real-time, voice-first agents that need to work across platforms and scale reliably, this is purpose-built for that.

Happy to dive into your use case if you're exploring options.

oldgregg•6mo ago
No demo? No demo video? Nothing?
sagarkava•6mo ago
Hey! Quick video overview: https://www.youtube.com/watch?v=m_oc1GDyhrc

Live demo to try it out: https://aiagent.tryvideosdk.live