We asked Neo AI to build a small voice assistant pipeline that runs with low latency on CPU instead of requiring a GPU.

The goal was to see how responsive an LLM → speech system can be on normal laptops or edge devices.

It includes:
- Voice Activity Detection
- CPU-friendly LLM + TTS streaming
- Async pipeline to reduce latency
- Modular LLM backend
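To make the async streaming idea concrete, here is a rough sketch (not the repo's actual code) of one common way to cut perceived latency: hand each finished sentence to TTS while the LLM is still generating the next one. `fake_llm_stream` and `fake_tts` are hypothetical placeholders standing in for whatever backend and TTS model you plug in.

```python
import asyncio
import re

# Hypothetical stand-ins for a streaming LLM backend and a TTS engine.
# They are placeholders for illustration, not functions from the repo.
async def fake_llm_stream(prompt):
    """Yield tokens one at a time, as a streaming LLM backend would."""
    for token in ["Hello", " there", ".", " How", " can", " I", " help", "?"]:
        await asyncio.sleep(0)  # yield control, simulating per-token work
        yield token

async def fake_tts(sentence):
    """Pretend to synthesize audio; returns bytes tagged with the text."""
    await asyncio.sleep(0)
    return f"<audio:{sentence}>".encode()

SENTENCE_END = re.compile(r"[.!?]$")

async def produce_sentences(prompt, queue):
    """Buffer tokens and emit each sentence as soon as it is complete."""
    buffer = ""
    async for token in fake_llm_stream(prompt):
        buffer += token
        if SENTENCE_END.search(buffer):
            await queue.put(buffer.strip())
            buffer = ""
    if buffer.strip():
        await queue.put(buffer.strip())  # flush any trailing partial sentence
    await queue.put(None)  # sentinel: no more sentences

async def synthesize(queue, out):
    """Consume sentences and synthesize them while the LLM keeps generating."""
    while True:
        sentence = await queue.get()
        if sentence is None:
            break
        out.append(await fake_tts(sentence))

async def run_pipeline(prompt):
    # A small bounded queue keeps the LLM from running far ahead of TTS.
    queue = asyncio.Queue(maxsize=4)
    out = []
    await asyncio.gather(produce_sentences(prompt, queue), synthesize(queue, out))
    return out
```

Calling `asyncio.run(run_pipeline("hi"))` returns one audio chunk per sentence. Splitting on sentence punctuation is only one possible chunking rule; shorter chunks start playback sooner but can hurt prosody.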

Useful for local assistants, robotics prototypes, privacy-first setups, or benchmarking STT/LLM/TTS latency.
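For the benchmarking use case, a minimal per-stage timing harness might look like the sketch below. The three stage functions are hypothetical placeholders, not the repo's API; swap in real STT/LLM/TTS calls. It times whole stages, so for a streaming setup you would likely also want time-to-first-audio.

```python
import time

def timed_stage(name, fn, arg, timings):
    """Run one stage and record its wall-clock duration in seconds."""
    start = time.perf_counter()
    result = fn(arg)
    timings[name] = time.perf_counter() - start
    return result

# Hypothetical placeholder stages; replace with real model calls.
def fake_stt(audio):
    return "what time is it"

def fake_llm(text):
    return "It is noon."

def fake_tts(text):
    return b"\x00" * 1600  # dummy audio buffer

def benchmark_turn(audio):
    """Time a single STT -> LLM -> TTS turn and return audio plus timings."""
    timings = {}
    text = timed_stage("stt", fake_stt, audio, timings)
    reply = timed_stage("llm", fake_llm, text, timings)
    speech = timed_stage("tts", fake_tts, reply, timings)
    timings["total"] = timings["stt"] + timings["llm"] + timings["tts"]
    return speech, timings
```

Running `benchmark_turn` over many recorded utterances and looking at the median and tail of each stage gives a clearer picture than a single run, since CPU timings vary with thermal state and background load.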

We’ve been experimenting with similar CPU-first pipelines inside NEO workflows for on-device agents, and this repo is a minimal standalone version.

Would love suggestions on lightweight STT/TTS models or latency tricks people have used on CPU.

Show HN: Kitten TTS Based Low-Latency Streaming Voice Assistant on CPU