frontpage.

Anthropic: Latest Claude model finds more than 500 vulnerabilities

https://www.scworld.com/news/anthropic-latest-claude-model-finds-more-than-500-vulnerabilities
1•Bender•3m ago•0 comments

Brooklyn cemetery plans human composting option, stirring interest and debate

https://www.cbsnews.com/newyork/news/brooklyn-green-wood-cemetery-human-composting/
1•geox•3m ago•0 comments

Why the 'Strivers' Are Right

https://greyenlightenment.com/2026/02/03/the-strivers-were-right-all-along/
1•paulpauper•4m ago•0 comments

Brain Dumps as a Literary Form

https://davegriffith.substack.com/p/brain-dumps-as-a-literary-form
1•gmays•5m ago•0 comments

Agentic Coding and the Problem of Oracles

https://epkconsulting.substack.com/p/agentic-coding-and-the-problem-of
1•qingsworkshop•5m ago•0 comments

Malicious packages for dYdX cryptocurrency exchange empty user wallets

https://arstechnica.com/security/2026/02/malicious-packages-for-dydx-cryptocurrency-exchange-empt...
1•Bender•5m ago•0 comments

Show HN: I built a <400ms latency voice agent that runs on a 4GB VRAM GTX 1650

https://github.com/pheonix-delta/axiom-voice-agent
1•shubham-coder•6m ago•0 comments

Penisgate erupts at Olympics; scandal exposes risks of bulking your bulge

https://arstechnica.com/health/2026/02/penisgate-erupts-at-olympics-scandal-exposes-risks-of-bulk...
3•Bender•6m ago•0 comments

Arcan Explained: A browser for different webs

https://arcan-fe.com/2026/01/26/arcan-explained-a-browser-for-different-webs/
1•fanf2•8m ago•0 comments

What did we learn from the AI Village in 2025?

https://theaidigest.org/village/blog/what-we-learned-2025
1•mrkO99•8m ago•0 comments

An open replacement for the IBM 3174 Establishment Controller

https://github.com/lowobservable/oec
1•bri3d•11m ago•0 comments

The P in PGP isn't for pain: encrypting emails in the browser

https://ckardaris.github.io/blog/2026/02/07/encrypted-email.html
2•ckardaris•13m ago•0 comments

Show HN: Mirror Parliament where users vote on top of politicians and draft laws

https://github.com/fokdelafons/lustra
1•fokdelafons•14m ago•1 comment

Ask HN: Opus 4.6 ignoring instructions, how to use 4.5 in Claude Code instead?

1•Chance-Device•15m ago•0 comments

We Mourn Our Craft

https://nolanlawson.com/2026/02/07/we-mourn-our-craft/
1•ColinWright•18m ago•0 comments

Jim Fan calls pixels the ultimate motor controller

https://robotsandstartups.substack.com/p/humanoids-platform-urdf-kitchen-nvidias
1•robotlaunch•21m ago•0 comments

Exploring a Modern SMPTE 2110 Broadcast Truck with My Dad

https://www.jeffgeerling.com/blog/2026/exploring-a-modern-smpte-2110-broadcast-truck-with-my-dad/
1•HotGarbage•22m ago•0 comments

AI UX Playground: Real-world examples of AI interaction design

https://www.aiuxplayground.com/
1•javiercr•22m ago•0 comments

The Field Guide to Design Futures

https://designfutures.guide/
1•andyjohnson0•23m ago•0 comments

The Other Leverage in Software and AI

https://tomtunguz.com/the-other-leverage-in-software-and-ai/
1•gmays•25m ago•0 comments

AUR malware scanner written in Rust

https://github.com/Sohimaster/traur
3•sohimaster•27m ago•1 comment

Free FFmpeg API [video]

https://www.youtube.com/watch?v=6RAuSVa4MLI
3•harshalone•27m ago•1 comment

Are AI agents ready for the workplace? A new benchmark raises doubts

https://techcrunch.com/2026/01/22/are-ai-agents-ready-for-the-workplace-a-new-benchmark-raises-do...
2•PaulHoule•32m ago•0 comments

Show HN: AI Watermark and Stego Scanner

https://ulrischa.github.io/AIWatermarkDetector/
1•ulrischa•33m ago•0 comments

Clarity vs. complexity: the invisible work of subtraction

https://www.alexscamp.com/p/clarity-vs-complexity-the-invisible
1•dovhyi•33m ago•0 comments

Solid-State Freezer Needs No Refrigerants

https://spectrum.ieee.org/subzero-elastocaloric-cooling
2•Brajeshwar•34m ago•0 comments

Ask HN: Will LLMs/AI Decrease Human Intelligence and Make Expertise a Commodity?

1•mc-0•35m ago•1 comment

From Zero to Hero: A Brief Introduction to Spring Boot

https://jcob-sikorski.github.io/me/writing/from-zero-to-hello-world-spring-boot
1•jcob_sikorski•35m ago•1 comment

NSA detected phone call between foreign intelligence and person close to Trump

https://www.theguardian.com/us-news/2026/feb/07/nsa-foreign-intelligence-trump-whistleblower
14•c420•36m ago•2 comments

How to Fake a Robotics Result

https://itcanthink.substack.com/p/how-to-fake-a-robotics-result
1•ai_critic•36m ago•0 comments

Streaming Speech Synthesis Without the Trade-Offs: Meet StreamFlow

https://arxiv.org/abs/2506.23986
3•PranayBatta•1mo ago

Comments

PranayBatta•1mo ago
TL;DR: Diffusion-based TTS models sound amazing but break down for real-time streaming because they require full-sequence attention. StreamFlow introduces a block-wise guided attention scheme that lets diffusion transformers generate speech chunk-by-chunk with near–SOTA quality and predictable low latency.

Why this matters: Current diffusion speech models need to see the entire audio sequence, making them too slow and memory-heavy for assistants, agents, or anything that needs instant voice responses. Causal masks sound robotic; chunking adds weird seams. Streaming TTS has been stuck with a quality–latency tradeoff.

The idea: StreamFlow restricts attention using sliding windows over blocks:

Each block can see W_b past blocks and W_f future blocks

Compute becomes roughly O(B × W × N) instead of full O(N²)

Prosody stays smooth, latency stays constant, and boundaries disappear with small overlaps + cross-fades
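The block-level sliding window described above can be sketched as an attention mask (a minimal NumPy illustration; the block count and window sizes W_b, W_f are made-up values, not from the paper's code):

```python
import numpy as np

def block_attention_mask(num_blocks: int, w_b: int, w_f: int) -> np.ndarray:
    """True where block i may attend to block j, i.e. i - w_b <= j <= i + w_f."""
    idx = np.arange(num_blocks)
    diff = idx[None, :] - idx[:, None]      # entry [i, j] = j - i
    return (diff >= -w_b) & (diff <= w_f)

mask = block_attention_mask(num_blocks=6, w_b=2, w_f=1)
# Each row i is True only for blocks j in [i - 2, i + 1]
print(mask.astype(int))
```

The number of allowed block pairs is roughly B × (W_b + W_f + 1) rather than B², which is where the claimed compute savings over full attention come from.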

How it works: The system is still a Diffusion Transformer, but trained in two phases:

Full-attention pretraining for global quality

Block-wise fine-tuning to adapt to streaming constraints

Generates mel-spectrograms; BigVGAN vocoder runs in parallel.
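The "small overlaps + cross-fades" trick for hiding block boundaries amounts to linear overlap-add when stitching consecutive chunks. A toy sketch on raw arrays (the chunking and overlap length are illustrative, not StreamFlow's actual pipeline):

```python
import numpy as np

def crossfade_append(stream: np.ndarray, chunk: np.ndarray, overlap: int) -> np.ndarray:
    """Append `chunk` to `stream`, linearly cross-fading over `overlap` samples."""
    if stream.size == 0 or overlap == 0:
        return np.concatenate([stream, chunk])
    fade_out = np.linspace(1.0, 0.0, overlap)   # old chunk ramps down
    fade_in = 1.0 - fade_out                    # new chunk ramps up
    blended = stream[-overlap:] * fade_out + chunk[:overlap] * fade_in
    return np.concatenate([stream[:-overlap], blended, chunk[overlap:]])

stream = np.array([])
for chunk in (np.ones(8), np.ones(8) * 3.0):
    stream = crossfade_append(stream, chunk, overlap=4)
print(stream.shape)  # 8 + 8 - 4 = (12,)
```

The blended region ramps smoothly from the old chunk's values to the new chunk's, which is why seams disappear instead of clicking.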

Performance:

~180ms first-packet latency (80ms model, 60ms vocoder, 40ms overhead)

No latency growth with longer speech

MOS tests show near-indistinguishable quality vs non-streaming diffusion

Speaker similarity within ~2%, prosody continuity preserved

Key ablation takeaways:

Past context helps until ~3 blocks; more adds little

Even a tiny future window greatly boosts naturalness

Best results: 0.4–0.6s block size, ~10–20% overlap
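To get a feel for that sweet spot in mel frames: assuming a typical 24 kHz sample rate and 256-sample hop (my assumptions, not figures from the paper), the reported block sizes work out to a few dozen frames per block:

```python
# Rough frame-budget arithmetic; SAMPLE_RATE and HOP_LENGTH are assumed
# typical vocoder settings, not values taken from the StreamFlow paper.
SAMPLE_RATE = 24_000   # Hz
HOP_LENGTH = 256       # audio samples per mel frame (~10.7 ms each)

def block_frames(block_sec: float, overlap_frac: float) -> tuple[int, int]:
    """Return (frames per block, overlapping frames) for a given block size."""
    frames = round(block_sec * SAMPLE_RATE / HOP_LENGTH)
    return frames, round(frames * overlap_frac)

print(block_frames(0.4, 0.15))
print(block_frames(0.6, 0.15))
```

So a 0.4–0.6 s block with ~15% overlap is only a handful of shared frames, which is cheap relative to the quality gain from the future window.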

Comparison:

Autoregressive TTS → streaming but meh quality

GAN TTS → fast but inconsistent

Causal diffusion → real-time but degraded

StreamFlow → streaming + near-SOTA quality

Bigger picture: Smart attention shaping lets diffusion models work in real time without throwing away global quality. The same technique could apply to streaming music generation, translation, or interactive media.