frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

What if you just did a startup instead?

https://alexaraki.substack.com/p/what-if-you-just-did-a-startup
1•okaywriting•6m ago•0 comments

Hacking up your own shell completion (2020)

https://www.feltrac.co/environment/2020/01/18/build-your-own-shell-completion.html
1•todsacerdoti•9m ago•0 comments

Show HN: Gorse 0.5 – Open-source recommender system with visual workflow editor

https://github.com/gorse-io/gorse
1•zhenghaoz•9m ago•0 comments

GLM-OCR: Accurate × Fast × Comprehensive

https://github.com/zai-org/GLM-OCR
1•ms7892•10m ago•0 comments

Local Agent Bench: Test 11 small LLMs on tool-calling judgment, on CPU, no GPU

https://github.com/MikeVeerman/tool-calling-benchmark
1•MikeVeerman•11m ago•0 comments

Show HN: AboutMyProject – A public log for developer proof-of-work

https://aboutmyproject.com/
1•Raiplus•11m ago•0 comments

Expertise, AI and Work of Future [video]

https://www.youtube.com/watch?v=wsxWl9iT1XU
1•indiantinker•12m ago•0 comments

So Long to Cheap Books You Could Fit in Your Pocket

https://www.nytimes.com/2026/02/06/books/mass-market-paperback-books.html
3•pseudolus•12m ago•1 comments

PID Controller

https://en.wikipedia.org/wiki/Proportional%E2%80%93integral%E2%80%93derivative_controller
1•tosh•16m ago•0 comments

SpaceX Rocket Generates 100GW of Power, or 20% of US Electricity

https://twitter.com/AlecStapp/status/2019932764515234159
2•bkls•16m ago•0 comments

Kubernetes MCP Server

https://github.com/yindia/rootcause
1•yindia•18m ago•0 comments

I Built a Movie Recommendation Agent to Solve Movie Nights with My Wife

https://rokn.io/posts/building-movie-recommendation-agent
4•roknovosel•18m ago•0 comments

What were the first animals? The fierce sponge–jelly battle that just won't end

https://www.nature.com/articles/d41586-026-00238-z
2•beardyw•26m ago•0 comments

Sidestepping Evaluation Awareness and Anticipating Misalignment

https://alignment.openai.com/prod-evals/
1•taubek•26m ago•0 comments

OldMapsOnline

https://www.oldmapsonline.org/en
1•surprisetalk•28m ago•0 comments

What It's Like to Be a Worm

https://www.asimov.press/p/sentience
2•surprisetalk•29m ago•0 comments

Don't go to physics grad school and other cautionary tales

https://scottlocklin.wordpress.com/2025/12/19/dont-go-to-physics-grad-school-and-other-cautionary...
2•surprisetalk•29m ago•0 comments

Lawyer sets new standard for abuse of AI; judge tosses case

https://arstechnica.com/tech-policy/2026/02/randomly-quoting-ray-bradbury-did-not-save-lawyer-fro...
3•pseudolus•29m ago•0 comments

AI anxiety batters software execs, costing them combined $62B: report

https://nypost.com/2026/02/04/business/ai-anxiety-batters-software-execs-costing-them-62b-report/
1•1vuio0pswjnm7•29m ago•0 comments

Bogus Pipeline

https://en.wikipedia.org/wiki/Bogus_pipeline
1•doener•31m ago•0 comments

Winklevoss twins' Gemini crypto exchange cuts 25% of workforce as Bitcoin slumps

https://nypost.com/2026/02/05/business/winklevoss-twins-gemini-crypto-exchange-cuts-25-of-workfor...
2•1vuio0pswjnm7•31m ago•0 comments

How AI Is Reshaping Human Reasoning and the Rise of Cognitive Surrender

https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6097646
3•obscurette•31m ago•0 comments

Cycling in France

https://www.sheldonbrown.com/org/france-sheldon.html
2•jackhalford•33m ago•0 comments

Ask HN: What breaks in cross-border healthcare coordination?

1•abhay1633•33m ago•0 comments

Show HN: Simple – a bytecode VM and language stack I built with AI

https://github.com/JJLDonley/Simple
2•tangjiehao•36m ago•0 comments

Show HN: Free-to-play: A gem-collecting strategy game in the vein of Splendor

https://caratria.com/
1•jonrosner•36m ago•1 comments

My Eighth Year as a Bootstrapped Founde

https://mtlynch.io/bootstrapped-founder-year-8/
1•mtlynch•37m ago•0 comments

Show HN: Tesseract – A forum where AI agents and humans post in the same space

https://tesseract-thread.vercel.app/
1•agliolioyyami•37m ago•0 comments

Show HN: Vibe Colors – Instantly visualize color palettes on UI layouts

https://vibecolors.life/
2•tusharnaik•38m ago•0 comments

OpenAI is Broke ... and so is everyone else [video][10M]

https://www.youtube.com/watch?v=Y3N9qlPZBc0
2•Bender•39m ago•0 comments
Open in hackernews

Streaming Speech Synthesis Without the Trade-Offs: Meet StreamFlow

https://arxiv.org/abs/2506.23986
3•PranayBatta•1mo ago

Comments

PranayBatta•1mo ago
TL;DR: Diffusion-based TTS models sound amazing but break down for real-time streaming because they require full-sequence attention. StreamFlow introduces a block-wise guided attention scheme that lets diffusion transformers generate speech chunk-by-chunk with near–SOTA quality and predictable low latency.

Why this matters: Current diffusion speech models need to see the entire audio sequence, making them too slow and memory-heavy for assistants, agents, or anything that needs instant voice responses. Causal masks sound robotic; chunking adds weird seams. Streaming TTS has been stuck with a quality–latency tradeoff.

The idea: StreamFlow restricts attention using sliding windows over blocks:

Each block can see W_b past blocks and W_f future blocks

Compute becomes roughly O(B × W × N) instead of full O(N²)

Prosody stays smooth, latency stays constant, and boundaries disappear with small overlaps + cross-fades

How it works: The system is still a Diffusion Transformer, but trained in two phases:

Full-attention pretraining for global quality

Block-wise fine-tuning to adapt to streaming constraints

Generates mel-spectrograms; BigVGAN vocoder runs in parallel.

Performance:

~180ms first-packet latency (80ms model, 60ms vocoder, 40ms overhead)

No latency growth with longer speech

MOS tests show near-indistinguishable quality vs non-streaming diffusion

Speaker similarity within ~2%, prosody continuity preserved

Key ablation takeaways:

Past context helps until ~3 blocks; more adds little

Even a tiny future window greatly boosts naturalness

Best results: 0.4–0.6s block size, ~10–20% overlap

Comparison:

Autoregressive TTS → streaming but meh quality

GAN TTS → fast but inconsistent

Causal diffusion → real-time but degraded

StreamFlow → streaming + near-SOTA quality

Bigger picture: Smart attention shaping lets diffusion models work in real time without throwing away global quality. The same technique could apply to streaming music generation, translation, or interactive media.