Strengths and limitations of diffusion language models

https://www.seangoedecke.com/limitations-of-text-diffusion-models/

72•rbanffy•8mo ago

Comments

cubefox•8mo ago

That's a nice explanation. I wonder whether autoregressive and diffusion language models could be combined such that the model only denoises the (most recent) end of a sequence of text, like a paragraph, while the rest is unchangeable and allows for key-value caching.

gfysfm•8mo ago

Hi, I wrote the post. Thank you!

That’s how it does work, but unfortunately denoising the last paragraph requires computing attention scores for every token in that paragraph, which requires checking those tokens against every token in the sequence. So it’s still much less cacheable than the equivalent autoregressive model.
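To put rough numbers on the caching argument above, here is a minimal sketch that counts query-key score computations. It assumes a single attention layer and head, a prefix whose keys and values are already cached, and a fixed number of denoising steps; the function names and example sizes are hypothetical, not taken from the post.

```python
def ar_scores(prefix_len: int, new_tokens: int) -> int:
    """Query-key scores needed to generate new_tokens autoregressively with a KV cache.

    Each new token issues one query against every earlier token plus itself;
    earlier keys/values are reused from the cache.
    """
    return sum(prefix_len + i for i in range(1, new_tokens + 1))


def block_diffusion_scores(prefix_len: int, block_len: int, steps: int) -> int:
    """Query-key scores needed to denoise one block of block_len tokens.

    Assumption: the prefix keys/values are cached, but every token in the
    block changes at each denoising step, so all block_len queries are
    recomputed against prefix + block on every step.
    """
    return steps * block_len * (prefix_len + block_len)


if __name__ == "__main__":
    prefix, block, steps = 1000, 100, 10  # illustrative sizes only
    print("autoregressive:", ar_scores(prefix, block))
    print("block diffusion:", block_diffusion_scores(prefix, block, steps))
```

Under these assumptions the block needs fewer sequential passes (10 denoising steps instead of 100 token-by-token steps), but roughly ten times as many score computations, because nothing inside the block can be cached between steps.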

billconan•8mo ago

I'm curious: in image generation, flow matching is said to be better than diffusion, so why do these language models still start from diffusion instead of jumping straight to flow matching?

gessha•8mo ago

This is just a guess, but I think it’s because diffusion training is more popular, so we’ve worked out more of the kinks with those models. Flow matching models might follow once people figure out some of their hyperparameters.

mountainriver•8mo ago

A big discussion on this happened here as well: https://news.ycombinator.com/item?id=44057820

There is quite a bit of evidence diffusion models work better at reasoning because they don't suffer from early token bias.

https://github.com/HKUNLP/diffusion-vs-ar

https://arxiv.org/html/2410.14157v3

accrual•8mo ago

Great overview. I wonder if we'll start to see more text diffusion models from other players, or maybe even a mixture of diffusion and transformer models alternating roles behind a single UI, depending on the context and request.

shrubhub•8mo ago

The diffusion models are (or can be) transformer models! They're just not autoregressive.
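One way to picture the distinction is through the attention mask. The sketch below assumes the diffusion denoiser uses full (bidirectional) attention while the autoregressive model uses a causal mask; the helper names are hypothetical and no particular model's implementation is implied.

```python
def causal_mask(n: int) -> list[list[bool]]:
    """Autoregressive transformer: position i may attend only to positions j <= i."""
    return [[j <= i for j in range(n)] for i in range(n)]


def bidirectional_mask(n: int) -> list[list[bool]]:
    """Assumed diffusion-style denoiser: every position may attend to every position."""
    return [[True] * n for _ in range(n)]


def render(mask: list[list[bool]]) -> str:
    """Draw a mask as text: 'x' where attention is allowed, '.' where it is blocked."""
    return "\n".join("".join("x" if ok else "." for ok in row) for row in mask)


if __name__ == "__main__":
    print(render(causal_mask(3)))
    print()
    print(render(bidirectional_mask(3)))
```

The layers can be the same transformer blocks in both cases; in this sketch only the mask (and, in practice, the training objective) differs.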
