
Prejudice Against Leprosy

https://text.npr.org/g-s1-108321
1•hi41•1m ago•0 comments

Slint: Cross Platform UI Library

https://slint.dev/
1•Palmik•5m ago•0 comments

AI and Education: Generative AI and the Future of Critical Thinking

https://www.youtube.com/watch?v=k7PvscqGD24
1•nyc111•5m ago•0 comments

Maple Mono: Smooth your coding flow

https://font.subf.dev/en/
1•signa11•6m ago•0 comments

Moltbook isn't real but it can still hurt you

https://12gramsofcarbon.com/p/tech-things-moltbook-isnt-real-but
1•theahura•10m ago•0 comments

Take Back the Em Dash–and Your Voice

https://spin.atomicobject.com/take-back-em-dash/
1•ingve•10m ago•0 comments

Show HN: 289x speedup over MLP using Spectral Graphs

https://zenodo.org/login/?next=%2Fme%2Fuploads%3Fq%3D%26f%3Dshared_with_me%25253Afalse%26l%3Dlist...
1•andrespi•11m ago•0 comments

Teaching Mathematics

https://www.karlin.mff.cuni.cz/~spurny/doc/articles/arnold.htm
1•samuel246•14m ago•0 comments

3D Printed Microfluidic Multiplexing [video]

https://www.youtube.com/watch?v=VZ2ZcOzLnGg
2•downboots•14m ago•0 comments

Abstractions Are in the Eye of the Beholder

https://software.rajivprab.com/2019/08/29/abstractions-are-in-the-eye-of-the-beholder/
2•whack•14m ago•0 comments

Show HN: Routed Attention – 75-99% savings by routing between O(N) and O(N²)

https://zenodo.org/records/18518956
1•MikeBee•14m ago•0 comments

We didn't ask for this internet – Ezra Klein show [video]

https://www.youtube.com/shorts/ve02F0gyfjY
1•softwaredoug•15m ago•0 comments

The Real AI Talent War Is for Plumbers and Electricians

https://www.wired.com/story/why-there-arent-enough-electricians-and-plumbers-to-build-ai-data-cen...
2•geox•18m ago•0 comments

Show HN: MimiClaw, OpenClaw (Clawdbot) on $5 Chips

https://github.com/memovai/mimiclaw
1•ssslvky1•18m ago•0 comments

I Maintain My Blog in the Age of Agents

https://www.jerpint.io/blog/2026-02-07-how-i-maintain-my-blog-in-the-age-of-agents/
3•jerpint•18m ago•0 comments

The Fall of the Nerds

https://www.noahpinion.blog/p/the-fall-of-the-nerds
1•otoolep•20m ago•0 comments

I'm 15 and built a free tool for reading Greek/Latin texts. Would love feedback

https://the-lexicon-project.netlify.app/
2•breadwithjam•23m ago•1 comments

How close is AI to taking my job?

https://epoch.ai/gradient-updates/how-close-is-ai-to-taking-my-job
1•cjbarber•23m ago•0 comments

You are the reason I am not reviewing this PR

https://github.com/NixOS/nixpkgs/pull/479442
2•midzer•25m ago•1 comments

Show HN: FamilyMemories.video – Turn static old photos into 5s AI videos

https://familymemories.video
1•tareq_•27m ago•0 comments

How Meta Made Linux a Planet-Scale Load Balancer

https://softwarefrontier.substack.com/p/how-meta-turned-the-linux-kernel
1•CortexFlow•27m ago•0 comments

A Turing Test for AI Coding

https://t-cadet.github.io/programming-wisdom/#2026-02-06-a-turing-test-for-ai-coding
2•phi-system•27m ago•0 comments

How to Identify and Eliminate Unused AWS Resources

https://medium.com/@vkelk/how-to-identify-and-eliminate-unused-aws-resources-b0e2040b4de8
3•vkelk•28m ago•0 comments

A2CDVI – HDMI output from the Apple IIc's digital video output connector

https://github.com/MrTechGadget/A2C_DVI_SMD
2•mmoogle•28m ago•0 comments

CLI for Common Playwright Actions

https://github.com/microsoft/playwright-cli
3•saikatsg•29m ago•0 comments

Would you use an e-commerce platform that shares transaction fees with users?

https://moondala.one/
1•HamoodBahzar•31m ago•1 comments

Show HN: SafeClaw – a way to manage multiple Claude Code instances in containers

https://github.com/ykdojo/safeclaw
3•ykdojo•34m ago•0 comments

The Future of the Global Open-Source AI Ecosystem: From DeepSeek to AI+

https://huggingface.co/blog/huggingface/one-year-since-the-deepseek-moment-blog-3
3•gmays•35m ago•0 comments

The Evolution of the Interface

https://www.asktog.com/columns/038MacUITrends.html
2•dhruv3006•36m ago•1 comments

Azure: Virtual network routing appliance overview

https://learn.microsoft.com/en-us/azure/virtual-network/virtual-network-routing-appliance-overview
3•mariuz•36m ago•0 comments

TiDAR: Think in Diffusion, Talk in Autoregression

https://arxiv.org/abs/2511.08923
130•internetguy•2mo ago

Comments

Alifatisk•2mo ago
I've tried dLLMs like Mercury and they look promising.
Workaccount2•2mo ago
An update to Gemini diffusion is one of my most eagerly anticipated AI releases. It released to mild fanfare (mostly because you needed to request access to use it), and there has been silence ever since.

Hopefully it's not more Google abandonware, because it was wicked fast and a delight to use

ACCount37•2mo ago
It's not a very promising direction because autoregressive LLMs still deliver better output quality per model weight, as a rule.

Now, is it possible that a model can combine the advantages of both? Combine the fast generation and multidirectional causality of diffusion with the precision, capabilities, and generalization of autoregression?

Maybe. This paper is research in that direction. So far, it's not a clear upgrade over autoregressive LLMs.

euleriancon•2mo ago
Diffusion LMs do seem to be able to get more out of the same data. In a world where we are already training transformer-based LLMs on all available text, diffusion LMs' ability to keep learning on a fixed set of data may let them outperform transformers.

https://arxiv.org/abs/2511.03276

nbardy•2mo ago
There's another paper that shows you can get the same effect by training autoregressive models on fill-in-the-middle data.

So it's more about the masked-modeling objective than diffusion.

albertzeyer•2mo ago
Which paper is that?
ilaksh•2mo ago
4-5 times faster with minimal change in quality seems like a clear upgrade in efficiency.
zaptrem•2mo ago
Latency may be better, but throughput (the thing companies care about) may be the same or worse, since at every step the entire diffusion window has to be passed through the model. With AR models only the most recent token goes through, which is much more compute efficient, allowing you to stay memory bound. The trade-off with these models is getting more than one token per forward pass, but I don't know at what point that becomes worth it (probably depends on the model and the diffusion window size).
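
For intuition, here is a rough back-of-envelope cost model of that trade-off in Python; the model size, bandwidth, and FLOP figures are assumed purely for illustration and are not from the paper:

    # Toy cost model for one decoding step (all numbers are illustrative).
    # AR: only the newest token goes through the model, so a batch-1 step is
    # dominated by streaming the weights (memory bound) and yields 1 token.
    # Diffusion: the whole draft window goes through every step (more compute),
    # but several tokens may be committed per pass.
    # At large batch sizes both become compute bound, and the window-sized
    # compute is what hurts diffusion throughput.

    def time_per_token(weight_bytes, bandwidth, flops_per_pos, flops_rate,
                       window, tokens_committed):
        mem_time = weight_bytes / bandwidth              # weights read once per pass
        compute_time = window * flops_per_pos / flops_rate
        return max(mem_time, compute_time) / tokens_committed

    W = 14e9     # ~7B params at 2 bytes each (assumed)
    BW = 2e12    # ~2 TB/s HBM bandwidth (assumed)
    F = 14e9     # ~2 FLOPs per parameter per position (assumed)
    FR = 300e12  # ~300 TFLOP/s sustained (assumed)

    ar = time_per_token(W, BW, F, FR, window=1, tokens_committed=1)
    dm = time_per_token(W, BW, F, FR, window=64, tokens_committed=8)
    print(f"AR (batch 1):        {ar * 1e3:.2f} ms/token")
    print(f"Diffusion (batch 1): {dm * 1e3:.2f} ms/token")
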
fragmede•2mo ago
> still deliver better output quality per model weight, as a rule.

Is it possible to quantify that and just have a linked slider for quality and speed? If I can get an answer that's 80% right in 1/10th the time, and then iterate on that, who comes out ahead?

jrk•2mo ago
Yes, but you can also do the same thing with autoregressive models just by making them smaller. This tradeoff always exists; the question is whether the Pareto curve for diffusion models ever crosses or dominates the best autoregressive option at the same throughput (or quality).
ricochet11•2mo ago
Perhaps the issue is that text often has directionality.

https://arxiv.org/abs/2401.17505

vintermann•2mo ago
As a rule, but the devil is in the details. The thing, the one big thing I want to use multimodal LLMs for, is accessing the data in historical, mostly handwritten texts.

None of the big LLMs do an acceptable job. This is a task a trained human can do, but it's a lot of work. You have to learn not just the script style of the period (which can vary far more than people think), but even the idiosyncrasies of a given writer. All the time, you run into an unreadable word, and you need to look around for context that might give a clue, or for other places where the same word (or a similar-looking word) is used in cleaner contexts. It's very much not a beginning-to-end task; trying to read a document from start to end would be like solving a crossword puzzle in strict left-to-right, top-to-bottom order.

Maybe autoregressive models can eventually become powerful enough that they can just do that! But so far, they haven't. And I have a lot more faith that the diffusion approach is closer to how you have to do it.

ACCount37•2mo ago
That looks like something that can be solved by autoregressive models of today, no architectural changes needed.

What you need is: good image understanding (at least GPT-5 tier), general-purpose training on reasoning over images, and then some domain-specific training, or at least some few-shot guidance to get it to adopt the correct reasoning patterns.

If I had to guess which model would be able to do it best out of the box, few-shot, I'd say Gemini 3 Pro.

There is nothing preventing an autoregressive LLM from revisiting images and rewriting the texts as new clues come in. This is how they can solve puzzles like sudoku.

vintermann•2mo ago
Try for yourself, if you want to:

https://urn.digitalarkivet.no/URN:NBN:no-a1450-rg60085808000...

gdiamos•2mo ago
Diffusion is favored by current GPUs.

Over time we seem to have a tendency to build models that are well matched to our machines.

HPsquared•2mo ago
Are TPUs different?
vlovich123•2mo ago
Not really. The problem is that transformer LLMs are autoregressive and are O(n^2) for self-attention, and they also require insane amounts of bandwidth to "page in" the weights to the relevant compute units. TPUs do this faster than a CPU, like any accelerator, but fundamentally this is a challenge. There are attempts to build hardware where the weights are burned into the silicon, but that carries other meaningful downsides.

But OP is referring to the fact that diffusion is friendlier on bandwidth and doesn't need large n^2 compute blocks in the critical path.
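
For reference, the n^2 term is the attention score matrix itself; a minimal NumPy sketch of generic single-head self-attention (not any particular model's code) makes it explicit:

    import numpy as np

    def self_attention(x, Wq, Wk, Wv):
        # x is (n, d); scores is (n, n), which is the O(n^2) compute and memory term
        q, k, v = x @ Wq, x @ Wk, x @ Wv
        scores = q @ k.T / np.sqrt(k.shape[-1])
        mask = np.triu(np.ones(scores.shape, dtype=bool), 1)   # causal: no future tokens
        scores = np.where(mask, -np.inf, scores)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ v

    n, d = 1024, 64
    rng = np.random.default_rng(0)
    x = rng.standard_normal((n, d))
    Wq, Wk, Wv = (rng.standard_normal((d, d)) * d**-0.5 for _ in range(3))
    print(self_attention(x, Wq, Wk, Wv).shape)   # (1024, 64)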

thethirdone•2mo ago
In this paper both the diffusion and the auto-regressive models are transformers with O(n^2) performance for long sequences. They share the "Exact KV Cache" for committed tokens.

Diffusion just allows you to spend more compute at the same time so you don't redundantly access the same memory. It can only improve speed beyond the memory bandwidth limit by committing multiple tokens each pass.

Other linear models like Mamba get away from O(n^2) effects, but the type of neural architecture is orthogonal to the method of generation.
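
A toy decode loop showing that structure (the class and method names below are made up for illustration; this is not the paper's code): committed tokens keep their cache, only the draft window is re-run, and a variable-length prefix of each draft gets committed:

    import random

    class ToyModel:
        # Stand-in model: the "KV cache" is just the list of committed tokens.
        def prefill(self, prompt):
            return list(prompt)
        def denoise(self, cache, window):
            # pretend to denoise a draft window conditioned on the cache
            return [random.randint(0, 99) for _ in range(window)]
        def commit(self, draft):
            # accept a random-length prefix of the draft (1..window tokens)
            return draft[:random.randint(1, len(draft))]

    def decode(model, prompt, window=8, max_new=32):
        cache = model.prefill(prompt)              # exact cache for committed tokens
        new_tokens, passes = [], 0
        while len(new_tokens) < max_new:
            draft = model.denoise(cache, window)   # whole window runs every pass
            accepted = model.commit(draft)
            new_tokens.extend(accepted)
            cache.extend(accepted)                 # cache grows only for committed tokens
            passes += 1
        return new_tokens[:max_new], passes

    random.seed(0)
    tokens, passes = decode(ToyModel(), prompt=[1, 2, 3])
    print(f"{len(tokens)} tokens in {passes} passes "
          f"(~{len(tokens) / passes:.1f} committed per forward pass)")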

Bolwin•2mo ago
That's bizarre, because I would expect the opposite: for reasoning you go step by step, and when you're done you quickly diffuse the answer.
naasking•2mo ago
Unification in logic programming isn't a forwards-only process, so there's no reason to expect deduction in an AI to proceed in a sort of procedural step by step fashion either. What ultimately matters is that all of the various deductions unify coherently in the end.
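
A minimal unification sketch (illustrative only, not from any logic-programming library) shows that order-independence: the same constraints can be processed forwards or backwards, and the resolved answer is the same as long as they are jointly consistent:

    # Lowercase strings are variables; tuples are compound terms, e.g. ("F", "y") for F(y).
    # Occurs check omitted for brevity.

    def walk(t, subst):
        while isinstance(t, str) and t.islower() and t in subst:
            t = subst[t]
        return t

    def unify(a, b, subst):
        a, b = walk(a, subst), walk(b, subst)
        if a == b:
            return subst
        if isinstance(a, str) and a.islower():
            return {**subst, a: b}
        if isinstance(b, str) and b.islower():
            return {**subst, b: a}
        if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
            for x, y in zip(a, b):
                subst = unify(x, y, subst)
                if subst is None:
                    return None
            return subst
        return None

    def resolve(t, subst):
        t = walk(t, subst)
        return tuple(resolve(x, subst) for x in t) if isinstance(t, tuple) else t

    # Same constraints in two orders: x = F(y), y = A, F(A) = x.
    constraints = [("x", ("F", "y")), ("y", "A"), (("F", "A"), "x")]
    for order in (constraints, list(reversed(constraints))):
        s = {}
        for a, b in order:
            s = unify(a, b, s)
            assert s is not None, "constraints are inconsistent"
        print(resolve("x", s), resolve("y", s))   # same result either way
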
octoberfranklin•2mo ago
Exactly.

If you add a "cheat" rule that lets you deduce anything from something else, then replacing these cheat rule applications with real subgoal proofs is denoising for Natural Deduction.

wongarsu•2mo ago
However, after step 4 you might notice that you made a mistake in step 2 and revise it. You might think in steps, but the state you are building up is formed in a somewhat diffusion-like way.