Diffusion Beats Autoregressive in Data-Constrained Settings

https://blog.ml.cmu.edu/2025/09/22/diffusion-beats-autoregressive-in-data-constrained-settings/
72•djoldman•4mo ago

Comments

blurbleblurble•4mo ago
I have a feeling this technique might make waves: https://openreview.net/forum?id=c05qIG1Z2B#discussion
tripplyons•4mo ago
There are definitely parallels between diffusion and reasoning models, chiefly that both can spend more compute to get a better solution: diffusion by using a more precise ODE solver, reasoning by using more tokens.

However, due to how diffusion models are trained, they never see their own predictions as input, so they cannot learn to store information across steps. This is the complete opposite for reasoning models.
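To make the "more precise ODE solver" point concrete, here is a minimal sketch, assuming a toy ODE in place of a real diffusion sampler. The function `toy_velocity` is a hypothetical stand-in for a learned velocity field, and fixed-step Euler is only one possible solver. It shows the error shrinking as more solver steps (more compute) are spent:

```python
import math


def euler_solve(f, x0, t0, t1, num_steps):
    """Integrate dx/dt = f(x, t) from t0 to t1 with fixed-step Euler."""
    x, t = x0, t0
    dt = (t1 - t0) / num_steps
    for _ in range(num_steps):
        x = x + dt * f(x, t)
        t = t + dt
    return x


def toy_velocity(x, t):
    # Hypothetical stand-in for a learned velocity field: dx/dt = -x.
    return -x


# Exact solution of dx/dt = -x with x(0) = 1, evaluated at t = 1.
exact = math.exp(-1.0)

# More steps means more function evaluations and a smaller error.
errors = {
    n: abs(euler_solve(toy_velocity, 1.0, 0.0, 1.0, n) - exact)
    for n in (4, 16, 64, 256)
}
for n, err in errors.items():
    print(f"{n:4d} steps: error = {err:.6f}")
```

In a real sampler each step is a full network evaluation, so the step count is the knob that trades compute for sample quality.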

yorwba•4mo ago
You can train a diffusion model using its own predictions as input, no problem at all.
tripplyons•4mo ago
At that point it is not following a diffusion training objective. I am aware of papers that do this, but I have not seen one that shows it as a better pretraining objective than something like v-prediction or flow matching.
mxwsn•4mo ago
Why is that not a diffusion training objective? The technique is known as self-conditioning, right? Is it an issue with conditional Tweedie's?
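For readers unfamiliar with self-conditioning, the sketch below shows the general shape of one training step, under loud assumptions: `toy_denoiser` is a hypothetical stand-in for a network, the corruption is plain additive Gaussian noise, and the 50% hint probability is an illustrative default. It is not the method of any paper linked in this thread.

```python
import random


def toy_denoiser(x_t, x0_hint, weight=0.5):
    # Hypothetical stand-in for a network f(x_t, x0_hint) -> estimate of x0.
    return [weight * a + (1.0 - weight) * b for a, b in zip(x_t, x0_hint)]


def self_conditioned_loss(x0, noise_level, rng, self_cond_prob=0.5):
    """One training-step loss with optional self-conditioning."""
    # Corrupt the clean data (assumed additive Gaussian noise).
    x_t = [a + noise_level * rng.gauss(0.0, 1.0) for a in x0]

    # Default hint is all zeros, meaning "no previous estimate".
    hint = [0.0] * len(x0)
    used_hint = rng.random() < self_cond_prob
    if used_hint:
        # First pass produces the model's own estimate. In an autodiff
        # framework this value would be detached so no gradient flows through it.
        hint = toy_denoiser(x_t, hint)

    # Second pass is conditioned on the hint; the loss is still the usual
    # denoising regression against the clean data.
    pred = toy_denoiser(x_t, hint)
    loss = sum((p - a) ** 2 for p, a in zip(pred, x0)) / len(x0)
    return loss, used_hint


rng = random.Random(0)
loss, used_hint = self_conditioned_loss([1.0, -2.0, 0.5], noise_level=0.3, rng=rng)
print(f"loss={loss:.4f} used_hint={used_hint}")
```

Note that the regression target is unchanged in this sketch; only the model's input gains an extra channel, which is the crux of the disagreement above.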
blurbleblurble•4mo ago
I'm probably not understanding your point but did you look at the paper? This explicitly does diffusion in an autoencoded latent space of the autoregressive prediction itself. The starting point is that prediction, but depending on how much noise is used, the diffusion model itself directly contributes to the prediction process to some degree or another.

It should be trivial to make an encoder that has some memory of at least part of the prompt (say the trailing part) and do a diffusion step there too.

smokel•4mo ago
I fail to understand why we would lack data. Sure, there is limited (historical) text, but if we just open up all available video, and send out interactive robots into the world, we'll drown in data. Then there is simulated data, and tons of sensors that can capture vast amounts of even more data.

Edit: from the source [1], this quote pretty much sums it all up: "Our 2022 paper predicted that high-quality text data would be fully used by 2024, whereas our new results indicate that might not happen until 2028."

[1] https://epoch.ai/blog/will-we-run-out-of-data-limits-of-llm-...

Legend2440•4mo ago
>send out interactive robots into the world

Easier said than done.

Robotics tends to be even more data-constrained than NLP. The real world only runs at 1x speed, and if your robot breaks something it costs real money. Simulators are simplistic compared to reality and take a lot of manual effort to build.

You will always need to make efficient use of the data you have.

imtringued•4mo ago
Robotics data isn't labeled and if you build a robot, there ain't anyone who has collected data for your particular robot.

There is also the problem that on-device learning is not yet practical.

robots0only•4mo ago
This paper was just too overhyped by the authors. Also, the initial evals were very limited and very strange. This blog post does a much better job at a similar observation -- goes into details and does proper evaluation (also better attribution): https://jinjieni.notion.site/Diffusion-Language-Models-are-S...
thesz•4mo ago

> This paper addresses the challenge by asking: how can we trade off more compute for less data?

Autoregressive models are not matched by compute and this is the major drawback.

There is evidence that training RNN models that compute several steps with the same input and coefficients (but different state) leads to better performance. This was shown in a follow-up to [1] that performed an ablation study.

[1] https://arxiv.org/abs/1611.06188

They fixed the number of time steps instead of varying it, and got better results.

Unfortunately, I forgot the title of that ablation paper.
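The idea of running several steps on the same input with the same coefficients but a changing state can be sketched as follows. This is a minimal illustration, assuming a plain tanh recurrent cell with random weights; the names `make_cell`, `cell_step`, and `ponder` are hypothetical and do not come from [1] or its follow-up.

```python
import math
import random


def make_cell(input_dim, state_dim, seed=0):
    """Build random weights for a toy tanh recurrent cell."""
    rng = random.Random(seed)

    def rand_matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    return {
        "W_x": rand_matrix(state_dim, input_dim),
        "W_h": rand_matrix(state_dim, state_dim),
        "b": [0.0] * state_dim,
    }


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def cell_step(cell, x, h):
    """One recurrent update: h' = tanh(W_x x + W_h h + b)."""
    wx = matvec(cell["W_x"], x)
    wh = matvec(cell["W_h"], h)
    return [math.tanh(a + b + c) for a, b, c in zip(wx, wh, cell["b"])]


def ponder(cell, x, num_steps):
    """Apply the same cell to the same input a fixed number of times."""
    h = [0.0] * len(cell["b"])
    for _ in range(num_steps):
        h = cell_step(cell, x, h)  # same weights and input, new state
    return h


cell = make_cell(input_dim=2, state_dim=4)
x = [1.0, -0.5]
for steps in (1, 3):
    print(steps, [round(v, 4) for v in ponder(cell, x, steps)])
```

Here `num_steps` is a fixed hyperparameter, matching the fixed-step variant described above rather than a learned halting rule.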

kevinwang•4mo ago
Not sure if this is the one you meant, because it doesn't cite the paper you mention, but it's similar work: "An Investigation of Model-Free Planning", Guez et al. (DeepMind), 2019 https://arxiv.org/abs/1901.03559
astrange•4mo ago
Speaking of not citing, that one could go a bit further back.

https://cdn.aaai.org/AAAI/1987/AAAI87-048.pdf

imtringued•4mo ago
It has already been proven that deep equilibrium models (DEQs) with a single layer are equivalent to models with a finite number of layers, and the converse holds as well: you can get the performance of a DEQ using a finite number of layers.

The fixed-point nature of DEQs means that they inherently have a notion of self-assessment: how close they are to the solution. If they are at the solution, they will simply stop changing it. If not, they will keep performing calculations.
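The "stop changing when at the solution" behaviour can be sketched with plain fixed-point iteration. This is a toy illustration under stated assumptions: a single tanh layer with weights scaled so the map is a contraction, and naive iteration instead of the root-finding solvers real DEQ implementations typically use. All function names are hypothetical.

```python
import math
import random


def make_contractive_weights(dim, seed=0):
    """Random weights scaled so each row's absolute sum is at most 0.5."""
    rng = random.Random(seed)
    scale = 0.5 / dim
    return [[scale * rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(dim)]


def layer(W, x, z):
    """One application of the layer: z' = tanh(W z + x)."""
    return [
        math.tanh(sum(w * zj for w, zj in zip(row, z)) + xi)
        for row, xi in zip(W, x)
    ]


def solve_fixed_point(W, x, tol=1e-8, max_iters=100):
    """Iterate z <- layer(z) until the state stops changing."""
    z = [0.0] * len(x)
    residual = float("inf")
    iteration = 0
    for iteration in range(1, max_iters + 1):
        z_next = layer(W, x, z)
        # The residual is the model's own measure of distance from equilibrium.
        residual = max(abs(a - b) for a, b in zip(z_next, z))
        z = z_next
        if residual < tol:
            break
    return z, residual, iteration


W = make_contractive_weights(3)
x = [0.3, -0.7, 1.2]
z_star, residual, iters = solve_fixed_point(W, x)
print(f"converged in {iters} iterations, residual={residual:.2e}")
```

The residual doubles as the stopping signal: once it falls below `tol`, further layer applications leave the state essentially unchanged, so the iteration count adapts to the input.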