Diffusion models explained simply

https://www.seangoedecke.com/diffusion-models-explained/
168•onnnon•6mo ago

Comments

user14159265•6mo ago
https://lilianweng.github.io/posts/2021-07-11-diffusion-mode...
Philpax•6mo ago
Notably, Lilian did not explain diffusion models simply. This is a fantastic resource that details how they actually work, but your casual reader is unlikely to develop any sort of understanding from this.
Y_Y•6mo ago
> your casual reader is unlikely to develop any sort of understanding [from this]

"Hell, if I could explain it to the average person, it wouldn't have been worth the Nobel prize." - Richard Feynman

CamperBob2•6mo ago
Didn't he also say that if you couldn't explain something to an 8-year-old, you didn't understand it yourself?
Y_Y•6mo ago
Fair point. The context of that quote was that he was asked by a journalist for a quick explanation over the phone when the physics Nobel for 1965 was announced.

He did go on to write a very readable little book (from a lecture series) on the subject which has photons wearing little watches and waiting for the hands to line up. I'd say a keen eight-year-old could get something from that.

https://ia600101.us.archive.org/17/items/richard-feynman-pdf...

kmitz•6mo ago
Thanks, I was looking for an article like this, with a focus on the differences between generative AI techniques. My guess is that since LLMs and image generation became mainstream at the same time, most people don't have the slightest idea they are based on fundamentally different technologies.
cubefox•6mo ago
That's a nice high-level explanation: short and easy to understand.
cubefox•6mo ago
It's nice that this contains a comparison between diffusion models that are used for image models, and the autoregressive models that are used for LLMs.

But recently (2024 NeurIPS paper of the year) there was a new paper on autoregressive image modelling that apparently outperforms diffusion models: https://arxiv.org/abs/2404.02905

The innovation is that it doesn't predict image patches (like older autoregressive image models) but somehow does some sort of "next scale" or "next resolution" prediction.

In the past, autoregressive image models did not perform as well as diffusion models, which meant that most image models used diffusion. Now it seems autoregressive techniques have a strict advantage over diffusion models. Another advantage is that they can be integrated with autoregressive LLMs (multimodality), which is not possible with diffusion image models. In fact, the recent GPT-4o image generation is autoregressive according to OpenAI. I wonder whether diffusion models still have a future now.

earthnail•6mo ago
From what I can tell, it doesn't look like the recent GPT-4o image generation includes the research of the NeurIPS paper you cited. If it did, we wouldn't see a line-by-line generation of the image, which we do currently in GPT-4o, but rather a decoding similar to progressive JPEG.

I'm not 100% convinced that diffusion models are dead. That paper fixes autoregression for 2D spaces by basically turning the generation problem from pixel-by-pixel to iterative upsampling, but if 2D was the problem (and 1D was not), why don't we have more autoregressive models in 1D spaces like audio?

famouswaffles•6mo ago
>From what I can tell, it doesn't look like the recent GPT-4o image generation includes the research of the NeurIPS paper you cited. If it did, we wouldn't see a line-by-line generation of the image, which we do currently in GPT-4o, but rather a decoding similar to progressive JPEG.

You could, because it's still autoregressive. It still generates patches left to right, top to bottom. It's just that we're not starting with patches at the target resolution.

cubefox•6mo ago
> From what I can tell, it doesn't look like the recent GPT-4o image generation includes the research of the NeurIPS paper you cited.

Which means autoregressive image models are even ahead of diffusion on multiple fronts, i.e. both in whatever GPT-4o is doing and in the method described in the VAR paper.

rudedogg•6mo ago
> From what I can tell, it doesn't look like the recent GPT-4o image generation includes the research of the NeurIPS paper you cited. If it did, we wouldn't see a line-by-line generation of the image, which we do currently in GPT-4o, but rather a decoding similar to progressive JPEG.

Going off my bad memory, but I think I remember a comment saying the line-by-line generation was just a visual effect.

famouswaffles•6mo ago
>The innovation is that it doesn't predict image patches (like older autoregressive image models) but somehow does some sort of "next scale" or "next resolution" prediction.

It still predicts image patches, left to right and top to bottom. The main difference is that you start with patches at a low resolution.

porphyra•6mo ago
Meanwhile, if you want diffusion models explained with math for a graduate student, there's Tony Duan's Diffusion Models From Scratch [1].

[1] https://www.tonyduan.com/diffusion/index.html

bcherry•6mo ago
"The sculpture is already complete within the marble block, before I start my work. It is already there, I just have to chisel away the superfluous material."

- Michelangelo

jdthedisciple•6mo ago
Not to be that guy but an article on diffusion models with only one image ... and that too just noise?
ActorNightly•6mo ago
The thing to understand about any model architecture is that there isn't really anything special about one or the other - as long as the process is differentiable, ML can learn it.

You can build an image generator that basically renders each word on one line in an image, and then uses a transformer architecture to morph the image of the words into what the words are describing.

The only big difference is really efficiency, but we are just taking stabs in the dark at this point - there is work that Google is doing that eventually is going to result in the most optimal model for a certain type of task.

noosphr•6mo ago
Without going into too much detail: the complexity space of tensor operations is for all practical purposes infinite. The general tensor which captures all interactions between all elements of an input of length N is NxN.

This is worse than exponential and means we have nothing but tricks to try and solve any problem that we see in reality.

As an example, solving MNIST and its variants of 28x28 pixels will be impossible until the 2100s because we don't have enough memory to store the general tensor which stores the interactions between each group of pixels and every other group of pixels.

joefourier•6mo ago
While true in a theoretical sense (an MLP of sufficient size can theoretically represent any differentiable function), in practice it’s often the case that it’s impossible for a certain architecture to learn a specific task no matter how much compute you throw at it. E.g. an LSTM will never capture long range dependencies that a transformer could trivially learn, due to gradients vanishing after a certain sequence length.
ActorNightly•6mo ago
You are right with respect to ordering of operations, where recurrent networks have a whole bunch of other computational complexity to them.

However, for example, a Transformer can be represented with just deeply connected layers, albeit with a lot of zeros for weights.

g42gregory•6mo ago
One of the key intuitions: If you take a natural image and add random noise, you will get a different random noise image every time you do this. However, all of these (different!) random noise images will be lined up in the direction perpendicular to the natural image manifold.

So you will always know where to go to restore the original image: shortest distance to the natural image manifold.

How do all these random images end up perpendicular to the manifold? High-dimensional statistics and the fact that the natural image manifold has much lower dimension than the overall space.
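
A quick numerical check of the claim above, as a sketch under simplifying assumptions: the "manifold" is replaced by a flat k-dimensional subspace (the span of the first k coordinate axes of a d-dimensional space), which is only a stand-in for a real natural image manifold, and the noise is isotropic Gaussian. The function name `tangent_fraction` is made up for this illustration. If the intuition holds, the share of the noise vector's length that lies along the subspace should be small, roughly sqrt(k/d).

```python
import math
import random

def tangent_fraction(d, k, rng):
    """Fraction of a Gaussian noise vector's length lying in a k-dim subspace of R^d.

    Assumption: the subspace is the span of the first k coordinate axes,
    standing in for the tangent space of a low-dimensional manifold.
    """
    noise = [rng.gauss(0.0, 1.0) for _ in range(d)]
    total = math.sqrt(sum(x * x for x in noise))
    tangent = math.sqrt(sum(x * x for x in noise[:k]))
    return tangent / total

rng = random.Random(0)
d, k = 10_000, 10  # ambient dimension and subspace dimension (arbitrary choices)
fractions = [tangent_fraction(d, k, rng) for _ in range(20)]
mean_fraction = sum(fractions) / len(fractions)
print(f"mean tangent fraction: {mean_fraction:.4f}, sqrt(k/d) = {math.sqrt(k / d):.4f}")
```

Each sampled noise vector is different, yet almost all of its length points away from the subspace, which is the sense in which the noisy images "line up" perpendicular to it.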

yubblegum•6mo ago
TIL.

Generative Visual Manipulation on the Natural Image Manifold

https://arxiv.org/abs/1609.03552

For me, the most intriguing aspect of LLMs (and friends) are the embedding space and the geometry of the embedded manifolds. Curious if anyone has looked into comparative analysis of the geometry of the manifolds corresponding to distinct languages. Intuitively I see translations as a mapping from one language manifold to another, with expressions being paths on that manifold, which makes me wonder if there is a universal narrative language manifold that captures 'human expression semantics' in the same way as a "natural image manifold".

Ey7NFZ3P0nzAe•6mo ago
I think this is related: https://news.ycombinator.com/item?id=44054425
fisian•6mo ago
I found this course very helpful if you're interested in a bit of math (but all very well explained): https://diffusion.csail.mit.edu/

It is short, with good lecture notes and has hands on examples that are very approachable (with solutions available if you get stuck).

woolion•6mo ago
Discussed on hn: https://news.ycombinator.com/item?id=43238893

I found it to be the best resource to understand the material. That's certainly a good reference to delve deeper into the intuitions given by OP (it's about 5 hours of lectures, plus exercises).

IncreasePosts•6mo ago
Are there any diffusion models for text? I'd imagine they'd be very fast, if the whole result can be processed simultaneously, instead of outputting a linear series of tokens that each depend on the last
imbnwa•6mo ago
Need a text diffusion model to output a version of Eden!Eden!Eden!
woadwarrior01•6mo ago
Diffusion for text is a nascent field. There are a few pretrained models. Here's one[1], AFAIK it's currently the largest open weights text diffusion model.

[1]: https://ml-gsai.github.io/LLaDA-demo/

intalentive•6mo ago
This explanation is intuitive: https://www.youtube.com/watch?v=zc5NTeJbk-k

My takeaway is that diffusion "samples all the tokens at once", incrementally, rather than getting locked in to a particular path, as in auto-regression, which can only look backward. The upside is global context, the downside is fixed-size output.

orbital-decay•6mo ago
That's not a good intuition to have. That backwards-looking pathfinding process is actually pretty similar in both types of models - it just works along a different coordinate, crude-to-detailed instead of start-to-end.
intalentive•6mo ago
Good point.
petermcneeley•6mo ago
This page is full of text. I am guessing the author (Sean Goedecke) is a language based thinker.
JoeDaDude•6mo ago
Coincidentally, I was just watching this explanation earlier today:

How AI Image Generators Work (Stable Diffusion / Dall-E) - Computerphile

https://www.youtube.com/watch?v=1CIpzeNxIhU

bicepjai•6mo ago
>>>CLASSIFIER-FREE GUIDANCE … During inference, you run once with a caption and once without, and blend the predictions (magnifying the difference between those two vectors). That makes sure the model is paying a lot of attention to the caption.

Why is this sentence true ? “That makes sure the model is paying a lot of attention to the caption.”
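
One way to see what the quoted blending step does, as a minimal sketch: classifier-free guidance is commonly written as `uncond + scale * (cond - uncond)`. The difference `cond - uncond` isolates what the caption changed about the prediction, and a scale above 1 exaggerates exactly that part. The function and parameter names below (`guided_prediction`, `guidance_scale`) and the toy numbers are illustrative assumptions, not taken from the article or from any particular library.

```python
def guided_prediction(cond, uncond, guidance_scale):
    """Blend conditional and unconditional noise predictions.

    guidance_scale = 0 ignores the caption, 1 returns the conditional
    prediction unchanged, and values above 1 exaggerate whatever the
    caption changed about the prediction.
    """
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]

# Toy 3-element "noise predictions" (made-up numbers, not model output).
uncond = [0.0, 1.0, 2.0]
cond = [0.5, 1.0, 1.0]

print(guided_prediction(cond, uncond, 1.0))  # same as cond
print(guided_prediction(cond, uncond, 3.0))  # caption's effect tripled
```

Elements where the caption made no difference (the middle one here) are left alone at any scale; only the caption-dependent part of the prediction is amplified.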

noodletheworld•6mo ago
Mmm… how is a model with a fixed size, let’s say, 512x512 (ie. 64x64 latent or whatever), able to output coherent images at a larger size, let’s say, 1024x1024?

Not in a “kind of like this” kind of way: PyTorch vector pipelines can’t take arbitrary sized inputs at runtime right?

If your input has shape [x, y, z] you cannot pass [2x, 2y, 2z] into it.

Not… “it works but not very well”; like, it cannot execute the pipeline if the input dimensions aren’t exactly what they were when training.

Right? Isn’t that how it works?

So, is the image chunked into fixed patches and fed through in parts? Or something else?

For example, this toy implementation [1] resizes the input image to match the expected input, and always emits an output of a specific fixed size.

Which is what you would expect; but it also points to tools like Stable Diffusion working in a way that is distinctly different from what the trivial explanations tend to say?

[1] - https://github.com/uygarkurt/UNet-PyTorch/blob/main/inferenc...
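
One fact that bears on the question above: a convolution layer's weights are just a small kernel, so the same weights can be slid over an input of any height and width, and only the output size changes. The pure-Python sketch below illustrates that with a hypothetical helper, `conv2d_valid`, which is not a PyTorch function. Whether a particular diffusion pipeline produces coherent images at a size it was not trained on is a separate matter that this sketch does not address.

```python
def conv2d_valid(image, kernel):
    """Slide a small kernel over a 2D list of numbers (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

kernel = [[1, 0], [0, -1]]  # the only "weights"; their shape is fixed at 2x2

small = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4x4 input
large = [[r * 8 + c for c in range(8)] for r in range(8)]  # 8x8 input

out_small = conv2d_valid(small, kernel)
out_large = conv2d_valid(large, kernel)
print(len(out_small), len(out_small[0]))  # 3 3
print(len(out_large), len(out_large[0]))  # 7 7
```

Layers whose weight shapes depend on the input size (for example a fully connected layer applied to a flattened image) do not have this property, which is the case where a fixed input shape is a hard requirement.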

swyx•6mo ago
> That last point indicates an interesting capability that diffusion models have: you get a kind of built-in quality knob. If you want fast inference at the cost of quality, you can just run the model for less time and end up with more noise in the final output. If you want high quality and you’re happy to take your time getting there, you can keep running the model until it’s finished removing noise.

not quite right... anyone who has run models for >100 steps knows that you can go too far. what's the explanation of that?