Start all of your commands with a comma (2009)

https://rhodesmill.org/brandon/2009/commands-with-comma/
233•theblazehen•2d ago•68 comments

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

https://openciv3.org/
694•klaussilveira•15h ago•206 comments

Hoot: Scheme on WebAssembly

https://www.spritely.institute/hoot/
6•AlexeyBrin•1h ago•0 comments

The Waymo World Model

https://waymo.com/blog/2026/02/the-waymo-world-model-a-new-frontier-for-autonomous-driving-simula...
962•xnx•20h ago•555 comments

How we made geo joins 400× faster with H3 indexes

https://floedb.ai/blog/how-we-made-geo-joins-400-faster-with-h3-indexes
130•matheusalmeida•2d ago•35 comments

Unseen Footage of Atari Battlezone Arcade Cabinet Production

https://arcadeblogger.com/2026/02/02/unseen-footage-of-atari-battlezone-cabinet-production/
67•videotopia•4d ago•6 comments

Vocal Guide – belt sing without killing yourself

https://jesperordrup.github.io/vocal-guide/
54•jesperordrup•5h ago•24 comments

Jeffrey Snover: "Welcome to the Room"

https://www.jsnover.com/blog/2026/02/01/welcome-to-the-room/
36•kaonwarb•3d ago•27 comments

ga68, the GNU Algol 68 Compiler – FOSDEM 2026 [video]

https://fosdem.org/2026/schedule/event/PEXRTN-ga68-intro/
10•matt_d•3d ago•2 comments

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

https://github.com/valdanylchuk/breezydemo
236•isitcontent•15h ago•26 comments

Monty: A minimal, secure Python interpreter written in Rust for use by AI

https://github.com/pydantic/monty
233•dmpetrov•16h ago•124 comments

Where did all the starships go?

https://www.datawrapper.de/blog/science-fiction-decline
32•speckx•3d ago•21 comments

Show HN: I spent 4 years building a UI design tool with only the features I use

https://vecti.com
335•vecti•17h ago•147 comments

Hackers (1995) Animated Experience

https://hackers-1995.vercel.app/
502•todsacerdoti•23h ago•244 comments

Sheldon Brown's Bicycle Technical Info

https://www.sheldonbrown.com/
386•ostacke•21h ago•97 comments

Show HN: If you lose your memory, how to regain access to your computer?

https://eljojo.github.io/rememory/
300•eljojo•18h ago•186 comments

Microsoft open-sources LiteBox, a security-focused library OS

https://github.com/microsoft/litebox
361•aktau•22h ago•185 comments

UK infants ill after drinking contaminated baby formula of Nestle and Danone

https://www.bbc.com/news/articles/c931rxnwn3lo
10•__natty__•3h ago•0 comments

An Update on Heroku

https://www.heroku.com/blog/an-update-on-heroku/
425•lstoll•21h ago•282 comments

PC Floppy Copy Protection: Vault Prolok

https://martypc.blogspot.com/2024/09/pc-floppy-copy-protection-vault-prolok.html
68•kmm•5d ago•10 comments

Dark Alley Mathematics

https://blog.szczepan.org/blog/three-points/
96•quibono•4d ago•22 comments

Was Benoit Mandelbrot a hedgehog or a fox?

https://arxiv.org/abs/2602.01122
21•bikenaga•3d ago•11 comments

The AI boom is causing shortages everywhere else

https://www.washingtonpost.com/technology/2026/02/07/ai-spending-economy-shortages/
19•1vuio0pswjnm7•1h ago•5 comments

How to effectively write quality code with AI

https://heidenstedt.org/posts/2026/how-to-effectively-write-quality-code-with-ai/
264•i5heu•18h ago•216 comments

Delimited Continuations vs. Lwt for Threads

https://mirageos.org/blog/delimcc-vs-lwt
33•romes•4d ago•3 comments

Introducing the Developer Knowledge API and MCP Server

https://developers.googleblog.com/introducing-the-developer-knowledge-api-and-mcp-server/
64•gfortaine•13h ago•28 comments

I now assume that all ads on Apple news are scams

https://kirkville.com/i-now-assume-that-all-ads-on-apple-news-are-scams/
1076•cdrnsf•1d ago•460 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
39•gmays•10h ago•13 comments

Understanding Neural Network, Visually

https://visualrambling.space/neural-network/
298•surprisetalk•3d ago•44 comments

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

https://infisical.com/blog/devops-to-solutions-engineering
154•vmatsiiako•20h ago•72 comments

Normalizing Flows Are Capable Generative Models

https://machinelearning.apple.com/research/normalizing-flows
169•danboarder•7mo ago

Comments

tiahura•7mo ago
https://github.com/bayesiains/nflows
imoverclocked•7mo ago
It’s pretty great that despite having large data centers capable of doing this kind of computation, Apple continues to make things work locally. I think there is a lot of value in being able to hold the entirety of a product in hand.
xnx•7mo ago
Google has a family of local models too! https://ai.google.dev/gemma/docs
ivape•7mo ago
Gemma and Llama can't be bundled commercially, which sucks because they're two of the leading small LLMs. Qwen3 might be the last one with an Apache license.
nolist_policy•7mo ago
You can bundle and use Gemma commercially[1].

[1] https://ai.google.dev/gemma/terms

ivape•7mo ago
I’ll have to read that, thanks.
coliveira•7mo ago
It's very convenient for Apple to do this: less expense on costly AI chips, and more excuses to ask customers to buy their latest hardware.
nine_k•7mo ago
Users have to pay for the compute somehow. Maybe by paying for models run in datacenters. Maybe paying for hardware that's capable enough to run models locally.
Bootvis•7mo ago
I can upgrade to a bigger LLM I use through an API with one click. If it runs on my device, I need to buy a new phone.
nine_k•7mo ago
I* can run the model on my device, regardless of whether I have an internet connection or permission from whoever controls the datacenter. I can run the model against highly private data while being certain that the private data never leaves my device.

It's a different set of trade-offs.

* Theoretically; I don't own an iPhone.

eru•7mo ago
Well, unless it's open source, you can't be so certain. But more certain than when processing in the cloud, that's true.
lostlogin•7mo ago
But also: if Apple's way works, it’s incredibly wasteful.

Server side means shared resources, shared upgrades and shared costs. The privacy aspect matters, but at what cost?

shakna•7mo ago
Server side means an excuse to not improve model handling everywhere you can, and increasing global power usage by a noticeable percentage, at a time when we're approaching the "point of no return" for burning out the only planet we can live on.

The cost, so far, is greater.

hu3•7mo ago
> Server side means an excuse to not improve model handling everywhere you can...

How so if efficiency is key for datacenters to be competitive? If anything it's the other way around.

coliveira•7mo ago
The previous commenter is right in that server-side companies have little incentive to do less, especially when they're backed by investors' money. Client-side AI will be bound by device capabilities and customer investment in new devices.
shakna•7mo ago
Or, instead of improving efficiency, they go ahead and just deploy more generators [0]. Stopgap measures are cheaper.

[0] https://interestingengineering.com/innovation/elon-musk-xai-...

eru•7mo ago
Well, if it were easier to build power stations, they'd do so.
thfuran•7mo ago
More like squinting to see if it's still visible in the rear view mirror.
eru•7mo ago
How does running AI workloads on end user devices magically make them use less energy?
gessha•7mo ago
With the wave of enshittification that's surrounding everything tech or tech-adjacent, the privacy cost is pretty high.
zamadatix•7mo ago
If iPhones were the efficient/smart way to pay for compute then Apple's datacenter would be built with those instead of servers.
v5v3•7mo ago
With no company having a clear lead in everyday AI for the non-technical mainstream user, there is only going to be a race to the bottom on subscription and API pricing.

Local doesn't cost the company anything, and it increases the minimum hardware customers need to buy.

eru•7mo ago
> Local doesn't cost the company anything, [...]

Not completely true: those models are harder to develop. The logistics are a hassle.

ivape•7mo ago
It takes about a $400 graphics card to comfortably run something like a 3B-8B model. Comfortable as in fast inference and a good-sized context. 3B-5B models are what devices can somewhat fit. That means for us to get good local models, we'd have to shrink one of those $400 graphics cards down to a phone.

I don't see this happening in the next 5 years.

The Mac mini being shrunk down to phone size is probably the better bet. We'd also have to bring the power consumption requirements down by a lot. Edge hardware is a ways off.

nolist_policy•7mo ago
Gemma 3n E4B runs at 35tk/s prompt processing and 7-8 tk/s decode on my last last last gen flagship Android.
ivape•7mo ago
I doubt this. What kind of t/s are you getting once your context window is reasonably saturated? It probably slows down to a crawl, making it not good enough yet (the hardware, that is).
MBCook•7mo ago
I wonder if it’s noticeably faster or slower than the common way on the same set of hardware.
yorwba•7mo ago
Figure 10 in https://arxiv.org/pdf/2506.06276 has a speed comparison. You need fairly large batch sizes for this method to come out ahead. The issue is that the architecture is very sequential, so you need to be generating several images at the same time to make good use of GPU parallelism.
godelski•7mo ago
It's a bit more complicated than that and I don't think you're being fair.

StarFlow and the AR models are fixed, but DiT is being compared at different numbers of steps, and we don't really care if we generate garbage at blazing speeds[0]. Go look at... also Figure 10 (lol) from the DiT paper[1]; it compares FID to model sizes and sampling steps. It looks like StarFlow is comparing to DiT-XL/2-G. In [1] they do {16,32,64,128,256,1024} steps, which corresponds (roughly) to 10k-FID of 60, 35, 25, 22, 21, 20. Translating to StarFlow's graph, we'll guesstimate 21, 23, 50. There's a big difference between 50 and 23, but what might surprise you is that there's also a big difference between 25 and 20. Remember that this is a metric that is lower bounded, and that lower bound is not 0... You also start running into the limitations of the metric the closer you get to its lower bound, adding another layer of complexity when comparing[2].

The images from the paper (I believe) are all at 250 steps, which StarFlow is beating at a batch size of 4. So let's look at batches and invert the data. It is imgs/sec so let's do (1/<guestimate of y-value>) * batch. We get this

  Batch  DiT  SF
    1    10s  20s
    2    20s  30s
    4    40s  30s
    8    80s  30s
   16   160s  30s
      ...
  
So what's happening here is that StarFlow is invariant to the batch size while DiT is not. Obviously this won't hold forever, but DiT doesn't get an advantage from batching. You could probably make up some of the difference by caching the model, because it looks like there's a turn from model loading dominating to actual generation dominating, whereas StarFlow has that turnover at batch 2.

And batching (even small batches) is going to be pretty common, especially in industry. The scaling here is a huge win for them. It (roughly) costs you just as much to generate 64 images as it does 2. Worst case, you hand your customers batched outputs and they end up happier, because frankly, generating images is still an iterative process, and good luck getting the thing you want on the first shot even with all your parameters dialed in. So yeah, that makes a much better product.
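
If you want to check my arithmetic on the table above: it's just batch size divided by throughput. The imgs/sec values below are my eyeballed guesses from the figure, not measured numbers:

  # imgs/sec eyeballed from Figure 10 (guesstimates, not measurements)
  dit_imgs_per_sec = {1: 0.10, 2: 0.10, 4: 0.10, 8: 0.10, 16: 0.10}
  sf_imgs_per_sec  = {1: 0.05, 2: 0.066, 4: 0.133, 8: 0.266, 16: 0.533}

  for b in (1, 2, 4, 8, 16):
      # seconds per batch = batch size / (images per second)
      print(b, round(b / dit_imgs_per_sec[b]), round(b / sf_imgs_per_sec[b]))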

I'll also add two things: 1) you can get WAY more compression out of Normalizing Flows; 2) there's just a ton you can do with Flows that you can't with diffusion. The explicit density isn't helpful only for the math nerds. It is helpful for editing, concept segmentation, interpolation, interpretation, and so much more.

[0] https://tvgag.com/content/quotes/6004-jpg.jpg

[1] https://arxiv.org/abs/2212.09748

[2] Basically, place exponentially growing importance on FID gaps as they get lower, and then stop trusting the metric entirely once you're near its floor. As an example, take FFHQ-256 with FID-50k. The image quality difference between 50 and 20 is really not that big, visually. But there's a **HUGE** difference between 10 and 5. Visually, probably just as big as the difference between 5 and 3. But once you start going below 3 you really shouldn't rely on the metric anymore, and comparing a 2.5 model to a 2.7 is difficult.

b0a04gl•7mo ago
Flows make sense here not just for size but because they're fully invertible and deterministic. Imagine running the same generation on 3 iPhones: same output. It means Apple can ensure the same input gives the same output across devices, chips, and runs. No weird variance or sampling noise. Good for caching, testing, user trust, all that. It fits Apple's whole determinism DNA: predictable generation at scale.
yorwba•7mo ago
Normalizing flows generate samples by starting from Gaussian noise and passing it through a series of invertible transformations. Diffusion models generate samples by starting from Gaussian noise and running it through an inverse diffusion process.

To get deterministic results, you fix the seed for your pseudorandom number generator and make sure not to execute any operations that produce different results on different hardware. There's no difference between the approaches in that respect.
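
A minimal sketch of what "fix the seed" means in practice (my own illustration, not from the paper; `flow` is a hypothetical stand-in for whatever invertible model you have loaded):

  import torch

  torch.manual_seed(1234)            # fix the PRNG so the base sample is reproducible
  z = torch.randn(1, 3, 256, 256)    # Gaussian base sample; identical on every run
  # x = flow.inverse(z)              # hypothetical: push z through the model's learned inverse map

  # Cross-device bit-exactness then depends on the ops themselves, e.g.
  # torch.use_deterministic_algorithms(True) plus a fixed device/driver stack.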

GenerocUsername•7mo ago
Agree. I'm an image-gen layman, but when I was running Stable Diffusion in 2022 it seemed like I could get the same image if I used the same seed and parameters. Getting the same image seemed easy when you have full control of the inputs. The randomness is a choice.
lnyan•7mo ago
normalizing flow might be unpopular but definitely not a forgotten technique
layer8•7mo ago
Earlier discussion: https://news.ycombinator.com/item?id=44358535
tomhow•7mo ago
Thanks. I looked at that thread and it wasn't great, with most of the comments being meta-commentary related to the article and Apple's AI progress rather than the actual research paper.

I've decided to keep this thread on the front page, move the on-topic comments from that other thread to this one, and leave the rest of it in the past.

jc4p•7mo ago
i've been trying to keep up with this field (image generation), so here are some quick notes I took:

Claude's Summary: "Normalizing flows aren't dead, they just needed modern techniques"

My Summary: "Transformers aren't just for text"

1. SOTA model for likelihood on ImageNet 64×64, first ever sub 3.2 (Bits Per Dimension) prev was 2.99 by a hybrid diffusion model

2. Autoregressive (transformer) approach; right now diffusion is the most popular in this space (it's much faster, but a different approach)

tl;dr of autoregressive vs diffusion (there are also other approaches; rough sketch below)

Autoregression: step based, generate a little then more then more

Diffusion: generate a lot of noise then try to clean it up
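
A purely schematic sketch of those two loops (the `model` methods are made-up stand-ins, not any real library's API):

  # Purely schematic; `next_token` and `denoise` are hypothetical methods.
  def autoregressive_sample(model, length):
      tokens = []
      for _ in range(length):                       # generate a little, then a little more...
          tokens.append(model.next_token(tokens))   # each step conditions on everything so far
      return tokens

  def diffusion_sample(model, x_noise, steps):
      for t in reversed(range(steps)):              # start from pure noise...
          x_noise = model.denoise(x_noise, t)       # ...and progressively clean it up
      return x_noise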

The diffusion approach that is the baseline for sota is Flow Matching from Meta: https://arxiv.org/abs/2210.02747 -- lots of fun reading material if you throw both of these into an LLM and ask it to summarize the approaches!

godelski•7mo ago
You have a few minor errors and I hope I can help out.

  > Diffusion: generate a lot of noise then try to clean it up
You could say this about Flows too. Their history is shared with diffusion and goes back to the Whitening Transform. Flows work by a coordinate transform, so we have an isomorphism, whereas diffusion works through (roughly speaking) a hierarchical mixture of Gaussians, which is a lossy process (it gets more confusing when we get into latent diffusion models, which are the primary type used). The goal of a Normalizing Flow is to turn your sampling distribution, which you don't have an explicit representation of, into a known probability distribution (typically a Gaussian). So in effect, there are a lot of similarities here. I'd highly suggest learning about Flows if you want to better understand Diffusion Models.
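
To make the "explicit density" point concrete, here's a toy sketch of the coordinate-transform idea (a single element-wise affine layer; my own illustration, not anything from the paper):

  import torch

  # One invertible layer: x -> z = (x - mu) * exp(-log_sigma), element-wise.
  mu, log_sigma = torch.zeros(4), torch.zeros(4)     # learnable in a real flow

  def log_prob(x):
      z = (x - mu) * torch.exp(-log_sigma)           # coordinate transform into the base space
      log_det = -log_sigma.sum()                     # log |det dz/dx|, computed exactly
      base = torch.distributions.Normal(0.0, 1.0).log_prob(z).sum()  # Gaussian base density
      return base + log_det                          # change of variables: an exact log-likelihood

  print(log_prob(torch.randn(4)))                    # a real log-density, not a lower bound

Stack many such invertible layers and you have a normalizing flow; that exact density is what diffusion's lossy setup gives up.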

  > The diffusion approach that is the baseline for sota is Flow Matching from Meta
To be clear, Flow Matching is a Normalizing Flow. Specifically, it is a Continuous and Conditional Normalizing Flow. If you want to get into the nitty gritty, Ricky has a really good tutorial on the stuff[0]

[0] https://arxiv.org/abs/2412.06264

jc4p•7mo ago
thank you so much!!! i should’ve put that final sentence in my post!
godelski•7mo ago
Happy to help and if you have any questions just ask, this is my jam
godelski•7mo ago
As far as I'm aware, this is the largest Normalizing Flow that exists, and I think they undermined their work by not mentioning this...

Their ImageNet model (4_1024_8_8_0.05[0]) is ~820M while AFHQ is ~472M. Prior to that there is DenseFlow[1] and MaCow[2], which are both <200M parameters. For more comparison, that makes DenseFlow and MaCow smaller than iDDPM[3] (270M params) and ADM[4] (553M for 256 unconditional). And now, it isn't uncommon for modern diffusion models to have several billion parameters![5] (from this we get some numbers on ImageNet-256, which allows a direct comparison, making TarFlow closer to MaskDiT/2 and much smaller than SimpleDiffusion and VDM++, both of which are in billions. But note that this is 128 vs 256!)

Essentially, the argument here is that you can scale (Composable) Normalizing Flows just as well as diffusion models. There are a lot of extra benefits you get in the latent space too, but that's a much longer discussion. Honestly, the TarFlow method is simple and there are probably a lot of improvements that can be made. But don't take that as a knock on this paper! I actually really appreciated it, and it accomplishes what it set out to show. The real point is just that no one had trained flows at this scale before, and that really needs to be highlighted.

The tldr: people have really just overlooked different model architectures

[0] Used a third party reproduction so might be different but their AFHQ-256 model matches at 472M params https://github.com/encoreus/GS-Jacobi_for_TarFlow

[1] https://arxiv.org/abs/2106.04627

[2] https://arxiv.org/abs/1902.04208

[3] https://arxiv.org/abs/2102.09672

[4] https://arxiv.org/abs/2105.05233

[5] https://arxiv.org/abs/2401.11605

[Side note] Hey, if the TarFlow team is hiring, I'd love to work with you guys

yorwba•7mo ago
In the follow-up, they go all the way to 3.8 billion parameters: https://machinelearning.apple.com/research/starflow
godelski•7mo ago
Thanks! Idk how I missed that one. Really glad they put that extra information in
kleskling•7mo ago
I've been working on a JAX implementation for my own projects. I've implemented everything in the paper except guidance.

See here: https://github.com/homerjed/transformer_flow

I'm happy to see the return of normalising flows - exact likelihood models have many benefits. I found the model needed soft-clipping on some operations to ensure numerical stability.

I wonder if adding transformers can be done for the GLOW algorithm since attention and 1x1 convolutions could be made to do the same operation.
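
For anyone curious, by soft-clipping I mean something along the lines of a scaled tanh, so values saturate smoothly instead of being hard-clamped; roughly:

  import jax.numpy as jnp

  def soft_clip(x, bound=5.0):
      # Smoothly squash x into (-bound, bound); gradients stay finite, unlike a hard clamp.
      return bound * jnp.tanh(x / bound)

  # e.g. applied to predicted log-scales of an affine/coupling step before exponentiating,
  # so exp(log_scale) can't overflow during early training.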