frontpage.

UGMM-NN: Univariate Gaussian Mixture Model Neural Network

https://arxiv.org/abs/2509.07569
23•zakeria•3h ago

Comments

zakeria•3h ago
uGMM-NN is a novel neural architecture that embeds probabilistic reasoning directly into the computational units of deep networks. Unlike traditional neurons, which apply weighted sums followed by fixed nonlinearities, each uGMM-NN node parameterizes its activations as a univariate Gaussian mixture, with learnable means, variances, and mixing coefficients.
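[For a concrete mental model of that description, here is a minimal sketch of one such density-evaluating unit, assuming the mixture form the abstract describes; the class and all names are hypothetical, not the paper's reference code.]

```python
import math
import torch
import torch.nn as nn

class UGMMUnit(nn.Module):
    """One uGMM neuron: a K-component univariate Gaussian mixture with
    learnable means, variances, and mixing coefficients (hypothetical
    sketch based on the abstract, not the paper's reference code)."""
    def __init__(self, k: int = 3):
        super().__init__()
        self.means = nn.Parameter(torch.randn(k))
        self.log_vars = nn.Parameter(torch.zeros(k))    # log sigma^2 keeps variance positive
        self.mix_logits = nn.Parameter(torch.zeros(k))  # softmax -> mixing coefficients

    def forward(self, y: torch.Tensor) -> torch.Tensor:
        # log p(y) = logsumexp_k [ log pi_k + log N(y; mu_k, sigma_k^2) ]
        log_pi = torch.log_softmax(self.mix_logits, dim=-1)
        var = self.log_vars.exp()
        log_norm = -0.5 * (torch.log(2 * math.pi * var)
                           + (y.unsqueeze(-1) - self.means) ** 2 / var)
        return torch.logsumexp(log_pi + log_norm, dim=-1)
```

[How one layer's log-densities feed the next layer is the part the later comments probe.]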
vessenes•1h ago
Meh. Well, at least, possibly “meh”.

Upshot: Gaussian sampling along the parameters of nodes rather than fixed values. This might offer one of the following:

* Better inference time accuracy on average

* Faster convergence during training

It probably costs additional inference and training compute.

The paper demonstrates worse results on MNIST, and shows the architecture is more than capable of dealing with the Iris test (which I hadn’t heard of; categorizing types of irises, I presume the flower, but maybe the eye?).

The paper claims to keep the number of parameters and depth the same, but it doesn’t report on:

* training time/flops (probably more, I’d guess?)

* inference time/flops (almost certainly more)

Intuitively, if you’ve got a mean, variance, and mix coefficient, then you have triple the data space per parameter; there’s no word as to whether the networks were normalized by the total data taken by the NN or just by the number of “parameters”.
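[To make that accounting concrete, a back-of-the-envelope version under one plausible reading; the layer sizes are made up and the paper's actual parameterization may differ.]

```python
# Hypothetical sizes: a 784 -> 128 dense layer vs. a uGMM layer where each
# connection carries K (mean, variance, mixing-coefficient) triples.
fan_in, fan_out, k = 784, 128, 3

mlp_params = fan_in * fan_out + fan_out   # weights + biases
ugmm_params = 3 * k * fan_in * fan_out    # 3 scalars per component per connection

print(f"MLP:  {mlp_params:,}")   # MLP:  100,480
print(f"uGMM: {ugmm_params:,}")  # uGMM: 903,168  (3x at K=1, 9x at K=3)
```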

Upshot: I don’t think this paper demonstrates any sort of benefit here or elucidates the tradeoffs.

Quick reminder: negative results are good, too. I’d almost rather see the paper framed that way.

zakeria•1h ago
Thanks for the comment. Just to clarify, the uGMM-NN isn't simply "Gaussian sampling along the parameters of nodes."

Each neuron is a univariate Gaussian mixture with learnable mean, variance, and mixture weights. This gives the network the ability to perform probabilistic inference natively inside its architecture, rather than approximating uncertainty after the fact.

The work isn’t framed as "replacing MLPs." The motivation is to bridge two research traditions:

- probabilistic graphical models and probabilistic circuits (the latter relatively newer)

- deep learning architectures

That's why the Iris dataset (despite being simple) was included - not as a discriminative benchmark, but to show the model could be trained generatively in a way similar to PGMs, something a standard MLP cannot do. Hence, the other benefits of the approach mentioned in the paper.
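[For readers unfamiliar with the distinction: "trained generatively" means maximizing the likelihood the model assigns to the inputs themselves, rather than minimizing a label loss. A toy, self-contained sketch of that objective on a single Gaussian-mixture unit — my illustration, not the paper's training setup:]

```python
import math
import torch

# Toy generative training: fit one univariate Gaussian mixture (the uGMM
# building block) by gradient descent on the negative log-likelihood.
torch.manual_seed(0)
data = torch.cat([torch.randn(500) - 2.0, torch.randn(500) + 2.0])  # bimodal toy data

k = 2
means = torch.randn(k, requires_grad=True)
log_vars = torch.zeros(k, requires_grad=True)
mix_logits = torch.zeros(k, requires_grad=True)
opt = torch.optim.Adam([means, log_vars, mix_logits], lr=0.05)

for _ in range(300):
    log_pi = torch.log_softmax(mix_logits, dim=-1)
    var = log_vars.exp()
    log_norm = -0.5 * (torch.log(2 * math.pi * var)
                       + (data.unsqueeze(-1) - means) ** 2 / var)
    nll = -torch.logsumexp(log_pi + log_norm, dim=-1).mean()  # -E[log p(x)]
    opt.zero_grad()
    nll.backward()
    opt.step()

print(means.sort().values)  # approaches the true modes near -2 and +2
```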

ericdoerheit•1h ago
Thank you for your work! I would be interested to see what this means for a CNN architecture. Maybe the whole architecture wouldn't need to be based on uGMM-NNs, but only the last layers?
zakeria•47m ago
Thanks - good question. In theory, the uGMM layer could complement CNNs in different ways - for example, one could imagine (as you mentioned):

- using standard convolutional layers for feature extraction,

- then replacing the final dense layers with uGMM neurons to enable probabilistic inference and uncertainty modeling on top of the learned features (see the sketch below).
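[A hedged sketch of that hybrid: an ordinary convolutional backbone with a density-producing head. UGMMHead is my own placeholder, assuming one mixture per class; nothing here is from the paper.]

```python
import math
import torch
import torch.nn as nn

class UGMMHead(nn.Module):
    """Hypothetical head: one K-component univariate Gaussian mixture per
    class, scoring a pooled feature by log-density instead of a raw logit."""
    def __init__(self, in_dim: int, n_classes: int, k: int = 3):
        super().__init__()
        self.proj = nn.Linear(in_dim, n_classes)           # one scalar feature per class
        self.means = nn.Parameter(torch.randn(n_classes, k))
        self.log_vars = nn.Parameter(torch.zeros(n_classes, k))
        self.mix_logits = nn.Parameter(torch.zeros(n_classes, k))

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        y = self.proj(h).unsqueeze(-1)                     # (batch, n_classes, 1)
        log_pi = torch.log_softmax(self.mix_logits, dim=-1)
        var = self.log_vars.exp()
        log_norm = -0.5 * (torch.log(2 * math.pi * var) + (y - self.means) ** 2 / var)
        return torch.logsumexp(log_pi + log_norm, dim=-1)  # (batch, n_classes) log-densities

backbone = nn.Sequential(                                  # ordinary conv feature extractor
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten())
model = nn.Sequential(backbone, UGMMHead(16, 10))
print(model(torch.randn(4, 1, 28, 28)).shape)              # torch.Size([4, 10])
```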

My current focus, however, is exploring how uGMMs translate into Transformer architectures, which could open up interesting possibilities for probabilistic reasoning in attention-based models.

magicalhippo•51m ago
I'm having a very dense moment, I think, and it's been far too long since my statistics courses.

They state the output of a neuron j is a log density P_j(y), where y is a latent variable.

But how does the output from the previous layer, x, come into play?

I guess I was expecting some kind of conditional probabilities, i.e., the output is P_j(y | x) or something.

Again, perhaps trivial. Just struggling to figure out how it works in practice.

ChatGPT Developer Mode: Full MCP client access

https://platform.openai.com/docs/guides/developer-mode
326•meetpateltech•6h ago•162 comments

Show HN: Term.everything – Run any GUI app in the terminal

https://github.com/mmulet/term.everything
546•mmulet•1d ago•86 comments

Pontevedra, Spain declares its entire urban area a "reduced traffic zone"

https://www.greeneuropeanjournal.eu/made-for-people-not-cars-reclaiming-european-cities/
583•robtherobber•12h ago•755 comments

Defeating Nondeterminism in LLM Inference

https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/
158•jxmorris12•4h ago•53 comments

KDE launches its own distribution (again)

https://lwn.net/SubscriberLink/1037166/caa6979c16a99c9e/
15•Bogdanp•35m ago•5 comments

The HackberryPi CM5 handheld computer

https://github.com/ZitaoTech/HackberryPiCM5
118•kristianpaul•2d ago•35 comments

Christie's Deletes Digital Art Department

https://news.artnet.com/market/christies-scraps-digital-art-department-2685784
8•recursive4•41m ago•3 comments

Launch HN: Recall.ai (YC W20) – API for meeting recordings and transcripts

49•davidgu•6h ago•27 comments

Mux (YC W16) Is Hiring Engineering ICs and Managers

https://mux.com/jobs
1•mmcclure•1h ago

Dotter: Dotfile manager and templater written in Rust

https://github.com/SuperCuber/dotter
40•nateb2022•3h ago•18 comments

OrioleDB Patent: now freely available to the Postgres community

https://supabase.com/blog/orioledb-patent-free
343•tosh•10h ago•115 comments

Show HN: Haystack – Review pull requests like you wrote them yourself

https://haystackeditor.com
43•akshaysg•4h ago•23 comments

Longhorn – A Kubernetes-Native Filesystem

https://vegard.blog.engen.priv.no/?p=518
14•jandeboevrie•3d ago•9 comments

Clojure's Solutions to the Expression Problem

https://www.infoq.com/presentations/Clojure-Expression-Problem/
30•adityaathalye•3d ago•1 comment

I didn't bring my son to a museum to look at screens

https://sethpurcell.com/writing/screens-in-museums/
671•arch_deluxe•6h ago•241 comments

Jiratui – A Textual UI for interacting with Atlassian Jira from your shell

https://jiratui.sh/
98•gjvc•7h ago•26 comments

Harvey Mudd Miniature Machine

https://www.cs.hmc.edu/~cs5grad/cs5/hmmm/documentation/documentation.html
36•nill0•2d ago•13 comments

"No Tax on Tips" Includes Digital Creators, Too

https://www.hollywoodreporter.com/business/business-news/no-tax-on-tips-guidance-creators-trump-t...
51•aspenmayer•5h ago•67 comments

Show HN: HumanAlarm – Real people knock on your door to wake you up

https://humanalarm.com
12•soelost•1h ago•13 comments

Show HN: TailGuard – Bridge your WireGuard router into Tailscale via a container

https://github.com/juhovh/tailguard
84•juhovh•18h ago•22 comments

UGMM-NN: Univariate Gaussian Mixture Model Neural Network

https://arxiv.org/abs/2509.07569
23•zakeria•3h ago•6 comments

Kerberoasting

https://blog.cryptographyengineering.com/2025/09/10/kerberoasting/
131•feross•10h ago•47 comments

Zoox robotaxi launches in Las Vegas

https://zoox.com/journal/las-vegas
151•krschultz•7h ago•196 comments

Charlie Kirk killed at event in Utah

https://www.nbcnews.com/news/us-news/live-blog/live-updates-shooting-charlie-kirk-event-utah-rcna...
416•david927•3h ago•827 comments

The origin story of merge queues

https://mergify.com/blog/the-origin-story-of-merge-queues
64•jd__•6h ago•19 comments

Tarsnap is cozy

https://til.andrew-quinn.me/posts/tarsnap-is-cozy/
86•hiAndrewQuinn•10h ago•57 comments

Things you can do with a debugger but not with print debugging

https://mahesh-hegde.github.io/posts/what_debugger_can/
184•never_inline•3d ago•180 comments

TikTok has turned culture into a feedback loop of impulse and machine learning

https://www.thenexus.media/tiktok-won-now-everything-is-60-seconds/
246•natalie3p•6h ago•182 comments

Semantic Line Breaks (2017)

https://sembr.org
71•Bogdanp•3d ago•48 comments

Distributing your own scripts via Homebrew

https://justin.searls.co/posts/how-to-distribute-your-own-scripts-via-homebrew/
59•ingve•2d ago•12 comments