Also, there are libraries that abstract away most, if not all, of the details, so you don't have to know everything.
This statement is not true; there are counterexamples I encountered in my university studies, but I would say that intuition will get you very far. Einstein was able to come up with the special theory of relativity just by manipulating mental models, after all. Only when he tried to generalize it did he hit the limit of the claim I learned in school.
That being said, after abandoning intuition, relying on pure mathematical reasoning drives you to the desired place, and from there you can usually reason about the theorem in an intuitive way again.
The math in this paper is not that hard to learn; you just need someone to present the key idea to you.
I hope anyone who is unsure will read your comment and at least try to follow it for a while.
In practice, when it comes down to code, even without higher-level libraries, it is surprisingly simple, concise and intuitive.
Most of the math elements used have quite straightforward properties and utility, but of course if you combine them all into big expressions with lots of single-character variables, it's really hard for anyone to understand. You kind of need to learn to squint and recognize the basic building blocks that the math represents, which wouldn't be necessary if it weren't obfuscated like this.
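To illustrate: the intimidating minimax expression from the original GAN paper boils down to a handful of lines in practice. A minimal sketch, assuming PyTorch; `G`, `D`, the optimizers, and `z_dim` are placeholder assumptions, and `D` is assumed to output sigmoid probabilities:

    import torch
    import torch.nn.functional as F

    def gan_step(G, D, opt_G, opt_D, real, z_dim=100):
        """One training step of the classic GAN objective
        min_G max_D  E[log D(x)] + E[log(1 - D(G(z)))]."""
        z = torch.randn(real.size(0), z_dim)

        # Discriminator update: push D(real) toward 1 and D(fake) toward 0.
        d_real, d_fake = D(real), D(G(z).detach())
        d_loss = F.binary_cross_entropy(d_real, torch.ones_like(d_real)) \
               + F.binary_cross_entropy(d_fake, torch.zeros_like(d_fake))
        opt_D.zero_grad(); d_loss.backward(); opt_D.step()

        # Generator update: push D(G(z)) toward 1
        # (the non-saturating trick from the paper).
        d_fake = D(G(z))
        g_loss = F.binary_cross_entropy(d_fake, torch.ones_like(d_fake))
        opt_G.zero_grad(); g_loss.backward(); opt_G.step()

        return d_loss.item(), g_loss.item()

Each line maps directly onto one term of the formula, which is the sense in which the code is simpler than the notation.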
They threw all that complex math into the paper. I initially could not understand it at all, despite having invented the damn algorithm!
Having said that, picking it apart and taking a little time with it, it actually wasn't that hard - but it sure looked scary and incomprehensible at first!
Maybe for von Neumann math was simple...
But my experience as a mathematician tells me another part of that story.
Certain fields are much more used to consuming (and producing) visual noise in their notation!
Some fields even have superfluous parts in their definitions and keep them around out of tradition.
It's just as with code: not everyone values writing readable code highly. Some are fine with 200-line function bodies.
And refactoring mathematics is even harder: There's no single codebase and the old papers don't disappear.
https://proceedings.mlr.press/v137/kavalerov20a/kavalerov20a...
It turns out that two classes is a special case. It's better to add the classes as side information than to try to make them part of the main objective.
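For a concrete picture of "classes as side information": condition the discriminator on the label directly instead of adding a classification term to the objective. A rough sketch in PyTorch, in the projection-discriminator style; all layer names and sizes here are made up for illustration:

    import torch
    import torch.nn as nn

    class ConditionalD(nn.Module):
        # Discriminator that receives the class label as side information
        # rather than being asked to classify the sample itself.
        def __init__(self, in_dim=784, feat_dim=128, n_classes=10):
            super().__init__()
            self.features = nn.Sequential(nn.Linear(in_dim, feat_dim),
                                          nn.LeakyReLU(0.2))
            self.embed = nn.Embedding(n_classes, feat_dim)  # label -> feature vector
            self.out = nn.Linear(feat_dim, 1)

        def forward(self, x, y):
            h = self.features(x.flatten(1))
            # Projection conditioning: inner product of features and label embedding.
            logit = self.out(h) + (h * self.embed(y)).sum(1, keepdim=True)
            return torch.sigmoid(logit)

The label only steers the real/fake decision; it never becomes a second classification objective competing with the main one.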
colesantiago•5h ago
Last time I used a GAN was in 2015, still interesting to see a post about GANs now and then.
black_puppydog•5h ago
GANs were fun though. :)
aDyslecticCrow•3h ago
Though if you can rephrase the problem as a diffusion, that seems to be preferred these days (less prone to mode collapse; see the sketch below).
GANs are famously used for generative use cases, but they are also widely used for creating useful latent spaces with limited data, and they show up in few-shot-learning papers. (I'm actually not that up to speed on the state of the art in few-shot, so maybe they have something clever that replaces it.)
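For contrast with the adversarial setup above: the diffusion rephrasing replaces the two-player game with a single regression loss (predict the noise), which is a big part of why mode collapse is less of an issue. A loose DDPM-style sketch; `eps_model` and the noise schedule `alphas_cumprod` are assumptions, and `x0` is assumed to be a flattened batch:

    import torch
    import torch.nn.functional as F

    def diffusion_loss(eps_model, x0, alphas_cumprod):
        # Sample a random timestep per example and noise the clean batch x0.
        t = torch.randint(0, len(alphas_cumprod), (x0.size(0),))
        a_bar = alphas_cumprod[t].view(-1, 1)
        eps = torch.randn_like(x0)
        x_t = a_bar.sqrt() * x0 + (1 - a_bar).sqrt() * eps
        # Plain regression toward the injected noise: no discriminator, no minimax.
        return F.mse_loss(eps_model(x_t, t), eps)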
gchadwick•3h ago
Yes, you can just concentrate on the latest models, but if you want a better grounding in the field, some understanding of the past is important. In particular, reusing ideas from the past in a new way and/or with better software/hardware/datasets is a common source of new developments!