Jürgen Schmidhuber: the Father of Generative AI Without Turing Award

http://www.jazzyear.com/article_info.html?id=1352
60•kleiba•4h ago

Comments

nharada•2h ago
Doesn’t he know the Turing Award is really just a generalization of the Fields Medal, an award that actually came years earlier?
logicchains•2h ago
I'm sure he wouldn't object to a Fields Medal either.
triceratops•1h ago
I chuckled but I also maybe didn't understand. Is the joke that computer science is a generalization of math? That can't be right.
dgacmu•56m ago
No, the joke is that Schmidhuber is known for (rightly or wrongly) pointing to modern contributions in deep neural networks and saying they're just a trivial generalization/adaptation/etc. of work he did 30 years ago.
belval•2h ago
Every so often Schmidhuber is brought back to the front page of HN; people will argue that he "invented it all" while others will say that he's a posteriori claiming all the good ideas were his.

Relativity Priority Dispute: https://en.wikipedia.org/wiki/Relativity_priority_dispute

We all stand on the shoulders of giants, things can be invented and reinvented and ideas can appear twice in a vacuum.

kleiba•2h ago
But as far as I understand, Schmidhuber's claim is more severe: namely that Bengio, Hinton and LeCun intentionally failed to cite prior work by others (including himself) and instead only cited each other in order to boost their respective scientific reputations.

I personally think that he's not doing himself or his argument any favors by presenting it the way he does. While he basically argues that science should be totally objective and neutral, there's no denying that if you put yourself in a less likeable light, you're not going to make any friends.

On the other hand, he's gone to great lengths to compile detailed references to support his points. I can appreciate that because it makes his argument a lot less hand-wavey: you can go to his blog and compare the cited references yourself. Except that I couldn't, because I'm not an ML expert.

jll29•1h ago
I have seen many cases where people -- accidentally as well as intentionally -- copied or re-invented the work of others (a friend posted on LinkedIn that someone else plagiarized his whole Ph.D. thesis, including the title, so only the name was changed; only the references at the end, to separately published papers on which the individual chapters were based, still had my friend's name in them, so you could see it was a fake thesis).

If a bona fide scientist makes an attribution mistake, they correct it as soon as possible. Many, however, would not correct such a re-discovery, because it's embarrassing.

But the worst is when people don't even imagine that anything like what they are working on could already exist, and they don't even bother finding and reading related work -- in other words, ignorance. Science deserves better, but there are more and more ignorant folks around who want to ignore all work before them.

godelski•45m ago

  > Many, however, would not correct such a re-discovery, because it's embarrassing.
This is a culture thing that needs to change.

I'm a pretty big advocate of open publishing and avoiding the "review process" as it stands today. The reason is that we shouldn't be chasing these notions of novelty and "impact". They are inherently subjective and lead to these issues of credit. Your work isn't diminished because you independently invented it; rather, that strengthens your work. There's more evidence! Everything is incremental, and so all this stuff does is make us focus more on trying to show our uniqueness rather than showing our work. The point of publishing is to communicate. The peer review process only happens post communicating: when people review, replicate, build on, or build against. We're just creating an overly competitive environment. It is only "embarrassing" because it "undermines" the work. It only "undermines" the work because of how we view credit.

Consider this as a clear example. Suppose you want to revisit a work but just scale it up and use it on modern hardware. You could get state-of-the-art results, but if you admit to such a thing with no claimed changes (let's say you literally just increase the number of layers) you'll never get published. You'll get responses about how we "already knew this" and "obviously it scales". But no one tested it... right? That's just bad for science. It's bad if we can't do mundane boring shit.
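
To make that hypothetical concrete: the entire "contribution" could be a single hyperparameter. A toy sketch, where the model, widths, and layer counts are made up purely for illustration:

  # Toy sketch only: the "old" and "scaled-up" runs differ in nothing but depth.
  import torch.nn as nn

  def make_mlp(n_layers: int, width: int = 512) -> nn.Sequential:
      layers = []
      for _ in range(n_layers):
          layers += [nn.Linear(width, width), nn.ReLU()]
      return nn.Sequential(*layers)

  old_model = make_mlp(n_layers=3)    # stand-in for the original, decades-old setup
  new_model = make_mlp(n_layers=96)   # "novel contribution": more layers + modern hardware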

Lerc•2h ago
I can see how someone could feel like that if they looked at the world in a particular way.

I have had plenty of ideas in the last few years that I have played with that I have seen published in papers in the following months. Rather than feeling like "I did it first" I feel gratified that not only was I on the right track, but someone else has done the hard slog.

Most papers are not published by people who had the idea the day before. Their work goes back further than that. Refining the idea, testing it and then presenting the results takes time, sometimes years, occasionally decades.

If this happens to you, don't think "Hey! That idea belongs to me!". Thank them for proving you right.

Now if they patent it, that's a different story. I don't think the ideas that sometimes float through my brain belong to me, but I'm not keen on them belonging to someone else either.

kleiba•2h ago
I think that's slightly misrepresenting Schmidhuber's case though, because he does not just say "oh, I already had that same idea before you, I just never followed up on it". He is usually referring to work that he or members of his group (or third-party researchers for that matter) did, in fact, already publish.
Kranar•16m ago
This claim is sheer hubris. There is a big difference between spending years researching and working on a nascent subject and publishing it in academic journals, only to have someone else come along, use most of your ideas without attribution, and reap huge rewards for doing so... and having some random ideas float about in your head on a nice summer afternoon by the lake, and then a few months later finding out that someone else also shared those same ideas and managed to work them into something fruitful.

Now whether what Schmidhuber claims is what actually happened or not I don't know... but that is his claim and it's fundamentally different from what you are describing.

Mond_•2h ago
Oh boy, I sure can't wait to see the comments on this one!

Schmidhuber sure seems to be a personality, and so far I've mostly heard negative things about his "I invented this" attitude to modern research.

kleiba•2h ago
A lot of this is because nobody likes braggers - however, in all fairness, his argument is that a lot of what is considered modern ML is based on many previous results, including but not limited to his own research.
goldemerald•2h ago
No discussion with Schmidhuber is complete without the infamous debate at NIPS 2016 https://youtu.be/HGYYEUSm-0Q?t=3780 . One of my goals as an ML researcher is to publish something and have Schmidhuber claim he's already done it.

But more seriously, I'm not a fan of Schmidhuber because even if he truly did invent all this stuff early in the 90s, his inability to see its application to modern compute held the field back by years. In principle, we could have had GANs and self-supervised models years earlier if he had "revisited his early work". It's clear to me no one read his early papers when developing GANs/self-supervision/transformers.

andy99•2h ago
It's very common in science for people to have had results whose significance they didn't understand, which were later popularized by someone else.

There is the whole thing with Damadian claiming to have invented MRI (he didn't) when the Nobel prize went to Mansfield and Lauterbur (see the Nobel prize part of the article). https://en.m.wikipedia.org/wiki/Paul_Lauterbur

And I've seen other less prominent examples.

It's a lot like the difference between ideas and execution and people claiming someone "stole" their idea because they made a successful business from it.

nextos•1h ago
I think he did understand both the significance of his work and the importance of hardware. His group pioneered porting models to GPUs.

But personal circumstances matter a lot. He was stuck at IDSIA in Lugano, i.e. a relatively small and not-so-well-funded academic institute.

He could have done much better in industry, with access to lots of funding, a bigger headcount, and serious infrastructure.

Ultimately, models matter much less than infrastructure. Transformers are not that important; other architectures such as deep SSMs or xLSTM are able to achieve comparable results.

godelski•1h ago

  > if he had "revisited his early work".
Given that you're a researcher yourself, I'm surprised by this comment. Have you not yourself experienced the harsh rejection of "not novel"? That sounds like a great way to get stuck in review hell. (I know I've experienced this even when doing novel things, just by relating them too closely to other methodologies when explaining: "oh, it's just ____".)

The other part seems weird too. Who isn't upset when their work doesn't get recognized and someone else gets the credit? Are we not all human?

cma•35m ago
His group actually used GPUs early on (earlier than most) and won a competition, but didn't get the same press.
mindcrime•1h ago
I'll probably get flamed to death for saying this, but I like Jürgen. I mean, I don't know him in person (never met him) but I've seen a lot of his written work and interviews and what-not and he seems like an alright guy to me. Yes, I get it... there's that whole "ooooh, Jürgen is always trying to claim credit for everything" thing and all. But really, to me, it doesn't exactly come off that way. Note that he's often pointing out the lack of credit assigned even to people who lived and died centuries before him.

His "shtick" to me isn't just about him saying "people didn't give me credit" but it seems more "AI people in general haven't credited the history of the field properly." And in many cases he seems to have a point.

noosphr•1h ago
It's a clash of cultures.

He is an academic who cares about understanding where ideas came from. His detractors need to be the smartest people in the room to get paid millions and raise billions.

It's not very sexy to say "Oh yes, we are just using an old Soviet learning algorithm on better hardware. Turns out we would have lost the Cold War if the USSR had access to a 5090," which won't get you the billions you need to build the supercomputers that push the state of the art today.

godelski•1h ago
I think you sum up my feelings about him as well. He's a bit much sometimes but it's hard to deny that he's made monumental contributions to the field.

It's also funny that we laugh at him when we also have a joke that in AI we just reinvent what people did in the 80's. He's just the person being more specific as to what and who.

Ironically, I think the problem is we care too much about credit. It ends up getting hoarded rather than shared. We then just oversell our contributions, because if you make the incremental improvements that literally everyone does, you get your work rejected for being incremental.

I don't know what it is about CS specifically, but we have a culture problem of attribution and hype. From building on open source, where it's libraries all the way down but we act like we did it all alone, to jumping on bandwagons as if there's a right and immutable truth to how to do certain things, until the bubbles pop and we laugh at how stupid anyone was to do such a thing. Yet we don't contribute back to the projects that form our foundation, we laugh at "theory" which we stand on, and we listen to the same hype-train people who got it wrong last time instead of turning to those who got it right. Why? It goes directly counter to the ideas of a group who love to claim rationalism, "working from first principles", and "I care what works".

voidhorse•12m ago
> we laugh at "theory" which we stand on

This aspect of the industry really annoys me to no end. People in this field are so allergic to theory (which is ironic because CS, of all fields, is probably one of the ones in which theoretical investigations are most directly applicable) that they'll smugly proclaim their own intelligence and genius while showing you a pet implementation of ideas that have been around since the 70s or earlier. Sure, most of the time they implement it in a new context, but this leads to a fragmented language in which the same core ideas are implemented N times with everyone's own idiosyncratic terminology choices (see, for example, the wide array of names for basic functional data structure primitives like map, fold, etc. that abound across languages).
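
To pick one small, concrete instance of that fragmentation, here are the same two primitives written in Python, with a few of the other names they travel under noted in the comments (the values are arbitrary examples):

  # The same old primitives under one of their many names (Python's).
  # fold: Haskell foldl, Ruby inject, C# Aggregate, JS Array.prototype.reduce ...
  from functools import reduce

  total   = reduce(lambda acc, x: acc + x, [1, 2, 3, 4], 0)  # fold -> 10
  doubled = list(map(lambda x: 2 * x, [1, 2, 3]))            # map  -> [2, 4, 6]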

FL33TW00D•1h ago
If you guys were the inventors of Facebook, you’d have invented Facebook
noosphr•1h ago
Before people say that he is claiming credit for things he didn't do, or that he invented everything, please read his own paper on the subject:

https://people.idsia.ch/~juergen/deep-learning-history.html

The history section starts in 1676.

swyx•1h ago
> 1676: The Chain Rule For Backward Credit Assignment

Schmidhuber is nothing if not a stickler for backward credit assignment
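
For readers outside ML, the pun lands because backpropagation is just the chain rule applied layer by layer: the "credit" (gradient) a parameter receives is a product of local derivatives flowing backward through the network. In the usual notation, with h_i the layer activations, theta_k the parameters of layer k, and L the loss:

  \frac{\partial L}{\partial \theta_k}
    = \frac{\partial L}{\partial h_n}
      \frac{\partial h_n}{\partial h_{n-1}} \cdots
      \frac{\partial h_{k+1}}{\partial h_k}
      \frac{\partial h_k}{\partial \theta_k}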

ur-whale•1h ago
If there ever was an example of a terrible personality getting in the way of a career, Schmidhuber is most definitely it.

You may have had many brilliant ideas, but when everyone makes an abrupt 180 the moment they see the tip of your beard turn the corner at conferences, that can't be a good signal for getting awards.

banq•53m ago
Why did Google give birth to the Transformer? Because Google created an ecosystem where everything could flourish, while the old man in Switzerland lacked such an environment. What could even the smartest and greatest individual do against that?

As an organization, fostering an organically growing context is like governing a great nation with delicate care. A bottom-up (organically growing) environment is the core context for sustained innovation and development!

esafak•43m ago
The fact that the field of machine learning keeps "discovering" things already established in other fields and christening them with new names does lend some credence to Schmidhuber. The field is more industrial than academic; it cares about money more than credit, and industrial-scale data theft is all in a day's work.

As another commenter said, his misfortune is being in a lab with no industrial affiliation.

voidhorse•21m ago
I haven't read the article or paper yet, but if the gist I'm getting from the comments is correct, Schmidhuber is generally correct about industry having horrible citation practices. I even see it at a small scale at work. People often fail or forget to mention the others who helped them generate their ideas.

I would not be at all surprised if this behavior extended to research papers published by people in industry as opposed to academia. Good citation practice simply does not exist in industry. We're lucky if any of the thousand blog posts that reimplement some idea cranked out ages ago in academic circles are even aware of the original effort, let alone cite it. Citations are few and far between in industry literature generally. Obviously there are exceptions and this is just my personal observation; I haven't done or found any kind of meta-study of the literature illustrating such.

React Zero-UI: Instant UI updates, ZERO re-renders, ZERO runtime

https://github.com/Austin1serb/React-Zero-UI
1•handfuloflight•16s ago•0 comments

Verlet Integration and Cloth Physics Simulation

https://pikuma.com/blog/verlet-integration-2d-cloth-physics-simulation
1•atan2•39s ago•0 comments

Rexx Free Tools List

https://rexxinfo.org/downloads/tools_list.html
1•ohjeez•2m ago•0 comments

Wayne County overpays employee $1.6M in single paycheck, 2 fired

https://www.wxyz.com/news/local-news/investigations/wayne-county-overpays-employee-1-6m-in-single-paycheck-2-fired-for-big-mistake
1•rmason•2m ago•1 comments

Show HN: TypeScript MCP Server

https://github.com/screencam/typescript-mcp-server
1•johnwheeler•3m ago•0 comments

LLMs don't understand my code, cuz I didn't comment

1•fdlinly•3m ago•0 comments

Meta CTO Bosworth says OpenAI countered lucrative job offers to employees

https://www.cnbc.com/2025/06/20/metas-andrew-bosworth-says-openai-countered-lucrative-job-offers.html
1•mfiguiere•9m ago•0 comments

Fix "pulsing" sensation when charging MacBook

https://old.reddit.com/r/apple/comments/1lgaw7m/psa_if_when_charging_your_macbook_you_get_a/
1•miles•9m ago•0 comments

The Death of the Student Essay–and the Future of Cognition

https://www.forkingpaths.co/p/the-death-of-the-student-essayand
1•colinprince•11m ago•0 comments

AtomicOS – A security-first OS with real crypto and deterministic language

https://github.com/ipenas-cl/AtomicOs
1•ipenas-cl•11m ago•0 comments

Programmatic SEO: Scaling Organic Traffic with Automated, Data-Driven Pages

https://guptadeepak.com/the-complete-guide-to-programmatic-seo/
1•guptadeepak•17m ago•1 comments

With only 8% built, Texas defunds state border wall program

https://www.texastribune.org/2025/06/17/texas-border-wall-funding-ends-abbott-trump/
5•geox•18m ago•2 comments

Show HN: Track Budget – A simple, powerful personal finance tracker

https://trackbudget.com
1•tomkolron•18m ago•0 comments

The FPGA Turns 40!

https://www.adiuvoengineering.com/post/the-fpga-turns-40
1•voxadam•20m ago•0 comments

Running and storing 3M LLM AI requests without spending $100k

https://boyter.org/posts/three-million-llm-requests/
2•todsacerdoti•22m ago•0 comments

Micro Live Hacking Incident – BBC2 (1983) [video]

https://www.youtube.com/watch?v=QjLSViyLnRk
1•dijksterhuis•22m ago•0 comments

Show HN: Tree-hugger-JS: CSS selectors for JavaScript AST analysis and MCP

2•chw9e•28m ago•0 comments

AI sceptic Emily Bender: 'The emperor has no clothes'

https://www.ft.com/content/9029cc1c-4a3f-42ca-9939-f3ef8e8336ae
2•zahirbmirza•28m ago•3 comments

AI and the existential question about language

https://trace.yshui.dev/2025-06-ai-language.html
3•yshui•35m ago•0 comments

Drinks in glass bottles contain more microplastics than those in other container

https://www.anses.fr/en/content/drinks-glass-bottles-contain-more-microplastics-those-other-containers
4•Zealotux•38m ago•0 comments

Data driven home purchase community

https://www.realrealchat.com/
2•xcabel•40m ago•0 comments

From Bytes to Ideas: Language Modeling with Autoregressive U-Nets

https://arxiv.org/abs/2506.14761
3•Anon84•41m ago•0 comments

Exercise-induced CLCF1 attenuates age-related muscle and bone decline in mice

https://www.nature.com/articles/s41467-025-59959-w
2•gnabgib•42m ago•0 comments

MCP vs. A2A (In 6 Minutes)

https://supabase.manatee.work/storage/v1/object/public/videos/ede034cb-5de9-4057-839e-e93061f0f7c7.mp4
2•_josh_meyer_•49m ago•0 comments

They Trusted ChatGPT to Plan Their Hike – and Ended Up Calling for Rescue

https://thetrek.co/they-trusted-chatgpt-to-plan-their-hike-and-ended-up-calling-for-rescue/
5•speckx•49m ago•0 comments

BitVM3: Efficient Computation on Bitcoin [pdf]

https://bitvm.org/bitvm3.pdf
1•wslh•49m ago•0 comments

Study: Meta AI model can reproduce almost half of Harry Potter book

https://arstechnica.com/features/2025/06/study-metas-llama-3-1-can-recall-42-percent-of-the-first-harry-potter-book/
2•eyegor•50m ago•0 comments

Palantir: Profits, Power and the Kill Machine

https://citizensreunited.substack.com/p/inside-palantir-profits-power-and
3•terabytest•51m ago•0 comments

Akkurat Typeface

https://lineto.com/typefaces/akkurat/
1•handfuloflight•53m ago•0 comments

Show HN: Prpolish, a CLI that uses AI to write and review your GitHub PRs

https://github.com/yashg4509/prpolish
1•yashg4509•55m ago•0 comments