
Andrej Karpathy: Software in the era of AI [video]

https://www.youtube.com/watch?v=LCEmiRjPEtQ
191•sandslash•4h ago

Comments

gchamonlive•3h ago
I think it's interesting to juxtapose traditional coding, neural network weights and prompts because in many areas -- like the example of the self driving module having code being replaced by neural networks tuned to the target dataset representing the domain -- this will be quite useful.

However I think it's important to make it clear that given the hardware constraints of many environments the applicability of what's being called software 2.0 and 3.0 will be severely limited.

So instead of being replacements, these paradigms are more like extra tools in the tool belt. Code and prompts will live side by side, being used when convenient, but none a panacea.

karpathy•1h ago
I kind of say it in words (agreeing with you), but I agree the versioning is a bit of a confusing analogy because versioning usually implies some kind of improvement, when I'm just trying to distinguish them as very different software categories.
miki123211•43m ago
What do you think about structured outputs / JSON mode / constrained decoding / whatever you wish to call it?

To me, it's a criminally underused tool. While "raw" LLMs are cool, they're annoying to use as anything but chatbots, as their output is unpredictable and basically impossible to parse programmatically.

Structured outputs solve that problem neatly. In a way, they're "neural networks without the training". They can be used to solve similar problems as traditional neural networks, things like image classification or extracting information from messy text, but all they require is a Zod or Pydantic type definition and a prompt. No renting GPUs, labeling data and tuning hyperparameters necessary.

They often also improve LLM performance significantly. Imagine you're trying to extract calories per 100g of product, but some products give you calories per serving and a serving size, calories per pound, etc. The naive way to do this is a prompt like "give me calories per 100g", but that forces the LLM to do arithmetic, and LLMs are bad at arithmetic. With structured outputs, you just give it the fifteen different formats that you expect to see as alternatives, and use some simple Python to turn them all into calories per 100g on the backend.
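A minimal sketch of what the comment describes, using a Pydantic discriminated union: the schema names (`PerHundredGrams`, `PerServing`, etc.) and the three-variant union are illustrative assumptions, not anything from the talk. The model fills in whichever variant matches the label; plain Python then normalizes to per-100g.

```python
from typing import Literal, Union

from pydantic import BaseModel, Field


class PerHundredGrams(BaseModel):
    kind: Literal["per_100g"]
    kcal_per_100g: float


class PerServing(BaseModel):
    kind: Literal["per_serving"]
    kcal_per_serving: float
    serving_size_g: float


class PerPound(BaseModel):
    kind: Literal["per_pound"]
    kcal_per_pound: float


class CalorieInfo(BaseModel):
    # The LLM picks whichever variant matches the label text;
    # the "kind" tag makes the union unambiguous to parse.
    value: Union[PerHundredGrams, PerServing, PerPound] = Field(discriminator="kind")


def kcal_per_100g(info: CalorieInfo) -> float:
    """Normalize any variant to calories per 100g -- no LLM arithmetic needed."""
    v = info.value
    if isinstance(v, PerHundredGrams):
        return v.kcal_per_100g
    if isinstance(v, PerServing):
        return v.kcal_per_serving * 100.0 / v.serving_size_g
    return v.kcal_per_pound * 100.0 / 453.592  # 1 lb = 453.592 g


# Simulated model output for a label that lists calories per serving:
raw = '{"value": {"kind": "per_serving", "kcal_per_serving": 250, "serving_size_g": 50}}'
print(round(kcal_per_100g(CalorieInfo.model_validate_json(raw)), 1))  # 500.0
```

The LLM only has to classify which format it sees and copy numbers out of the text; the deterministic conversion lives in code, which is exactly the division of labor the comment argues for.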

nico•3h ago
Thank you YC for posting this before the talk became deprecated[1]

1: https://x.com/karpathy/status/1935077692258558443

sandslash•2h ago
We couldn't let that happen!
jppope•2h ago
Well that showed up significantly faster than they said it would.
seneca•1h ago
Classic under promise and over deliver.

I'm glad they got it out quickly.

dang•1h ago
Me too. It was my favorite talk of the ones I saw.
dang•1h ago
The team adapted quickly, which is a good sign. I believe getting the videos out sooner (as in why-not-immediately) is going to be a priority in the future.
anythingworks•2h ago
loved the analogies! Karpathy is consistently one of the clearest thinkers out there.

interesting that Waymo could do uninterrupted trips back in 2013, wonder what took them so long to expand? regulation? tail end of driving optimization issues?

noticed one of the slides had a cross over 'AGI 2027'... ai-2027.com :)

AlotOfReading•2h ago
You don't "solve" autonomous driving as such. There's a long, slow grind of gradually improving things until failures become rare enough.
petesergeant•2h ago
I wonder at what point all the self-driving code becomes replaceable with a multimodal generalist model with the prompt “drive safely”
AlotOfReading•2h ago
One of the issues with deploying models like that is the lack of clear, widely accepted ways to validate comprehensive safety and absence of unreasonable risk. If that can be solved, or regulators start accepting answers like "our software doesn't speed in over 95% of situations", then they'll become more common.
anon7000•1h ago
Very advanced machine learning models are used in current self driving cars. It all depends what the model is trying to accomplish. I have a hard time seeing a generalist prompt-based generative model ever beating a model specifically designed to drive cars. The models are just designed for different, specific purposes
tshaddox•26m ago
I could see it being the case that driving is a fairly general problem, and that models intentionally designed to be general end up doing better than models designed with the misconception that you need a very particular set of driving-specific capabilities.
anythingworks•5m ago
exactly! I think that was tesla's vision with self-driving to begin with... so they tried to frame it as a problem general enough that solving it would also solve questions of more general intelligence ('agi'), i.e. cars should use vision just like humans would

but in hindsight looks like this slowed them down quite a bit despite being early to the space...

ActorNightly•13m ago
> Karpathy is consistently one of the clearest thinkers out there.

Eh, he ran Tesla's self-driving division and put them in a direction that is never going to fully work.

What they should have done is a) trained a neural net to map sequences of frames into a representation of the physical environment, and b) leveraged MuZero, so that the self-driving system basically builds out parallel simulations into the future and searches for the best course of action to take.

Because that's pretty much what makes humans great drivers. We don't need to know what a cone is - we internally compute that an object on the road we are driving towards is going to result in a negative outcome when we collide with it.
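The planning loop the commenter describes can be sketched in miniature. This is a toy stand-in, not MuZero: the "learned model" is a hand-coded one-lane-world `simulate` function, and the search is exhaustive rollout rather than MCTS; the lane/obstacle setup is entirely invented for illustration.

```python
import itertools

LANE_COUNT = 3      # positions 0..2
OBSTACLE_LANE = 1   # a "cone" sits in lane 1

def simulate(lane: int, action: int) -> tuple[int, float]:
    """Stand-in for a learned dynamics model: (state, action) -> (next_state, reward)."""
    next_lane = min(max(lane + action, 0), LANE_COUNT - 1)
    reward = -10.0 if next_lane == OBSTACLE_LANE else 0.0
    reward -= 0.1 * abs(action)  # small cost for steering
    return next_lane, reward

def plan(lane: int, horizon: int = 3) -> int:
    """Roll the model forward over every action sequence up to `horizon`
    (steer left / stay / steer right) and return the best first action."""
    best_action, best_return = 0, float("-inf")
    for seq in itertools.product((-1, 0, 1), repeat=horizon):
        cur, total = lane, 0.0
        for a in seq:
            cur, r = simulate(cur, a)
            total += r
        if total > best_return:
            best_action, best_return = seq[0], total
    return best_action

print(plan(1))  # -1: steer out of the obstacle lane
```

The point of the MuZero framing is that `simulate` would be a network trained from experience rather than hand-written rules, so the planner never needs a symbolic concept of "cone", only predicted outcomes of candidate futures.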

AIorNot•2h ago
Love his analogies and clear eyed picture
pyman•2h ago
"We're not building Iron Man robots. We're building Iron Man suits"
reducesuffering•2h ago
[flagged]
throwawayoldie•2h ago
I'm old enough to remember when Twitter was new, and for a moment it felt like the old utopian promise of the Internet finally fulfilled: ordinary people would be able to talk, one-on-one and unmediated, with other ordinary people across the world, and in the process we'd find out that we're all more similar than different and mainly want the same things out of life, leading to a new era of peace and empathy.

It was a nice feeling while it lasted.

_kb•1h ago
Believe it or not, humans did in fact have forms of written language and communication prior to twitter.
dang•1h ago
Can you please make your substantive points without snark? We're trying for something a bit different here.

https://news.ycombinator.com/newsguidelines.html

throwawayoldie•47m ago
You missed the point, but that's fine, it happens.
tock•47m ago
I believe the opposite happened. People found out that there are huge groups of people whose views on morality differ wildly from their own, and that just encouraged more hate. I genuinely think old-school Facebook, where people only interacted with their own private friend circles, was better.
pryelluw•2h ago
Funny thing is that in more than one of the Iron Man movies the suits end up being bad robots. Even the AI Iron Man made shows up to ruin the day in the Avengers movie. So it's a little on the nose that they'd try to pitch it this way.
AdieuToLogic•2h ago
It's an interesting presentation, no doubt. The analogies eventually fail as analogies usually do.

A recurring theme presented, however, is that LLM's are somehow not controlled by the corporations which expose them as a service. The presenter made certain to identify three interested actors (governments, corporations, "regular people") and how LLM offerings are not controlled by governments. This is a bit disingenuous.

Also, the OS analogy doesn't make sense to me. Perhaps this is because I do not subscribe to LLM's having reasoning capabilities nor able to reliably provide services an OS-like system can be shown to provide.

A minor critique regarding the analogy equating LLM's to mainframes:

  Mainframes in the 1960's never "ran in the cloud" as it did
  not exist.  They still do not "run in the cloud" unless one
  includes simulators.

  Terminals in the 1960's - 1980's did not use networks.  They
  used dedicated serial cables or dial-up modems to connect
  either directly or through stat-mux concentrators.

  "Compute" was not "batched over users."  Mainframes either
  had jobs submitted and ran via operators (indirect execution)
  or supported multi-user time slicing (such as found in Unix).
furyofantares•1h ago
> The presenter made certain to identify three interested actors (governments, corporations, "regular people") and how LLM offerings are not controlled by governments. This is a bit disingenuous.

I don't think that's what he said, he was identifying the first customers and uses.

AdieuToLogic•55m ago
>> A recurring theme presented, however, is that LLM's are somehow not controlled by the corporations which expose them as a service. The presenter made certain to identify three interested actors (governments, corporations, "regular people") and how LLM offerings are not controlled by governments. This is a bit disingenuous.

> I don't think that's what he said, he was identifying the first customers and uses.

The portion of the presentation I am referencing starts at or near 12:50[0]. Here is what was said:

  I wrote about this one particular property that strikes me
  as very different this time around.  It's that LLM's like
  flip they flip the direction of technology diffusion that
  is usually present in technology.

  So for example with electricity, cryptography, computing,
  flight, internet, GPS, lots of new transformative that have
  not been around.

  Typically it is the government and corporations that are
  the first users because it's new expensive etc. and it only
  later diffuses to consumer.  But I feel like LLM's are kind
  of like flipped around.

  So maybe with early computers it was all about ballistics
  and military use, but with LLM's it's all about how do you
  boil an egg or something like that.  This is certainly like
  a lot of my use.  And so it's really fascinating to me that
  we have a new magical computer it's like helping me boil an
  egg.

  It's not helping the government do something really crazy
  like some military ballistics or some special technology.
Note the identification of historic government interest in computing along with a flippant "regular person" scenario in the context of "technology diffusion."

You are right in that the presenter identified "first customers", but this is mentioned in passing when viewed in context. Perhaps I should not have characterized this as "a recurring theme." Instead, a better categorization might be:

  The presenter minimized the control corporations have by
  keeping focus on governmental topics and trivial customer
  use-cases.
0 - https://youtu.be/LCEmiRjPEtQ?t=770
distalx•43m ago
Hang in there! Your comment makes some really good points about the limits of analogies and the real control corporations have over LLMs.

Plus, your historical corrections were spot on. Sometimes, good criticisms just get lost in the noise online. Don't let it get to you!

wjohn•1h ago
The comparison of our current methods of interacting with LLMs (back-and-forth text) to old-school terminals is pretty interesting. I think there's still a lot of work to be done to optimize how we interact with these models, especially for non-dev consumers.
nodesocket•1h ago
llms.txt makes a lot of sense, especially for LLMs to interact with http APIs autonomously.

Seems like you could set an LLM loose and, like the Google bot, have it start converting all HTML pages into llms.txt. Man, the future is crazy.
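The mechanical core of that idea can be sketched with the stdlib alone. To be clear, the real llms.txt proposal is a curated markdown summary, not auto-extracted text; this is just a crude first pass that strips markup and chrome-like tags, with the tag skip-list being my own assumption.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping script/style/nav/footer, as a rough llms.txt draft."""
    SKIP = {"script", "style", "nav", "footer"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth and data.strip():
            self.chunks.append(data.strip())

def html_to_llms_txt(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    return "\n".join(parser.chunks)

page = "<html><head><style>body{}</style></head><body><h1>Docs</h1><p>Use the API.</p></body></html>"
print(html_to_llms_txt(page))  # Docs\nUse the API.
```

An LLM in the loop would go beyond this by summarizing and prioritizing rather than just stripping tags, which is where the "set it loose like the Google bot" speculation comes in.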

nothrabannosir•42m ago
Couldn’t believe my eyes. The www is truly bankrupt. If anyone has a browser plugin which automatically redirects to llms.txt sign me up.

Website too confusing for humans? Add more design, modals, newsletter pop ups, cookie banners, ads, …

Website too confusing for LLMs? Add an accessible, clean, ad-free, concise, high entropy, plain text summary of your website. Make sure to hide it from the humans!

PS: it should be /.well-known/llms.txt but that feels futile at this point..

PPS: I enjoyed the talk, thanks.

andrethegiant•34m ago
> If anyone has a browser plugin which automatically redirects to llms.txt sign me up.

Not a browser plugin, but you can prefix URLs with `pure.md/` to get the pure markdown of that page. It's not quite a 1:1 to llms.txt as it doesn't explain the entire domain, but works well for one-off pages. [disclaimer: I'm the maintainer]

practal•23m ago
If you have different representations of the same thing (llms.txt / HTML), how do you know it is actually equivalent to each other? I am wondering if there are scenarios where webpage publishers would be interested in gaming this.
dang•1h ago
This was my favorite talk at AISUS because it was so full of concrete insights I hadn't heard before and (even better) practical points about what to build now, in the immediate future. (To mention just one example: the "autonomy slider".)

If it were up to me, which it very much is not, I would try to optimize the next AISUS for more of this. I felt like I was getting smarter as the talk went on.

sneak•1h ago
Can we please stop standardizing on putting things in the root?

/.well-known/ exists for this purpose.

example.com/.well-known/llms.txt

https://en.m.wikipedia.org/wiki/Well-known_URI
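Resolving the well-known location the comment advocates is a one-liner per RFC 8615: the file lives at a fixed path on the origin, regardless of which page you start from. A minimal sketch:

```python
from urllib.parse import urlsplit, urlunsplit

def well_known_llms_txt(url: str) -> str:
    """Map any page URL to its origin's /.well-known/llms.txt (RFC 8615 convention)."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, "/.well-known/llms.txt", "", ""))

print(well_known_llms_txt("https://example.com/docs/page?x=1"))
# https://example.com/.well-known/llms.txt
```

The advantage over a bare `/llms.txt` is exactly the one the registry exists for: `/.well-known/` is reserved, so site authors can't collide with it by accident.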

andrethegiant•43m ago
https://github.com/AnswerDotAI/llms-txt/issues/2
mikewarot•8m ago
A few days ago, I was introduced to the idea that when you're vibe coding, you're consulting a "genie": much like in the fables, you almost never get exactly what you asked for, but if your wishes are small, you might just get what you want.
fudged71•4m ago
“You are an expert 10x software developer. Make me a billion dollar app.” Yeah this checks out

Gurney Halleck from Dune tells you to do stuff

https://gurney-gurney-halleck.fly.dev/
1•big_cloud_bill•5m ago•0 comments

June 2025 C2PA News

https://www.tbray.org/ongoing/When/202x/2025/06/17/More-C2PA
1•zdw•7m ago•0 comments

This month in Servo: color inputs, SVG, embedder JavaScript, and more

https://servo.org/blog/2025/06/18/this-month-in-servo/
1•CharlesW•11m ago•0 comments

Triaging security issues reported by third parties

https://gitlab.gnome.org/GNOME/libxml2/-/issues/913
2•zdw•12m ago•0 comments

SpaceX Ship 36 RUDs During Testing

https://www.youtube.com/watch?v=WKwWclAKYa0
2•codewiz•13m ago•0 comments

Stinging Tree

https://en.wikipedia.org/wiki/Dendrocnide_moroides
1•lend000•15m ago•0 comments

Deepwiki

https://deepwiki.com/
1•handfuloflight•16m ago•0 comments

How My AI Free Commitment Challenge Is Going

https://ishayirashashem.substack.com/p/aichallengeforreddit
1•paulpauper•18m ago•0 comments

The Megaproject Economy

https://www.palladiummag.com/2025/06/01/the-megaproject-economy/
1•paulpauper•19m ago•0 comments

Death to WYSIWYG!

https://ratfactor.com/htmlwarden/death-to-wysiwyg
6•zdw•19m ago•0 comments

Show HN: I created a data enrichment guide

https://chatgpt.com/g/g-684f893ae7c881919f210399e940105a-data-enrichment-guide-and-expert
1•arcknighttech•20m ago•0 comments

Writing Toy Software Is a Joy

https://blog.jsbarretto.com/post/software-is-joy
2•zdw•21m ago•0 comments

OpenAI wins $200M contract with US Military for 'warfighting'

https://www.theguardian.com/technology/2025/jun/17/openai-military-contract-warfighting
1•sdenton4•22m ago•0 comments

Anthropic RSS Feeds

https://github.com/Olshansk/rss-feeds
1•Olshansky•28m ago•0 comments

Hacker News Frontpage with Filtering

https://isit.mooo.com/
1•goodburb•31m ago•0 comments

Juneteenth: History

https://en.wikipedia.org/wiki/Juneteenth
2•tomrod•32m ago•0 comments

Elliptic Curves as Art

https://elliptic-curves.art/
7•nill0•37m ago•0 comments

Compliant-Mechanism Mattress for Preventing Pressure Ulcers [video]

https://www.youtube.com/watch?v=KfIB_e_6rzY
1•Klaster_1•38m ago•0 comments

Mysterious carving found in northern Ontario wilderness

https://www.cbc.ca/news/canada/sudbury/archeological-discovery-runestone-northern-ontario-1.7558069
1•geox•40m ago•0 comments

Dev snapshot: Godot 4.5 beta 1

https://godotengine.org/article/dev-snapshot-godot-4-5-beta-1/
1•danbolt•48m ago•0 comments

MAIR: A Benchmark for Evaluating Instructed Retrieval

https://github.com/sunnweiwei/MAIR
1•fzliu•57m ago•0 comments

Tell HN: Claude enthuses about my insights and brilliant ideas; I'm liking it

1•wewewedxfgdf•59m ago•0 comments

Privacy-Preserving Attribution: Level 1 Draft. Private Advertising Technology WG

https://www.w3.org/TR/privacy-preserving-attribution/
1•nuker•1h ago•0 comments

CISA warns of attackers exploiting Linux flaw with PoC exploit

https://www.bleepingcomputer.com/news/security/cisa-warns-of-attackers-exploiting-linux-flaw-with-poc-exploit/
2•TheLocehiliosan•1h ago•0 comments

Mathematicians Hunting Prime Numbers Discover Infinite New Pattern

https://www.scientificamerican.com/article/mathematicians-hunting-prime-numbers-discover-infinite-new-pattern-for/
1•georgecmu•1h ago•0 comments

Dr. Demento Announces Retirement After 55-Year Radio Career

https://sopghreporter.com/2025/06/01/dr-demento-announces-retirement/
28•coloneltcb•1h ago•10 comments

I feel open source has turned into two worlds

https://utcc.utoronto.ca/~cks/space/blog/tech/OpenSourceTwoWorlds
8•sdht0•1h ago•0 comments

Austrian government agrees on plan to allow monitoring of secure messaging

https://www.reuters.com/world/austrian-government-agrees-plan-allow-monitoring-secure-messaging-2025-06-18/
4•gardnr•1h ago•0 comments

Greg Egan's Home Page

https://www.gregegan.net
3•babelfish•1h ago•0 comments

An injectable HIV-prevention drug is highly effective – but expensive

https://www.nbcnews.com/nbc-out/out-health-and-wellness/injectable-hiv-prevention-drug-lencapavir-rcna170778
9•nradov•1h ago•0 comments