
Show HN: Terminal UI for AWS

https://github.com/huseyinbabal/taws
229•huseyinbabal•8h ago•112 comments

California residents can now request all data brokers delete personal info

https://consumer.drop.privacy.ca.gov/
55•memalign•47m ago•9 comments

Lessons from 14 years at Google

https://addyosmani.com/blog/21-lessons/
954•cdrnsf•13h ago•434 comments

Why does a least squares fit appear to have a bias when applied to simple data?

https://stats.stackexchange.com/questions/674129/why-does-a-linear-least-squares-fit-appear-to-ha...
183•azeemba•8h ago•47 comments

During Helene, I just wanted a plain text website

https://sparkbox.com/foundry/helene_and_mobile_web_performance
65•CqtGLRGcukpy•2h ago•38 comments

The baffling purple honey found only in North Carolina

https://www.bbc.com/travel/article/20250417-the-baffling-purple-honey-found-only-in-north-carolina
28•rmason•4d ago•7 comments

The unbearable joy of sitting alone in a café

https://candost.blog/the-unbearable-joy-of-sitting-alone-in-a-cafe/
494•mooreds•14h ago•297 comments

Street Fighter II, the World Warrier (2021)

https://fabiensanglard.net/sf2_warrier/
333•birdculture•14h ago•57 comments

The year of the 3D printed miniature and other lies we tell ourselves

https://matduggan.com/the-year-of-the-3d-printed-miniature-and-other-lies-we-tell-ourselves/
131•sagacity•6d ago•80 comments

Linear Address Spaces: Unsafe at any speed (2022)

https://queue.acm.org/detail.cfm?id=3534854
129•nithssh•4d ago•89 comments

I charged $18k for a Static HTML Page (2019)

https://idiallo.com/blog/18000-dollars-static-web-page
195•caminanteblanco•2d ago•49 comments

Show HN: An interactive guide to how browsers work

https://howbrowserswork.com/
196•krasun•13h ago•31 comments

Eurostar AI vulnerability: When a chatbot goes off the rails

https://www.pentestpartners.com/security-blog/eurostar-ai-vulnerability-when-a-chatbot-goes-off-t...
107•speckx•7h ago•30 comments

Ripple, a puzzle game about 2nd and 3rd order effects

https://ripplegame.app/
95•mooreds•10h ago•25 comments

The Showa Hundred Year Problem

https://www.dampfkraft.com/showa-100.html
31•polm23•5d ago•10 comments

Millennium Challenge: A corrupted military exercise and its legacy (2015)

https://warontherocks.com/2015/11/millennium-challenge-the-real-story-of-a-corrupted-military-exe...
26•lifeisstillgood•5h ago•22 comments

Web development is fun again

https://ma.ttias.be/web-development-is-fun-again/
327•Mojah•13h ago•405 comments

Six Harmless Bugs Lead to Remote Code Execution

https://mehmetince.net/the-story-of-a-perfect-exploit-chain-six-bugs-that-looked-harmless-until-t...
34•ozirus•3d ago•3 comments

How to translate a ROM: The mysteries of the game cartridge [video]

https://www.youtube.com/watch?v=XDg73E1n5-g
4•zdw•5d ago•0 comments

Agentic Patterns

https://github.com/nibzard/awesome-agentic-patterns
91•PretzelFisch•9h ago•11 comments

The great shift of English prose

https://www.worksinprogress.news/p/english-prose-has-become-much-easier
43•dsubburam•4d ago•30 comments

Bison return to Illinois' Kane County after 200 years

https://phys.org/news/2025-12-bison-illinois-kane-county-years.html
134•bikenaga•5d ago•40 comments

Moiré Explorer

https://play.ertdfgcvb.xyz/#/src/demos/moire_explorer
140•Luc•15h ago•18 comments

Show HN: An LLM-Powered PCB Schematic Checker (Major Update)

https://traceformer.io/
36•wafflesfreak•7h ago•15 comments

Show HN: Hover – IDE style hover documentation on any webpage

https://github.com/Sampsoon/hover
43•sampsonj•10h ago•18 comments

Anti-aging injection regrows knee cartilage and prevents arthritis

https://scitechdaily.com/anti-aging-injection-regrows-knee-cartilage-and-prevents-arthritis/
230•nis0s•13h ago•85 comments

FreeBSD Home NAS, part 3: WireGuard VPN, routing, and Linux peers

https://rtfm.co.ua/en/freebsd-home-nas-part-3-wireguard-vpn-linux-peer-and-routing/
151•todsacerdoti•16h ago•8 comments

Trellis AI (YC W24) is hiring engineers to build AI agents for healthcare access

https://www.ycombinator.com/companies/trellis-ai/jobs/ngvfeaq-member-of-technical-staff-full-time
1•macklinkachorn•11h ago

Claude Code On-the-Go

https://granda.org/en/2026/01/02/claude-code-on-the-go/
240•todsacerdoti•8h ago•167 comments

Using Hinge as a Command and Control Server

https://mattwie.se/hinge-command-control-c2
97•mattwiese•14h ago•46 comments

Neural Networks: Zero to Hero

https://karpathy.ai/zero-to-hero.html
715•suioir•23h ago

Comments

suioir•23h ago
I saw this in a comment [0] and thought it deserved a post.

[0] https://news.ycombinator.com/item?id=46483776

butanyways•17h ago
Maybe we can create one ourselves. I posted this a few days ago here. A decent read: https://zekcrates.quarto.pub/deep-learning-library/
m-hodges•22h ago
A couple of years ago I wrote a tutorial on how to build a neural network in NumPy from scratch.¹

¹ https://matthodges.com/posts/2022-08-06-neural-network-from-...

bariswheel•22h ago
This new? Hasn't the zero-to-hero course been around for a while?
fragmede•21h ago
https://xkcd.com/1053/
sh3rl0ck•20h ago
Is it weird that I now know exactly which xkcd it will be just from the conversational context?

Granted, I'm a bit of a Randall Munroe content addict, but it's become second nature now.

messe•20h ago
You're not alone. At this point I'm starting to recognise some by number as well.
ojo-rojo•19h ago
Ha, you made me think of casually referring to xkcds by number just as we did with RFCs back in the day. "I don't know, the socket states seem to follow RFC 793, but remember it's a 1918 address on the south side of the NAT."

I'm gonna keep a lookout for doing this with xkcds now :)

fragmede•19h ago
There are a few that pop out, but the one that has managed to stick (aside from 1053, which just came up) is 927, for standards, which you can remember as 3^2 for 9 and 3^3 for 27. Or Yoda's age + the 27 club.
jll29•18h ago
Communicating XKCD comic numbers, especially in binary, is a very efficient and energy-preserving way to get a laugh.

A: 10000011101 !

B: ACK. LOL !

amenhotep•15h ago
A newly convicted criminal arrived in prison, and on the first night he was puzzled to hear his fellow inmates yelling numbers to each other. "36!" one would yell, and the rest would chuckle. "19!" went another, to uproarious laughter. "50," remarked a third wryly, which provoked groans and ironic cheers. Eventually his cellmate sat up and cried out "114" and it brought the house down.

In a lull, he asked his cellmate what on earth was going on. The cellmate explained that most of them had been in prison so long that they already knew all the jokes, so to save time they just referred to them by number. "Oh," said the man, "that makes sense. Can I try?"

His cellmate encouraged him to go ahead, so he stood up and went to the bars and shouted as loud as he could "95!"

Absolutely no reaction. His cellmate looked at him and shook his head. "You didn't tell it right."

tzot•11h ago
And some time later, someone shouts “72!” Everyone chuckles except the one in the corner cell, who laughs so loudly and for so long that people think he'll have a heart attack. When he eventually stops laughing, someone yells: “Hey Fred, why did you laugh so much?” “I'd never heard that one!”
improbableinf•20h ago
So you are not part of the lucky 10,000 today…
pylotlight•19h ago
I feel like the same top ~5 are often repeated, so it becomes easy to guess.
Marciplan•18h ago
I think, in the spirit of the xkcd, you were supposed to pretend you have never heard of it
OJFord•17h ago
I know exactly what you mean. It broke my workflow too.
rsanek•21h ago
should have a (2022) label
apetrov•18h ago
It's an ongoing project; the last lecture is about a year old.
mcapodici•21h ago
A bit of a shameless plug: I wrote two articles about this after doing the course a while ago.

https://martincapodici.com/2023/07/15/no-local-gpu-no-proble...

https://martincapodici.com/2023/07/19/modal-com-and-nanogpt-...

Flere-Imsaho•20h ago
I'm not sure how it compares, but another option is the Hugging Face learning portal [0]. I'm doing the Deep RL Course and so far it's pretty straightforward (although when it gets math-heavy I'm going to suffer).

[0] - https://huggingface.co/learn

canpan•17h ago
I found the Karpathy videos very approachable. While I did study CS, I never went deep into ML. My main knowledge of matrices is from graphics development, so only vectors and matrices up to 4x4 in size. But following the videos, starting to learn about backprop and building the tiny GPT was understandable to me.

Karpathy's lessons are great for really grokking the background and underlying basics. They do not go into the many libraries available; the course you link might be more practically applicable.

BinaryMachine•11h ago
Meh, I took a couple of Hugging Face courses; I might not take them again.

The grading system forces you to write specifically to pass their LLM grading system: terrible design. Maybe it's gotten better, but I had to constantly look up how to write the correct answer just to pass their automatic grading system. Not a good way to learn, and a waste of time.

Karpathy videos posted here are GOLD.

webdevver•19h ago
It's fun seeing HN articles with huge upvotes but no comments, similar to when some super esoteric maths gets posted: everyone upvotes out of a common understanding of its genius, but, by virtue of that genius, most of us are not sufficiently cognitively gifted to provide any meaningful commentary.

The Karpathy vids are very cool, but having watched them, for me the takeaway was "I had better leave this for the clever guys". Thankfully digital carpentry and plumbing are still in demand, for now!

larodi•17h ago
Everyone understood overnight what vibe-coding was, but few dared to go through the looking glass and try to grok what the mirror is made of.
apetrov•17h ago
Actually it's quite the opposite: the lectures are as approachable as one could possibly make them, with no fancy math and a walkthrough of "Attention Is All You Need".
pbd•19h ago
What next now, though? I coincidentally finished watching his last vid, training up GPT-2, today :-)
butanyways•17h ago
Maybe creating a simple "PyTorch-like library" and training models using that? No?
kirurik•18h ago
Saving this
esafak•12h ago
You just click 'favorite' and it appears in https://news.ycombinator.com/favorites?id=kirurik
cube2222•18h ago
I went through this series of videos earlier this year.

In the past I’ve gone through many “educational resources” about deep neural networks - books, coursera courses (yeah, that one), a university class, the fastai course - but I don’t work with them at all in my day to day.

This series of videos was by far the best, most “intuition building”, highest signal-to-noise ratio, and least “annoying” content to get through. Could of course be that his way of teaching just clicks with me, but in general - very strong recommend. It’s the primary resource I now recommend when someone wants to get into lower level details of DNNs.

3abiton•15h ago
Karpathy has a great intuitive style, but sometimes it's too dumbed down. If you come from adjacent fields it might drag a bit, but it's always entertaining.
ronbenton•15h ago
>Karpathy has a great intuitive style, but sometimes it's too dumbed down

As someone who has tried some teaching in the past, I'd say it's basically impossible to teach to an audience with a wide array of experience and knowledge. I think you need to define your intended audience as narrowly as possible, teach them, and just accept that more knowledgeable folk may be bored and less knowledgeable folk may be lost.

miki123211•6h ago
I think this is where LLM-assisted education is going to shine.

An LLM is the perfect tool to fill the little gaps that you need to fill to understand that one explanation that's almost at your level, but not quite.

mlmonkey•5h ago
When I was an instructor for courses like "Intro to Programming", this was definitely the case. The students ranged from "have never programmed before" to "I've been writing games in my spare time", but because it was a prerequisite for other courses, they all had to do it.

Teaching the class was a pain in the ass! What seemed to work was to do the intro stuff, and periodically throw a bone to the smartasses. Once I had them on my side, it became smooth sailing.

lazarus01•17h ago
I like Karpathy; we come from the same lineage, and I am very proud of him for what he's accomplished. He's a very impressive guy.

As for deep learning, building deep learning architectures is one of my greatest joys in finding insights from perceptual data. Right now, I'm working on spatiotemporal data modeling to build prediction systems for urban planning, to improve public transportation systems. I build ML infrastructure too, and I plan to release an app that deploys the model in the wild within event streams of transit systems.

It took me a month to master the basics, and I've spent a lot of time with online learning, with Deeplearning.ai and skills.google. Deeplearning.ai is OK, but I felt the concepts were a bit dated. The ML path at skills.google is excellent and gives a practical understanding of ML infrastructure, optimization, and how to work with GPUs and TPUs (15x faster than GPUs).

But the best source of learning for me personally, and the one that makes me a confident practitioner, is the book by Francois Chollet, the creator of Keras. His book, "Deep Learning with Python", really removed any ambiguity I had about deep learning and AI in general. Francois is extremely generous in how he explains how deep learning works, over the backdrop of 70 years of deep learning research. He keeps it updated, and the third edition was released in September 2025; it's available online for free if you don't want to pay for it. He gives you the recipe for building GPT and diffusion models, but starts from the ground-floor basics of tensor operations and computation graphs. I would go through it again from start to finish; it is so well written and enjoyable to follow.

The most important lesson he discusses is that "deep learning is more of an art than a science". Getting something working takes a good amount of practice, and why things work can't always be explained.

He includes notebooks with detailed code examples, with TensorFlow, PyTorch and JAX as backends.

Deep learning is a great skill to have. After reading this book, I can recreate scientific abstracts and deploy the models into production systems. I am very grateful to have these skills and I encourage anyone with deep curiosity like me to go all in on deep learning.

nemil_zola•16h ago
The project you mentioned you are working sounds interesting. Do you have more to share ?

I’m curious how ML/AI is leveraged in the domain of public transport. And what can it offer when compared to agent based models.

lazarus01•16h ago
The project I'm working on emulates a scientific abstract. I'm not a scientist by any means, but I am adapting an abstract to the public transit system in NYC. I will publish the project on my website when it's done; I think it's a few weeks away. I built the dataset, and I'm now doing experimental model training. If I can get acceptable accuracy, I will deploy it in a production system and build a UI.

Here is the scientific abstract that inspired me to start building this system -> https://arxiv.org/html/2510.03121

I am unfamiliar with agent-based models, so sorry, I can't offer any personal insight there, but I ran your question through Gemini and here is the AI response:

Based on the scientific abstract of the paper "Real Time Headway Predictions in Urban Rail Systems and Implications for Service Control: A Deep Learning Approach" (arXiv:2510.03121), agent-based models (ABMs) and deep learning (DL) approaches compare as follows:

1. Computational Efficiency and Real-Time Application

* Deep Learning (DL): The paper proposes a ConvLSTM (Convolutional Long Short-Term Memory) framework designed for high computational efficiency. It is specifically intended to provide real-time predictions, enabling dispatchers to evaluate operational decisions instantly.

* Agent-Based Models (ABM): While the paper does not use ABMs, it contrasts its DL approach with traditional "computationally intensive simulations", a category that includes microscopic agent-based models. ABMs often require significant processing time to simulate individual train and passenger interactions, making them less suitable for immediate, real-time dispatching decisions during operations.

2. Modeling Methodology

* Deep Learning (DL): The approach is data-driven, learning spatiotemporal patterns and the propagation of train headways from historical datasets. It captures spatial dependencies (between stations) and temporal evolution (over time) through convolutional filters and memory states, without needing explicit rules for train behavior.

* Agent-Based Models (ABM): These are typically rule-based and bottom-up, modeling the movement of each train "agent" based on signaling rules, spacing, and train-following logic. While highly detailed, they require precise calibration of individual agent parameters.

3. Handling Operational Control

* Deep Learning (DL): A key innovation in this paper is the direct integration of target terminal headways (dispatcher decisions) as inputs. This allows the model to predict the downstream impacts of a specific control action (like holding a train) by processing it as a data feature.

* Agent-Based Models (ABM): To evaluate a dispatcher's decision in an ABM, the entire simulation must typically be re-run with new parameters for the affected agents, which is time-consuming and difficult to scale across an entire metro line in real time.

4. Use Case Scenarios

* Deep Learning (DL): Optimized for proactive operational control and real-time decision-making. It is most effective when large amounts of historical tracking data are available to train the spatiotemporal relationships.

* Agent-Based Models (ABM): Often preferred for off-line evaluation of complex infrastructure changes, bottleneck mitigation strategies, or microscopic safety analyses, where the "why" behind individual train behavior is more important than prediction speed.

zingar•17h ago
I have lots of non-AI software experience but nothing with AI (apart from using LLMs like everyone else). Also I did an introductory university course in AI 20 years ago that I’ve completely forgotten.

Where do I get to if I go through this material?

Enough to build… what? Or contribute on… ? Enough knowledge to have useful conversations on …? Enough knowledge to understand where to … is useful and why?

Where are the limits, what is it that the AI researchers have that this wouldn’t give?

p1esk•10h ago
Strange question. If you don’t know why you need this, you probably don’t. It will be the same as with the introductory AI course you did 20 years ago.
HarHarVeryFunny•8h ago
Well, no ... For a start, any "AI" course 20 years ago probably wouldn't have even mentioned neural nets, and certainly not as a mainstream technique.

A 20-year-old "AI" curriculum would have looked more like the 3rd edition of Russell & Norvig's "Artificial Intelligence - A Modern Approach".

https://github.com/yanshengjia/ml-road/blob/master/resources...

Karpathy's videos aren't an AI course (except in the modern sense of AI=LLMs), or a machine learning course, or even a neural network course for that matter (despite the title) - it's really just "From Zero to LLMs".

ruraljuror•5h ago
I think they meant the result—not the content—would be the same.
eps•5h ago
Neural nets were taught at my uni in the late 90s. They were presented as the AI technique, which was, however, computationally infeasible at the time. Moreover, it was clearly stated that all the supporting ideas had been developed and researched 20 years prior, and that the field had basically stagnated because the hardware wasn't there.
CamperBob2•8m ago
Anyone who watches the videos and follows along will indeed come up to speed on the basics of neural nets, at least with respect to MLPs. It's an excellent introduction.
baxuz•17h ago
A bit of a tangential topic — what would you recommend to someone who wants to get into computer vision and 3D (NeRFs, photogrammetry, 3DGS etc.)?

For someone who has a middling amount of math knowledge, what would you recommend?

I went to uni 15 years ago, but only had "proper" math in the first 2 semesters, let's say something akin to Calculus 1 and Linear Algebra 1. I hated math back then, plus I had horrible habits.

a_r41•15h ago
I've been working in the novel view synthesis domain since 2019 and I would recommend starting with "nerfstudio". The documentation does a good job of explaining all the components involved (from dataset to final learned representation), the code is readable and it's relatively simple to set up and run. I think it's a nice place to start from before diving deeper into the latest that is going on in the 3D space.
jaccola•14h ago
For learning 3DGS (and its derivatives) I would recommend grabbing the original 3D Gaussian Splatting paper + repository, going through it, and using an LLM to ask many questions.

LLMs aren't that great at explaining concepts a lot of the time, so when you get stuck, google around and learn that subtopic. E.g. you will come across "Jacobian", which you may or may not have seen before, but you can search YouTube and find a great Khan Academy/3b1b collab explaining it.

Get the code running also, play around with parameters, try to implement the whole thing from scratch, making sure you intuitively understand each part with the above method.

Obviously time scales vary for everyone. That having been said: I'd guess that if you have a decent technical background, are OK feeling uncomfortable with the maths for a while (it is all understandable after a bit of pain), and are willing to keep plugging away for a few hours a day, you will have a very decent understanding in 6 months, and probably be "cutting edge" in a year or so (obviously the learning never ends, it is an active area of research after all!)

chronicler•15h ago
I don't even have enough knowledge to grasp the first video. Is there a list of knowledge requirements to look at?
jsight•14h ago
3blue1brown videos are great if you want to go deep on the math behind it.

If you are struggling with the neural network mechanics themselves, though, I'd recommend just skimming them once and then going back for a second watch later. The high-level overview will make some of the early setup work much clearer on a second viewing.

HarHarVeryFunny•10h ago
IMO that's a bit of a strange video for Karpathy to start with, perhaps even to include at all.

Let me explain why ...

Neural nets are trained by giving them lots of example inputs and outputs (the training data) and incrementally tweaking their initially random weights until they do better and better at matching these desired outputs. The way this is done is by expressing the difference between the desired and current (during training) outputs as an error function, parameterized by the weights, and finding the values of the weights that correspond to the minimum value of this error function (minimum errors = fully trained network!).

The way the minimum of the error function is found is simply by following its gradient (slope) downhill until you can't go down any more, which is hopefully the global minimum. This requires that you have the gradient (derivative) of the error function available so you know what direction (+/-) to tweak each of the weights to go in the downhill error direction, which will bring us to Karpathy's video ...

Neural nets are mostly built out of lego-like building blocks - individual functions (sometimes called nodes, or layers) that are chained/connected together to incrementally transform the neural network's input into its output. You can then consider the entire neural net as a single giant function outputs = f(inputs, weights), and from this network function you can create the error function needed to train it.

One way to create the derivative of the network/error function is to use the "chain rule" of calculus to derive the combined derivative of all these chained functions from their own individual pre-defined derivative functions. This is the way that most machine learning frameworks, such as TensorFlow and the original Torch (pre-PyTorch), worked. If you were using a machine learning framework like this, then you would not need Karpathy's video to understand how it is working under the hood (if indeed that is something you care about at all!).

The alternative, PyTorch, way of deriving the derivative of the neural network function is more flexible, and doesn't require you to build the network only out of nodes/layers that you already have derivative functions for. PyTorch lets you use regular Python code to define your neural network function, recording this code as it runs to capture what it does as the definition of the network function. Given this dynamically created network function, PyTorch (and other similar machine learning frameworks) then uses a built-in "autograd" (automatic gradient) capability to create the derivative (gradient) of your network function automatically, without someone having had to do that manually, as was the case for each of the lego building blocks in the old approach.

What that first video of Karpathy's is explaining is how this "autograd" capability works, which would help you build your own machine learning framework if you wanted to, or at least understand how PyTorch is working under the hood to create the network/error function derivative for you, that it will be using to train the weights. I'm sure many PyTorch users happily use it without caring how it's working under the hood, just as most developers happily use compilers without caring about exactly how they are working. If all you care about is understanding generally what PyTorch is doing under the hood, then this post may be enough!
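To make the autograd idea concrete, here is a minimal sketch of a scalar autograd engine in the spirit of Karpathy's micrograd (a sketch of the general technique, not his actual code; the Value class here is my own illustration). Each operation records its parents and a local chain-rule step, and backward() replays those steps in reverse topological order:

    # Minimal scalar autograd sketch (illustrative; in the spirit of micrograd, not Karpathy's code).
    class Value:
        def __init__(self, data, parents=()):
            self.data = data               # the scalar value
            self.grad = 0.0                # d(output)/d(self), filled in by backward()
            self._parents = parents        # nodes this value was computed from
            self._backward = lambda: None  # local chain-rule step, set by each op

        def __add__(self, other):
            out = Value(self.data + other.data, (self, other))
            def _backward():
                self.grad += out.grad               # d(a+b)/da = 1
                other.grad += out.grad              # d(a+b)/db = 1
            out._backward = _backward
            return out

        def __mul__(self, other):
            out = Value(self.data * other.data, (self, other))
            def _backward():
                self.grad += other.data * out.grad  # d(a*b)/da = b
                other.grad += self.data * out.grad  # d(a*b)/db = a
            out._backward = _backward
            return out

        def backward(self):
            # Topologically sort the recorded graph, then apply the chain rule in reverse.
            topo, visited = [], set()
            def build(v):
                if v not in visited:
                    visited.add(v)
                    for p in v._parents:
                        build(p)
                    topo.append(v)
            build(self)
            self.grad = 1.0
            for v in reversed(topo):
                v._backward()

    # Usage: y = w * x + b, then read off the gradients dy/dw, dy/dx, dy/db.
    x, w, b = Value(3.0), Value(2.0), Value(1.0)
    y = w * x + b
    y.backward()
    print(y.data, w.grad, x.grad, b.grad)  # 7.0 3.0 2.0 1.0

A real engine adds more operations (tanh, exp, pow, ...) and works on tensors rather than scalars, but the record-then-replay structure is the same.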

For an introduction to machine learning, including neural networks, that assumes no prior knowledge other than hopefully being able to program a bit in some language, I'd recommend Andrew Ng's Introduction to ML courses on Coursera. He's modernized this course over the years, so I can't speak for the latest version, but he is a great educator and I trust that the current version is just as good as his old one that was my intro to ML (building neural nets just using MATLAB rather than using any framework!).

nobodyistaken•15h ago
This is great, but if I'm starting ML from scratch, what would you recommend? I'm coming from a webdev background and have used LLMs but nothing about ML, might even need the refresher on math, I think.
lazarus01•15h ago
https://deeplearningwithpython.io/
nobodyistaken•14h ago
Is it wise to start with deep learning without knowing machine learning?
lazarus01•13h ago
That's a great question. Machine learning is the overarching space, and deep learning is a subspace of it. So if you grasp some basic concepts of machine learning, you can apply them to deep learning.

All the exciting innovation over the past 13 years comes from deep learning, mainly in working with images and natural language.

Machine learning is good for tabular data problems, particularly with decision trees, which work well at reducing uncertainty for business outcomes, like sales and marketing as one example.

Machine Learning Basics:

Linear regression: Y = M*x + B (predicts a future value)

Classification (logistic regression): Y = 1 / (1 + e^-(b0 + b1*x)) (predicts the probability of a class or future event)

There is a common learning process between the two called gradient descent. It starts with the loss function, which measures the error between predictions and ground truth; you backpropagate the errors as a feedback signal to update the learned weights, which are the parameters of your ML model, a more meaningful representation of the dataset you train on.
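To make that concrete, here is a minimal sketch of the gradient-descent loop for the linear regression case above, in plain NumPy with made-up toy data (illustrative only, not from any of the courses mentioned):

    import numpy as np

    # Toy data roughly following y = 3x + 2, plus noise.
    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, size=100)
    y = 3 * x + 2 + 0.1 * rng.normal(size=100)

    m, b = 0.0, 0.0  # initial parameters of Y = M*x + B
    lr = 0.1         # learning rate

    for _ in range(500):
        y_pred = m * x + b               # forward pass: predictions
        error = y_pred - y
        loss = (error ** 2).mean()       # loss function: mean squared error
        grad_m = 2 * (error * x).mean()  # gradient of the loss w.r.t. m
        grad_b = 2 * error.mean()        # gradient of the loss w.r.t. b
        m -= lr * grad_m                 # update weights downhill
        b -= lr * grad_b

    print(m, b)  # should land close to 3 and 2

Each iteration computes predictions, measures the error with the loss function, and feeds the gradients back to update the weights; deep learning scales this same loop up to millions of parameters.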

Deep learning is more appropriate for perception problems, like vision, language and time sequences. It gets more complex: you are dealing with significantly more parameters, in the millions, organized in hierarchical layer representations.

There are different layers for different types of representation learning: convolutions for images, RNNs for sequence-to-sequence learning, and many more; these layers are the basis of all deep learning models.

So there is a small conceptual overlap, but I would say deep learning has a wider variety of interesting applications and is much more challenging to learn, though not impossible by any stretch.

There is no harm in giving it a try and diving in. If you get lost and drown in complexity, start with machine learning. It took me 3 years to grasp, so it's a marathon, not a sprint.

Hope this helps

meken•15h ago
Has anyone gone through cs231n and this as well?

I went through the former and it was one of the best classes I’ve ever taken. But I’ve been procrastinating on going through this because it seems like there’s a lot of overlap and the benefit seems marginal (I guess transformers are covered here?).

misiti3780•13h ago
I just finished this series and found it very useful. Especially the back-propagation lectures.
lfliosdjf•12h ago
I hope Karpathy's Starfleet Academy becomes a huge success.
ed4bb9fb7c•11h ago
Is there a text tutorial of this approach to building a NN from scratch? As a dad I simply don't have a chance to watch this. Also, maybe something for the more math-inclined? (MS in math.) "Deep Learning with Python", which is recommended in other comments, is way too basic, slow and hand-wavy IMO.
npalli•11h ago
This is a good resource; however, for about 99.99% of people, you are most likely to just use a foundation model like ChatGPT, Claude, Gemini, etc., so this knowledge/training will get you neither here nor there. I would suggest you look into another of Karpathy's videos -- Deep Dive into LLMs like ChatGPT.

https://www.youtube.com/watch?v=7xTGNNLPyMI

kamranjon•10h ago
"Prerequisites: ... intro-level math (e.g. derivative, gaussian)"

Anyone got recommendations for learning resources for this type of math? Realizing now that I might be a bit behind on my intro-level math.

chandureddyvari•10h ago
The 3b1b YouTube channel's calculus & linear algebra series

https://explained.ai/matrix-calculus/

Khan Academy's Multivariable Calculus course, by Grant Sanderson (of 3b1b fame)

nickpsecurity•10h ago
Coursera and Udemy have math-for-machine-learning courses. Udemy is self-paced; if you need to, you can pause to learn an unforeseen prerequisite.

I bought Jon Krohn's Mathematical Foundations and Krista King's Statistics and Probability.

cjamsonhn•10h ago
Highly recommend this as well. It does a great job of helping you build intuition for why things like gradient descent and normalization work. It also gets into the weeds on training dynamics and how to ensure they are behaving properly.
m3kw9•8h ago
Does learning this still matter now?
shwaj•8h ago
Matter to who? If you want to deeply understand how this technology works, this is still relevant. If you want to vibe code, maybe not.
lazarus01•6h ago
Yes, the current technology cannot replace an engineer.

The easiest way to understand why is by understanding natural language. A natural language like English is very messy and doesn't follow formal rules. It's also not specific enough to provide instructions to a computer; that's why code was created.

The AI is incredibly dumb when it comes to complex tasks with long-range contexts. It needs an engineer who understands how to write and execute code to give it precise instructions, or it is useless.

Natural language processing is so complex that, although it started around the end of World War Two, we are just now seeing innovation in AI where we can mimic humans, where the AI can do certain things faster than humans. But thinking is not one of them.

CamperBob2•4m ago
LOL. Figuring out how to solve IMO-level math problems without "thinking" would be even more impressive than thinking itself. Now there's a parrot I'd buy.