frontpage.

Show HN: How LLMs Work – Interactive visual guide based on Karpathy's lecture

https://ynarwal.github.io/how-llms-work/
39•ynarwal__•3h ago
All content is based on Andrej Karpathy's "Intro to Large Language Models" lecture (youtube.com/watch?v=7xTGNNLPyMI). I downloaded the transcript and used Claude Code to generate the entire interactive site from it — a single HTML file. I find it useful to revisit this content from time to time.

Comments

learningToFly33•3h ago
I’ve had a look, and it’s very well explained! If you ever want to expand it, you could also add how embedded data is fed at the very final step for specific tasks, and how it can affect prediction results.
lukeholder•1h ago
Page keeps annoyingly scroll-jumping a few pixels on iOS safari
tbreschi•30m ago
Yeah, that typing effect at the top (expanding the composer) seems to be the issue
gushogg-blake•1h ago
I haven't found an explanation yet that answers a couple of seemingly basic questions about LLMs:

What does the input side of the neural network look like? Is it enough bits to represent N tokens, where N is the context size? How does it handle inputs that are shorter than the context size?

I think embedding is one of the more interesting concepts behind LLMs but most pages treat it as a side note. How does embedding treat tokens that can have vastly different meanings in different contexts - if the word "bank" were a single token, for example, how does embedding account for the fact that it can mean river bank or money bank? Do the elements of the vector point in both directions? And how exactly does embedding interact with the training and inference processes - does inference generate updated embeddings at any point or are they fixed at training time?

(Training vs inference time is another thing explanations are usually frustratingly vague on)

Udo•7m ago
> What does the input side of the neural network look like? Is it enough bits to represent N tokens where N is the context size?

Not quite. The raw text is converted into IDs corresponding to tokens by the tokenizer. Each token ID maps onto a vector via a so-called embedding lookup (I always thought the word choice "embedding" was weird, but it's standard).

This vector is then augmented with further information, such as positional and relational information, which happens inside the model.

The context is not a bitfield of tokens. It's a collection of vectors that are annotated with additional information by the model. The context size of a model is a maximum usable sequence length, not a fixed input array.
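The lookup step described above can be sketched in a few lines. This is a toy illustration, not any real model's code: the vocabulary, dimension, and whitespace "tokenizer" are all made up for the example (real models use learned subword tokenizers, vocabularies of tens of thousands of tokens, and vectors with hundreds to thousands of dimensions).

```python
import random

random.seed(0)

# Hypothetical tiny vocabulary and learned embedding table.
vocab = {"the": 0, "cat": 1, "sat": 2}
dim = 4
embedding_table = [[random.uniform(-1, 1) for _ in range(dim)]
                   for _ in range(len(vocab))]

def embed(text):
    """Tokenize (naively, by whitespace) and look up one vector per token."""
    ids = [vocab[w] for w in text.split()]
    return [embedding_table[i] for i in ids]

vectors = embed("the cat sat")
print(len(vectors), len(vectors[0]))  # number of tokens, vector dimension
```

Note that the input length here is just however many tokens the text produced, up to the maximum sequence length; a shorter input simply yields fewer vectors.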

> if the word "bank" were a single token, for example, how does embedding account for the fact that it can mean river bank or money bank? Do the elements of the vector point in both directions?

The vector mapped to "bank" sorts the token into a very high-dimensional space that points at all kinds of areas. These mappings are unlabeled; they are learned relationships between concepts. So the embedding vector derived from the token "bank" acquires most of its semantic meaning contextually, by the model putting it into relation to its interpretation of the source text. This is part of the relational annotations I mentioned earlier.
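The key point can be made concrete: the static lookup vector for "bank" is identical in every sentence, and only the context-mixing step disambiguates it. In this toy sketch (invented vocabulary and dimensions), a crude average with the neighboring token stands in for what attention layers actually do:

```python
import random

random.seed(1)

vocab = {"river": 0, "bank": 1, "money": 2}
dim = 3
table = [[random.uniform(-1, 1) for _ in range(dim)] for _ in vocab]

def lookup(sentence):
    """Static embedding lookup: one fixed vector per token ID."""
    return [table[vocab[w]] for w in sentence.split()]

a = lookup("river bank")
b = lookup("money bank")

# The static embedding for "bank" is identical in both sentences...
assert a[1] == b[1]

# ...only after mixing in context (here: averaging with the neighboring
# token, a stand-in for attention) do the two occurrences diverge.
ctx_a = [(x + y) / 2 for x, y in zip(a[0], a[1])]
ctx_b = [(x + y) / 2 for x, y in zip(b[0], b[1])]
assert ctx_a != ctx_b
```

This also answers the training-vs-inference part of the question: the embedding table is learned during training and fixed at inference time; what varies per input is the contextualized representation computed from it.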

Barbing•36m ago
Left-hand labels (like "Introduction") can overlap the main text content on the right in the central panel; you may be able to trigger it by reducing the window width.
PetitPrince•3m ago
Have you reread what was produced by Claude Code before publishing? This thing in one of the first paragraphs jumps out:

> you end up with about 44 terabytes — roughly what fits on a single hard drive

No normal person would think that 44 TB is a usual hard drive size (I don't think it even exists? 32 TB seems to be the max at my retailer of choice). I don't think it's wrong per se to use an LLM to produce a cool visualization, but this lack of proofreading doesn't inspire confidence (especially since the 44 TB is displayed prominently in a different color).

S. Korea police arrest man over AI image of runaway wolf that misled authorities

https://www.bbc.com/news/articles/c4gx1n0dl9no
20•giuliomagnifico•48m ago•5 comments

DeepSeek v4

https://api-docs.deepseek.com/
970•impact_sy•7h ago•626 comments

Spinel: Ruby AOT Native Compiler

https://github.com/matz/spinel
40•dluan•1h ago•9 comments

Composition Shouldn't be this Hard

https://www.cambra.dev/blog/announcement/
54•larelli•2h ago•33 comments

Why I Write (1946)

https://www.orwellfoundation.com/the-orwell-foundation/orwell/essays-and-other-works/why-i-write/
168•RyanShook•7h ago•44 comments

Show HN: How LLMs Work – Interactive visual guide based on Karpathy's lecture

https://ynarwal.github.io/how-llms-work/
40•ynarwal__•3h ago•7 comments

An update on recent Claude Code quality reports

https://www.anthropic.com/engineering/april-23-postmortem
744•mfiguiere•16h ago•556 comments

GPT-5.5

https://openai.com/index/introducing-gpt-5-5/
1375•rd•16h ago•904 comments

US special forces soldier arrested after allegedly winning $400k on Maduro raid

https://www.cnn.com/2026/04/23/politics/us-special-forces-soldier-arrested-maduro-raid-trade
256•nkrisc•12h ago•297 comments

Bitwarden CLI compromised in ongoing Checkmarx supply chain campaign

https://socket.dev/blog/bitwarden-cli-compromised
755•tosh•19h ago•365 comments

Show HN: Gova – The declarative GUI framework for Go

https://github.com/NV404/gova
31•aliezsid•3h ago•9 comments

Why Not Venus?

https://mceglowski.substack.com/p/why-not-venus
53•zdw•5h ago•30 comments

Familiarity is the enemy: On why Enterprise systems have failed for 60 years

https://felixbarbalet.com/familiarity-is-the-enemy/
45•adityaathalye•5h ago•18 comments

MeshCore development team splits over trademark dispute and AI-generated code

https://blog.meshcore.io/2026/04/23/the-split
222•wielebny•17h ago•118 comments

Show HN: Tolaria – Open-source macOS app to manage Markdown knowledge bases

https://github.com/refactoringhq/tolaria
205•lucaronin•12h ago•77 comments

Meta tells staff it will cut 10% of jobs

https://www.bloomberg.com/news/articles/2026-04-23/meta-tells-staff-it-will-cut-10-of-jobs-in-pus...
604•Vaslo•15h ago•584 comments

Habitual coffee intake shapes the microbiome, modifies physiology and cognition

https://www.nature.com/articles/s41467-026-71264-8
151•scubakid•6h ago•85 comments

Using the internet like it's 1999

https://joshblais.com/blog/using-the-internet-like-its-1999/
165•joshuablais•13h ago•101 comments

TorchTPU: Running PyTorch Natively on TPUs at Google Scale

https://developers.googleblog.com/torchtpu-running-pytorch-natively-on-tpus-at-google-scale/
137•mji•13h ago•9 comments

UK Biobank health data keeps ending up on GitHub

https://biobank.rocher.lc
130•Cynddl•20h ago•33 comments

Ubuntu 26.04

https://lwn.net/Articles/1069399/
218•lxst•5h ago•124 comments

My phone replaced a brass plug

https://drobinin.com/posts/my-phone-replaced-a-brass-plug/
141•valzevul•17h ago•32 comments

Show HN: Agent Vault – Open-source credential proxy and vault for agents

https://github.com/Infisical/agent-vault
107•dangtony98•1d ago•37 comments

A programmable watch you can actually wear

https://www.hackster.io/news/a-diy-watch-you-can-actually-wear-8f91c2dac682
189•sarusso•3d ago•88 comments

Show HN: Honker – Postgres NOTIFY/LISTEN Semantics for SQLite

https://github.com/russellromney/honker
261•russellthehippo•22h ago•65 comments

Astronomers find the edge of the Milky Way

https://skyandtelescope.org/astronomy-news/astronomers-find-the-edge-of-the-milky-way/
127•bookofjoe•15h ago•27 comments

Alberta startup sells no-tech tractors for half price

https://wheelfront.com/this-alberta-startup-sells-no-tech-tractors-for-half-price/
2197•Kaibeezy•1d ago•746 comments

Used La Marzocco machines are coveted by cafe owners and collectors

https://www.nytimes.com/2026/04/20/dining/la-marzocco-espresso-machine.html
71•mitchbob•3d ago•131 comments

Incident with multiple GitHub services

https://www.githubstatus.com/incidents/myrbk7jvvs6p
248•bwannasek•17h ago•118 comments

Writing a C Compiler, in Zig (2025)

https://ar-ms.me/thoughts/c-compiler-1-zig/
163•tosh•1d ago•46 comments