Show HN: I built a tiny LLM to demystify how language models work

38•armanified•1h ago

Built a ~9M param LLM from scratch to understand how they actually work. Vanilla transformer, 60K synthetic conversations, ~130 lines of PyTorch. Trains in 5 min on a free Colab T4. The fish thinks the meaning of life is food.

Fork it and swap the personality for your own character.

Comments

AndrewKemendo•20m ago

I love these kinds of educational implementations.

I really want to praise the (unintentional?) nod to Nagel: by limiting the model's capabilities to representing a fish, the user can immediately understand the constraints. It can only talk like a fish because it's very simple.

Especially compared to public models, that's a really simple correspondence to grok intuitively (small LLM > only as verbose as a fish, larger LLM > more verbose), so kudos to the author for making that simple and fun.

nullbyte808•3m ago

Adorable! Maybe a personality that speaks in emojis?
