Show HN: I built a tiny LLM to demystify how language models work

https://github.com/arman-bd/guppylm
121•armanified•3h ago
Built a ~9M param LLM from scratch to understand how they actually work. Vanilla transformer, 60K synthetic conversations, ~130 lines of PyTorch. Trains in 5 min on a free Colab T4. The fish thinks the meaning of life is food.

Fork it and swap the personality for your own character.
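
For a rough sense of where a figure like ~9M parameters can come from, here is a back-of-the-envelope count for a vanilla GPT-style decoder. This is a sketch only: the vocabulary size, context length, width, and depth below are hypothetical values chosen to land near 9M, not the settings used in the repo, and the count assumes biases on every linear layer and an output head tied to the token embedding.

```python
def transformer_param_count(vocab_size, context_len, d_model, n_layers):
    """Count parameters of a GPT-style decoder with a tied output head.

    Assumes learned positional embeddings, biases on all linear layers,
    a 4x MLP expansion, and two LayerNorms per block plus a final one.
    """
    token_emb = vocab_size * d_model
    pos_emb = context_len * d_model

    # Q, K, V and output projections: four d_model x d_model matrices with biases.
    attn = 4 * (d_model * d_model + d_model)
    # MLP: d_model -> 4*d_model -> d_model, each with a bias vector.
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)
    # Two LayerNorms per block, each with a scale and a shift vector.
    norms = 2 * 2 * d_model

    per_layer = attn + mlp + norms
    final_norm = 2 * d_model
    return token_emb + pos_emb + n_layers * per_layer + final_norm


# Hypothetical hyperparameters, picked only to illustrate the arithmetic.
total = transformer_param_count(vocab_size=4096, context_len=128, d_model=384, n_layers=4)
print(f"{total:,} parameters")  # prints "8,720,640 parameters"
```

Most of the budget sits in the transformer blocks (about 1.77M each at this width), so depth and width move the total far more than the vocabulary does.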

Comments

AndrewKemendo•1h ago
I love these kinds of educational implementations.

I want to really praise the (unintentional?) nod to Nagel: by limiting capabilities to the representation of a fish, the user is immediately able to understand the constraints. It can only talk like a fish because it’s very simple.

Especially compared to public models, that's a really simple correspondence to grok intuitively (small LLM > only as verbose as a fish, larger LLM > more verbose), so kudos to the author for making that simple and fun.

dvt•1h ago
> the user is immediately able to understand the constraints

Nagel's point was quite literally the opposite[1] of this, though. We can't understand what it must "be like to be a bat" because their mental model is so fundamentally different than ours. So using all the human language tokens in the world can't get us to truly understand what it's like to be a bat, or a guppy, or whatever. In fact, Nagel's point is arguably even stronger: there's no possible mental mapping between the experience of a bat and the experience of a human.

[1] https://www.sas.upenn.edu/~cavitch/pdf-library/Nagel_Bat.pdf

AndrewKemendo•58m ago
Different argument

I’m not going to argue other than to say that you need to view the point from a third-party perspective evaluating “fish” vs “more verbose thing,” such that the composition is the determinant of the complexity of interaction (which has a unique qualia per Nagel).

Hence why it’s an “unintentional nod,” not an instantiation.

nullbyte808•1h ago
Adorable! Maybe a personality that speaks in emojis?

SilentM68•1h ago
Would have been funny if it were called "DORY", given the fish's memory recall issues vs. LLMs' similar recall issues :)

ordinarily•39m ago
It's genuinely a great introduction to LLMs. I built my own a while ago based on Milton's Paradise Lost: https://www.wvrk.org/works/milton

cbdevidal•23m ago
> you're my favorite big shape. my mouth are happy when you're here.

Laughed loudly :-D

xantronix•19m ago
I fucking hate LLMs as a matter of principle.

However.

I love this. It's so tiny. And cute. It's just a little guy.

gnarlouse•4m ago
I... wow, you made an LLM that can actually tell jokes?

martmulx•3m ago
How much training data did you end up needing for the fish personality to feel coherent? Curious what the minimum viable dataset looks like for something like this.

Gemma 4 on iPhone

https://apps.apple.com/nl/app/google-ai-edge-gallery/id6749645337
470•janandonly•8h ago•127 comments

Show HN: YouTube search barely works, I made a search form with advanced filters

https://playlists.at/youtube/search/
99•nevernothing•3h ago•72 comments

LÖVE: 2D Game Framework for Lua

https://github.com/love2d/love
229•cl3misch•1d ago•91 comments

Copilot is 'for entertainment purposes only', per Microsoft's terms of use

https://techcrunch.com/2026/04/05/copilot-is-for-entertainment-purposes-only-according-to-microso...
37•airstrike•3h ago•11 comments

Microsoft hasn't had a coherent GUI strategy since Petzold

https://www.jsnover.com/blog/2026/03/13/microsoft-hasnt-had-a-coherent-gui-strategy-since-petzold/
257•naves•10h ago•142 comments

Artemis II crew see first glimpse of far side of Moon [video]

https://www.bbc.com/news/videos/ce3d5gkd2geo
437•mooreds•13h ago•335 comments

Eight years of wanting, three months of building with AI

https://lalitm.com/post/building-syntaqlite-ai/
667•brilee•14h ago•208 comments

Endian wars and anti-portability: this again?

https://dalmatian.life/2026/04/03/endian-wars-and-anti-portability-this-again/
23•awilfox•1d ago•22 comments

Show HN: Gemma Gem – AI model embedded in a browser – no API keys, no cloud

https://github.com/kessler/gemma-gem
23•ikessler•3h ago•1 comments

Running Gemma 4 locally with LM Studio's new headless CLI and Claude Code

https://ai.georgeliu.com/p/running-google-gemma-4-locally-with
212•vbtechguy•10h ago•55 comments

Employers use your personal data to figure out the lowest salary you'll accept

https://www.marketwatch.com/story/employers-are-using-your-personal-data-to-figure-out-the-lowest...
109•thisislife2•3h ago•42 comments

Sheets Spreadsheets in Your Terminal

https://github.com/maaslalani/sheets
17•_____k•1d ago•4 comments

In Japan, the robot isn't coming for your job; it's filling the one nobody wants

https://techcrunch.com/2026/04/05/japan-is-proving-experimental-physical-ai-is-ready-for-the-real...
137•rbanffy•5h ago•151 comments

Scientists mapped all the nerves of the clitoris for the first time

https://www.livescience.com/health/anatomy/scientists-mapped-all-the-nerves-of-the-clitoris-for-t...
16•01-_-•1d ago•2 comments

Show HN: Modo – I built an open-source alternative to Kiro, Cursor, and Windsurf

https://github.com/mohshomis/modo
18•mohshomis•3h ago•2 comments

Why Switzerland has 25 Gbit internet and America doesn't

https://sschueller.github.io/posts/the-free-market-lie/
283•sschueller•9h ago•227 comments

Music for Programming

https://musicforprogramming.net
120•merusame•9h ago•51 comments

OpenAI's fall from grace as investors race to Anthropic

https://www.latimes.com/business/story/2026-04-01/openais-shocking-fall-from-grace-as-investors-r...
108•1vuio0pswjnm7•4h ago•62 comments

The Mechanics of Steins Gate (2023) [pdf]

https://github.com/Votuko/steins-gate-mechanics/blob/main/The%20Mechanics%20of%20Steins%20Gate%20...
58•Ariarule•5h ago•9 comments

Recall – local multimodal semantic search for your files

https://github.com/aayu22809/Recall
12•patel_aayushya•3h ago•8 comments

Computational Physics (2nd Edition) (2025)

https://websites.umich.edu/~mejn/cp2/
111•teleforce•11h ago•17 comments

A tail-call interpreter in (nightly) Rust

https://www.mattkeeter.com/blog/2026-04-05-tailcall/
135•g0xA52A2A•12h ago•22 comments

Wavelets on Graphs via Spectral Graph Theory (2009)

https://arxiv.org/abs/0912.3848
30•dedalus•5d ago•2 comments

We replaced Node.js with Bun for 5x throughput

https://trigger.dev/blog/firebun
6•pier25•1h ago•0 comments

LLMs can't justify their answers–this CLI forces them to

https://wheat.grainulation.com/
5•volatilityfund•2h ago•2 comments

Nanocode: The best Claude Code that $200 can buy in pure JAX on TPUs

https://github.com/salmanmohammadi/nanocode/discussions/1
170•desideratum•13h ago•24 comments

Caveman: Why use many token when few token do trick

https://github.com/JuliusBrussee/caveman
718•tosh•18h ago•313 comments

Stamp It All Programs Must Report Their Version – Michael Stapelberg

https://michael.stapelberg.ch/posts/2026-04-05-stamp-it-all-programs-must-report-their-version/
4•gurjeet•1h ago•0 comments

Friendica – A Decentralized Social Network

https://friendi.ca/
138•janandonly•16h ago•50 comments