Show HN: I built a tiny LLM to demystify how language models work

https://github.com/arman-bd/guppylm
100•armanified•3h ago
Built a ~9M param LLM from scratch to understand how they actually work. Vanilla transformer, 60K synthetic conversations, ~130 lines of PyTorch. Trains in 5 min on a free Colab T4. The fish thinks the meaning of life is food.

Fork it and swap the personality for your own character.
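
For a rough sense of where a figure like ~9M parameters can come from, here is a minimal back-of-the-envelope sketch. It assumes a GPT-2-style decoder block (four attention projections with biases, a 4x feed-forward layer, two layer norms per block), learned position embeddings, and an output head tied to the token embedding. The function name and every hyperparameter value below are hypothetical, chosen for illustration only, and are not taken from the guppylm repository.

```python
def transformer_param_count(vocab_size, d_model, n_layers, context_len, ff_mult=4):
    """Count trainable parameters of a vanilla decoder-only transformer.

    Assumes biases on all linear layers, learned position embeddings,
    and an output head that shares weights with the token embedding.
    """
    d_ff = ff_mult * d_model

    # Q, K, V and output projections: four (d_model x d_model) matrices plus biases.
    attention = 4 * (d_model * d_model + d_model)
    # Feed-forward: expand to d_ff, then project back to d_model, both with biases.
    mlp = (d_model * d_ff + d_ff) + (d_ff * d_model + d_model)
    # Two layer norms per block, each with a scale and a shift vector.
    layer_norms = 2 * (2 * d_model)
    per_layer = attention + mlp + layer_norms

    token_embedding = vocab_size * d_model
    position_embedding = context_len * d_model
    final_norm = 2 * d_model

    return n_layers * per_layer + token_embedding + position_embedding + final_norm


# Hypothetical configuration, picked only to land near the ~9M mark.
total = transformer_param_count(vocab_size=4096, d_model=384, n_layers=4, context_len=128)
print(f"{total:,} parameters")  # prints "8,720,640 parameters"
```

With these assumed values, the transformer blocks account for about 7.1M of the total and the token embedding for about 1.6M, so at this scale the vocabulary size is a noticeable share of the parameter budget.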

Comments

AndrewKemendo•1h ago
I love these kinds of educational implementations.

I really want to praise the (unintentional?) nod to Nagel: by limiting capabilities to the representation of a fish, the user is immediately able to understand the constraints. It can only talk like a fish because it’s very simple.

Especially compared to public models, that’s a really simple correspondence to grok intuitively (small LLM > only as verbose as a fish, larger LLM > more verbose), so kudos to the author for making that simple and fun.

dvt•58m ago
> the user is immediately able to understand the constraints

Nagel's point was quite literally the opposite[1] of this, though. We can't understand what it must "be like to be a bat" because their mental model is so fundamentally different than ours. So using all the human language tokens in the world can't get us to truly understand what it's like to be a bat, or a guppy, or whatever. In fact, Nagel's point is arguably even stronger: there's no possible mental mapping between the experience of a bat and the experience of a human.

[1] https://www.sas.upenn.edu/~cavitch/pdf-library/Nagel_Bat.pdf

AndrewKemendo•45m ago
Different argument.

I’m not going to argue other than to say that you need to view the point from a third-party perspective evaluating “fish” vs “more verbose thing,” such that the composition is the determinant of the complexity of interaction (which has a unique qualia per Nagel).

Hence why it’s an “unintentional nod,” not an instantiation.

nullbyte808•1h ago
Adorable! Maybe a personality that speaks in emojis?

SilentM68•1h ago
Would have been funny if it were called "DORY", given the fish's memory recall issues vs LLMs' similar recall issues :)

ordinarily•26m ago
It's genuinely a great introduction to LLMs. I built my own a while ago based on Milton's Paradise Lost: https://www.wvrk.org/works/milton

cbdevidal•10m ago
> you're my favorite big shape. my mouth are happy when you're here.

Laughed loudly :-D

xantronix•6m ago
I fucking hate LLMs as a matter of principle.

However.

I love this. It's so tiny. And cute. It's just a little guy.

Show HN: YouTube search barely works, I made a search form with advanced filters

https://playlists.at/youtube/search/
90•nevernothing•3h ago•64 comments

Show HN: Gemma Gem – AI model embedded in a browser – no API keys, no cloud

https://github.com/kessler/gemma-gem
18•ikessler•3h ago•1 comments

Show HN: Modo – I built an open-source alternative to Kiro, Cursor, and Windsurf

https://github.com/mohshomis/modo
17•mohshomis•3h ago•2 comments

Show HN: Mdarena – Benchmark your Claude.md against your own PRs

https://github.com/HudsonGri/mdarena
13•hudsongr•3h ago•1 comments

Show HN: jsoncompat – a library to detect/fuzz breaking changes in JSON schemas

https://jsoncompat.com/
3•rogaos•1h ago•0 comments

Show HN: A game where you build a GPU

https://jaso1024.com/mvidia/
900•Jaso1024•1d ago•179 comments

Show HN: OsintRadar – Curated directory for osint tools

https://osintradar.com/
67•lexalizer•21h ago•6 comments

Show HN: Contrapunk – Real-time counterpoint harmony from guitar input

https://contrapunk.com/
111•waveywaves•1d ago•47 comments

Show HN: M. C. Escher spiral in WebGL inspired by 3Blue1Brown

https://static.laszlokorte.de/escher/
156•laszlokorte•1d ago•24 comments

Show HN: Grug – Claude Code Skill Inspired by the Grug Brained Developer

https://github.com/replete/grug-skill
4•replete•5h ago•0 comments

Show HN: Runfra – Decentralized GPU cluster designed for bulk generation

https://runfra.com/playground
4•spencer9714•5h ago•1 comments

Show HN: I built a small app for FSI German Course

https://detawk.com/
47•syedmsawaid•3d ago•14 comments

Show HN: Where Is Artemis?

https://www.whereisartemis.com/
6•larsmoa•5h ago•0 comments

Show HN: ACE – A dynamic benchmark measuring the cost to break AI agents

https://fabraix.com/blog/adversarial-cost-to-exploit
7•zachdotai•5h ago•3 comments

Show HN: Orcastrate – Sync GitHub Actions workflows across repos via templates

https://github.com/michidk/orcastrate
4•michidk•6h ago•0 comments

Show HN: Genetic algorithm engine that evolves trading strategies

https://github.com/NeuZhou/finclaw
3•neuzhou•7h ago•0 comments

Show HN: I made open source, zero power PCB hackathon badges

https://github.com/KaiPereira/Overglade-Badges
153•kaipereira•1d ago•15 comments

Show HN: Arbory – Native iOS dashboard and widgets for Plausible Analytics

https://arbory.io/
3•jorijn•7h ago•0 comments

Show HN: sllm – Split a GPU node with other developers, unlimited tokens

https://sllm.cloud
179•jrandolf•1d ago•89 comments

Show HN: I built a frontpage for personal blogs

https://text.blogosphere.app/
768•ramkarthikk•2d ago•193 comments

Show HN: Ragot – a front end runtime built around lifecycle and ownership

https://github.com/BleedingXiko/RAGOT
2•BleedingXiko•8h ago•1 comments

Show HN: Apfel – The free AI already on your Mac

https://apfel.franzai.com
726•franze•2d ago•150 comments

Show HN: Fabro – open-source dark software factory

https://github.com/fabro-sh/fabro
3•brynary•8h ago•0 comments

Show HN: Sigil – A new programming language for AI agents

4•inerte•8h ago•0 comments

Show HN: I built a tool to show how much ARR you lose to FX fees

https://fixmyfx.com
4•TaniaBell_PD•9h ago•1 comments

Show HN: A Dad Joke Website

https://joshkurz.net/
7•joshkurz•9h ago•0 comments

Show HN: TurboQuant-WASM – Google's vector quantization in the browser

https://github.com/teamchong/turboquant-wasm
160•teamchong•1d ago•6 comments

Show HN: Beautiful intuitive weather forecasts that don't rely on numbers/units

https://weather-sense.leftium.com
3•Leftium•15h ago•8 comments

Show HN: Gecit – DPI bypass using eBPF sock_ops, no proxy or VPN

https://github.com/boratanrikulu/gecit
6•boratanrikulu•11h ago•1 comments