Show HN: Z80-μLM, a 'Conversational AI' That Fits in 40KB

https://github.com/HarryR/z80ai
92•quesomaster9000•3h ago
How small can a language model be while still doing something useful? I wanted to find out, and had some spare time over the holidays.

Z80-μLM is a character-level language model with 2-bit quantized weights ({-2,-1,0,+1}) that runs on a Z80 with 64KB RAM. The entire thing (inference, weights, chat UI) fits in a 40KB .COM file that you can run in a CP/M emulator, and hopefully even on real hardware!
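To make the storage side concrete, here's a minimal Python sketch of 2-bit weight packing; the level-to-code mapping below is an illustrative assumption, not necessarily the repo's actual on-disk format:

    # Each weight in {-2,-1,0,+1} maps to a 2-bit code, so four weights
    # fit in one byte.
    LEVELS = [-2, -1, 0, 1]                 # codes 0b00..0b11
    CODE = {w: i for i, w in enumerate(LEVELS)}

    def pack(weights):
        """Pack weights from {-2,-1,0,+1} into bytes, four per byte."""
        out = bytearray()
        for i in range(0, len(weights), 4):
            b = 0
            for j, w in enumerate(weights[i:i + 4]):
                b |= CODE[w] << (2 * j)
            out.append(b)
        return bytes(out)

    def unpack(data, n):
        """Inverse of pack(): recover n weights from the packed bytes."""
        return [LEVELS[(data[i // 4] >> (2 * (i % 4))) & 0b11] for i in range(n)]

    assert unpack(pack([-2, 1, 0, -1, 1]), 5) == [-2, 1, 0, -1, 1]

A happy property of the {-2,-1,0,+1} level set is that "multiplying" by a weight never needs a real multiply on the Z80: -2 is negate-and-shift, -1 is a negate, 0 is a skip, and +1 is a plain add.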

It won't write your emails, but it can be trained to play a stripped-down version of 20 Questions, and is sometimes able to maintain the illusion of having simple but terse conversations with a distinct personality.

--

The extreme constraints nerd-sniped me and forced interesting trade-offs: trigram hashing (typo-tolerant, but loses word order), 16-bit integer math, and careful massaging of the training data to keep the examples 'interesting'.
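To make that trade-off concrete, here's a minimal Python sketch of the trigram-hashing idea; the bucket count and hash function are illustrative assumptions, not the project's actual choices:

    # Hash every overlapping 3-character window into a fixed-size feature
    # vector. A typo only perturbs the few buckets its trigrams touch
    # (typo-tolerant), but the bag of trigrams discards word order.
    N_BUCKETS = 1024  # illustrative; must stay small to fit in 64KB RAM

    def trigram_features(text, n_buckets=N_BUCKETS):
        vec = [0] * n_buckets
        padded = "  " + text.lower() + "  "
        for i in range(len(padded) - 2):
            h = 0
            for ch in padded[i:i + 3]:
                h = (h * 31 + ord(ch)) & 0xFFFF  # keep the hash 16-bit, Z80-style
            vec[h % n_buckets] += 1
        return vec

With this encoding, "is it alive" and "is it aliev" land in mostly the same buckets, and so does "alive is it": the typo-tolerance and the order-blindness are two sides of the same trick.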

The key was quantization-aware training that accurately models the inference code's limitations. The training loop runs both float and integer-quantized forward passes in parallel, scoring the model on how well its knowledge survives quantization. The weights are progressively pushed toward the 2-bit grid using straight-through estimators, with overflow penalties matching the Z80's 16-bit accumulator limits. By the end of training, the model has already adapted to its constraints, so there's no post-hoc quantization collapse.
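A minimal PyTorch sketch of that idea (the loss terms, scales, and penalty weights below are illustrative assumptions; the repo's actual training code may differ):

    import torch
    import torch.nn.functional as F

    LEVELS = torch.tensor([-2.0, -1.0, 0.0, 1.0])
    ACC_MAX = 32767  # the Z80 inference code accumulates in signed 16-bit

    def quantize_ste(w):
        # Snap each weight to the nearest level on the 2-bit grid...
        q = LEVELS[torch.argmin((w.unsqueeze(-1) - LEVELS).abs(), dim=-1)]
        # ...but pass gradients straight through, as if rounding were identity.
        return w + (q - w).detach()

    def loss(x, w, target, lam=1e-3):
        y_float = x @ w                # full-precision forward pass
        y_quant = x @ quantize_ste(w)  # quantized pass (simulated in float)
        task = F.mse_loss(y_quant, target)             # does the quantized model still work?
        drift = F.mse_loss(y_quant, y_float.detach())  # how much knowledge survives quantization?
        overflow = torch.relu(y_quant.abs() - ACC_MAX).mean()  # 16-bit accumulator penalty
        return task + drift + lam * overflow

Training against a loss shaped like this pushes the float weights onto the grid gradually, so by the end the quantized network is the network rather than a lossy export of it.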

Eventually I spent a few dollars on the Claude API to generate 20 Questions data (see examples/guess/GUESS.COM). I hope Anthropic won't send me a C&D for distilling their model against the ToS ;P

But anyway, happy code-golf season everybody :)

Comments

Zee2•2h ago
This is super cool. Would love to see a Z80 simulator set up with these examples to play with!
jasonjmcghee•1h ago
For this project and/or future ones, there are many LLMs available that are more than good enough to generate that kind of synthetic data (20 Qs) with permissive terms of use. (So you don't need to stress about breaking ToS / C&Ds etc.)
codetiger•1h ago
Imagine this working on a Gameboy back in the day. It would've sounded like magic.
alfiedotwtf•1h ago
And would have lasted 3 minutes.

Speaking of which: I remember my first digital camera (Fujitsu, 1Mb resolution, using SmartMedia)… it used so much power that you could take 20-30 photos and then needed to replace all 4 batteries lol

Sharlin•1h ago
I don’t think this could beat an ELIZA-style bot in how magical it feels, given the extreme terseness of its replies.
lodovic•1h ago
I love these thought experiments. Looking at the code size, it would have been possible for someone to come up with this back in the day, similar to the idea of a million monkeys at typewriters eventually producing Shakespeare.
alfiedotwtf•1h ago
An LLM in a .com file? Haha made my day
teaearlgraycold•39m ago
SLM
quesomaster9000•35m ago
All the 'Small' language models and the 'TinyML' scene in general tend to bottom out at a million parameters, hence I thought 'micro' was more apt at ~150k params.
roygbiv2•1h ago
Awesome. I've just designed and built my own Z80 computer, though right now it has 32KB of ROM and 32KB of RAM. This will definitely change on the next revision, so I'll be sure to try it out.
wewewedxfgdf•1h ago
RAM is very expensive right now.
tgv•41m ago
We're talking kilobytes, not gigabytes. And it isn't DDR5 either.
vedmakk•1h ago
Suppose one trained an actual secret (e.g. a passphrase) into such a model, which a user would need to guess by asking the right questions. Could this secret be easily reverse-engineered / inferred by having access to the model's weights, or would it be safe to assume that one could only get to the secret by asking the right questions?
Kiboneu•43m ago
I don’t know, but your question reminds me of this paper which seems to address it on a lower level: https://arxiv.org/abs/2204.06974

“Planting Undetectable Backdoors in Machine Learning Models”

“ … On the surface, such a backdoored classifier behaves normally, but in reality, the learner maintains a mechanism for changing the classification of any input, with only a slight perturbation. Importantly, without the appropriate "backdoor key", the mechanism is hidden and cannot be detected by any computationally-bounded observer. We demonstrate two frameworks for planting undetectable backdoors, with incomparable guarantees. …”

ronsor•42m ago
> Could this secret be easily reverse-engineered / inferred by having access to the model's weights

It could, with a network this small. More generally, this falls under "interpretability."

nineteen999•1h ago
This couldn't be more perfectly timed... I have an Unreal Engine game with both VT100 terminals (for running coding agents) and Z80 emulators, and a serial bridge that allows coding agents to program the CP/M machines:

https://i.imgur.com/6TRe1NE.png

Thank you for posting! It's unbelievable how someone sometimes just drops something that fits right into what you're doing, however bizarre it seems.

quesomaster9000•26m ago
Oh dear, it seems we've... somehow been psychically linked...

I developed a browser-based CP/M emulator & IDE: https://lockboot.github.io/desktop/

I was going to post that, but I wanted a 'cool demo' instead, and fell down the rabbit hole.

sixtyj•7m ago
Connections: Alternative History of Technology by James Burke documents these "coincidences".
Dwedit•1h ago
In before AI companies buy up all the Z80s and raise the prices to new heights.
pdyc•1h ago
Interesting. I'm wondering how far it can go if we remove some of these limitations but try to solve some extremely specific problem, like generating regexes based on user input? I know small models (270M range) can do that, but can it be done in, say, the <10MB range?
Waterluvian•28m ago
Generate an LLM that is designed to solve one extremely specific problem: answering the ultimate question of life, the universe, and everything.

Even with modern supercomputing the computation would be outpaced by the heat death of the universe, so token output must be limited to a single integer.

dirkt•1h ago
Eliza's granddaughter.
a_t48•51m ago
Nice - that will fit on a Gameboy cartridge, though bank switching might make it super terrible to run. Each bank is only 16k. You can have a bunch of them, but you can only access one bank at a time (well, technically two - bank 0 is IIRC always accessible).
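For intuition, here's a toy Python sketch of the cost model (purely illustrative, not actual Game Boy MBC code): the CPU sees one switchable 16KB window, and every weight read outside the currently mapped bank pays a switch.

    BANK_SIZE = 16 * 1024  # one switchable ROM window

    class BankedROM:
        def __init__(self, data):
            self.banks = [data[i:i + BANK_SIZE]
                          for i in range(0, len(data), BANK_SIZE)]
            self.current = 0
            self.switches = 0  # count how often we pay for a switch

        def read(self, addr):
            bank, offset = divmod(addr, BANK_SIZE)
            if bank != self.current:
                self.current = bank  # on real hardware: a write to the MBC register
                self.switches += 1
            return self.banks[bank][offset]

Packing the weights so each layer's matrix lives entirely inside one bank would keep that counter near one switch per layer, which is probably the difference between slow and unusable.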
magicalhippo•21m ago
As far as I know, the last layer is very quantization-sensitive, and is typically not quantized, or only lightly quantized.

Have you experimented with quantizing it less, and evaluated the quality drop?

Regardless, very cool project.

Zardoz84•20m ago
Meanwhile, Eliza was ported to BASIC and ran on many home computers in the '80s.
anonzzzies•12m ago
Luckily I have a very large number of MSX computers, ZX Spectrums, Amstrad CPCs, etc., and even one multiprocessor Z80 CP/M machine for the real power. Wonder how gnarly this is going to perform with bank switching though. Probably not good.
vatary•7m ago
It's pretty obvious this is just a stress test for compressing and running LLMs. It doesn't have much practical use right now, but it shows us that IoT devices are gonna have built-in LLMs really soon. It's a huge leap in intelligence, kind of like the jump from apes to humans. That is seriously cool.

What an unprocessed photo looks like

https://maurycyz.com/misc/raw_photo/
1252•zdw•10h ago•221 comments

Staying ahead of censors in 2025

https://forum.torproject.org/t/staying-ahead-of-censors-in-2025-what-weve-learned-from-fighting-c...
89•ggeorgovassilis•3h ago•41 comments

You can make up HTML tags

https://maurycyz.com/misc/make-up-tags/
205•todsacerdoti•6h ago•88 comments

Show HN: My not-for-profit search engine with no ads, no AI, & all DDG bangs

https://nilch.org
57•UnmappedStack•3h ago•23 comments

Binaries

https://fzakaria.com/2025/12/28/huge-binaries
28•todsacerdoti•3h ago•11 comments

My First Meshtastic Network

https://rickcarlino.com/notes/electronics/my-first-meshtastic-network.html
36•rickcarlino•3h ago•11 comments

Developing a Beautiful and Performant Block Editor in Qt C++ and QML

https://rubymamistvalove.com/block-editor
18•michaelsbradley•2d ago•3 comments

Unity's Mono problem: Why your C# code runs slower than it should

https://marekfiser.com/blog/mono-vs-dot-net-in-unity/
181•iliketrains•11h ago•85 comments

Software engineers should be a little bit cynical

https://www.seangoedecke.com/a-little-bit-cynical/
182•zdw•11h ago•126 comments

As AI gobbles up chips, prices for devices may rise

https://www.npr.org/2025/12/28/nx-s1-5656190/ai-chips-memory-prices-ram
144•geox•10h ago•162 comments

MongoBleed Explained Simply

https://bigdata.2minutestreaming.com/p/mongobleed-explained-simply
176•todsacerdoti•11h ago•66 comments

Researchers discover molecular difference in autistic brains

https://medicine.yale.edu/news-article/molecular-difference-in-autistic-brains/
119•amichail•10h ago•67 comments

Growing up in “404 Not Found”: China's nuclear city in the Gobi Desert

https://substack.com/inbox/post/182743659
758•Vincent_Yan404•1d ago•337 comments

PySDR: A Guide to SDR and DSP Using Python

https://pysdr.org/content/intro.html
173•kklisura•12h ago•8 comments

Line scan camera image processing

https://daniel.lawrence.lu/blog/2025-09-21-line-scan-camera-image-processing/
30•vasco•3d ago•1 comment

Spherical Cow

https://lib.rs/crates/spherical-cow
88•Natfan•9h ago•8 comments

Formulaic Delimiters in the Iliad and the Odyssey

https://glthr.com/formulaic-delimiters-in-the-iliad-and-the-odyssey
16•glth•1d ago•4 comments

Show HN: My app just won best iOS Japanese learning tool of 2025 award (blog)

https://skerritt.blog/best-japanese-learning-tools-2025-award-show/
103•wahnfrieden•8h ago•14 comments

Fast GPU Interconnect over Radio

https://spectrum.ieee.org/rf-over-fiber
18•montroser•5h ago•1 comment

Mouse: Computer Programming Language

http://mouse.davidgsimpson.com/
9•gappy•2d ago•2 comments

A bitwise reproducible deep learning framework

https://github.com/microsoft/RepDL
22•noosphr•6d ago•0 comments

Slaughtering Competition Problems with Quantifier Elimination (2021)

https://grossack.site/2021/12/22/qe-competition.html
49•todsacerdoti•9h ago•0 comments

Finding Jingle Town: Debugging an N64 Game Without Symbols

https://blog.chrislewis.au/finding-jingle-town-debugging-an-n64-game-without-symbols/
28•knackers•5d ago•1 comment

Fast CVVDP implementation in C

https://github.com/halidecx/fcvvdp
31•todsacerdoti•9h ago•2 comments

How to complain (2024)

https://outerproduct.net/trivial/2024-03-25_complain.html
57•ysangkok•9h ago•10 comments

Why I Disappeared – My week with minimal internet in a remote island chain

https://www.kenklippenstein.com/p/why-i-disappeared
80•eh_why_not•11h ago•82 comments

62 years in the making: NYC's newest water tunnel nears the finish line

https://ny1.com/nyc/all-boroughs/news/2025/11/09/water--dep--tunnels-
120•eatonphil•9h ago•74 comments

Panoramas of Star Trek Sets

https://mijofr.github.io/st-panorama/
53•jfil•3h ago•5 comments

No, it's not a battleship

https://www.navalgazing.net/No-its-not
134•hermitcrab•13h ago•181 comments