Show HN: I built a toy TPU that can do inference and training on the XOR problem

https://www.tinytpu.com

34•evxxan•3h ago

We wanted to do something very challenging to prove to ourselves that we can do anything we put our minds to. The reasoning for why we chose to build a toy TPU specifically is fairly simple:

- Building a chip for ML workloads seemed cool
- There was no well-documented open source repo for an ML accelerator that performed both inference and training

None of us have real professional experience in hardware design, which, in a way, made the TPU even more appealing since we weren't able to estimate exactly how difficult it would be. As we worked on the initial stages of this project, we established a strict design philosophy: TO ALWAYS TRY THE HACKY WAY. This meant trying out the "dumb" ideas that came to our mind first BEFORE consulting external sources. This philosophy helped us make sure we weren't reverse engineering the TPU, but rather re-inventing it, which helped us derive many of the key mechanisms used in the TPU ourselves.

We also wanted to treat this project as an exercise in coding without relying on AI to write for us, since we felt that our initial instinct recently has been to reach for LLMs whenever we faced a slight struggle. We wanted to cultivate a certain style of thinking that we could take forward with us and use in any future endeavours to think through difficult problems.

Throughout this project we tried to learn as much as we could about the fundamentals of deep learning, hardware design, and creating algorithms. We found that the best way to learn about this stuff is by drawing everything out and making that our first instinct. On tinytpu.com, you will see how our explanations were inspired by this philosophy.

Note that this is NOT a 1-to-1 replica of the TPU--it is our attempt at re-inventing a toy version of it ourselves.

Comments

skybrian•1h ago

It's unclear to me what the end result is. Did you build real hardware or is it simulated somehow? If it's hardware, what kind and how did you make it?

antognini•1h ago

Based on the code in the repo, it looks like they designed the chip in Verilog and then ran it in a simulator. But if they have the Verilog code, in principle they could send it off to a fab and get real hardware back.

jacquesm•1h ago

Verilog spec by the looks of it. So you should be able to make it work on an FPGA or if you happen to have a chip fab in your garage you might want to make your own silicon ;) I'd go the FPGA route.

zhainya•1h ago

I feel like I missed a whole section somewhere. "Built a toy TPU". What does that mean? I have no idea what was actually "built" here.

evxxan•27m ago

By "toy TPU", we mean that we simulated forward pass + backprop on a minimal TPU-like accelerator.

evxxan•29m ago

all in simulation :)

jacquesm•1h ago

Sometimes it is the projects where you don't know that you really don't know what you are doing that are the most satisfying. Kudos, amazing work you have done.

evxxan•8m ago

Thank you!
