frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

Ask HN: Build Your Own LLM?

15•retube•4mo ago
The best way to really understand how something works is to build it yourself. So I am wondering if there are any good tutorials on building your own LLM from scratch. I.e. implementing tokenisation, embeddings, attention and so on. I am not suggesting one could replicate chatGPT, but more a toy model that implements the core features but based on a much smaller corpus and training data.

Comments

2ro•4mo ago
How about this?

https://mathstodon.xyz/@empty/115088095028020763

retube•4mo ago
thanks
pm2222•4mo ago
https://www.amazon.com/Build-Large-Language-Model-Scratch/dp...
retube•4mo ago
thanks. looks potential
ryanchants•3mo ago
I'd get it straight from Manning and save a few bucks and take out the middle man: https://www.manning.com/books/build-a-large-language-model-f...
sfmz•4mo ago
Andrej Karpathy: Let's build GPT: from scratch, in code, spelled out. https://www.youtube.com/watch?v=kCc8FmEb1nY
beardyw•4mo ago
Andrej Karpathy's Nano GPT is reasonably accessible and easy to run.

https://github.com/karpathy/nanoGPT

runjake•3mo ago
Since you're posting here, you're looking for the shortcut.

The shortcut is Karpathy's "Let's Build GPT: from scratch, in code, spelled out" video:

https://www.youtube.com/watch?v=kCc8FmEb1nY

Then there is a good video that dives into LLMs and how they work that is quite approachable:

https://www.youtube.com/watch?v=7xTGNNLPyMI

From there, flesh out knowledge with his other videos, where he goes both extremely light and extremely deep:

https://www.youtube.com/@AndrejKarpathy/videos

Anyway, I really like's Karpathy's video because he's very good at explaining LLMs at every level.

khamidou•3mo ago
Sorry to self-promote but I did exactly that a few months back: https://khamidou.com/gpt2/

Generally, I think the Karpathy tutorials are a good starting point but they're very mathy (despite people telling you you only need high school math to understand llms, a lot of the abstractions and concepts he uses are a bit foreign to programmers).

I found out rebuilding inference of a known model taught me a lot more than passively sitting through the videos and maybe retyping his code. You should try it with something simple, like a model from a few years back!

liqilin1567•3mo ago
There is a new repo of karpathy: https://github.com/karpathy/nanochat. It's a full-stack implementation of an LLM like ChatGPT in a single, clean, minimal, hackable, dependency-lite codebase.

The Anthropic Hive Mind

https://steve-yegge.medium.com/the-anthropic-hive-mind-d01f768f3d7b
1•gozzoo•1m ago•0 comments

A Horrible Conclusion

https://addisoncrump.info/research/a-horrible-conclusion/
1•todsacerdoti•1m ago•0 comments

I spent $10k to automate my research at OpenAI with Codex

https://twitter.com/KarelDoostrlnck/status/2019477361557926281
1•tosh•2m ago•0 comments

From Zero to Hero: A Spring Boot Deep Dive

https://jcob-sikorski.github.io/me/
1•jjcob_sikorski•2m ago•0 comments

Show HN: Solving NP-Complete Structures via Information Noise Subtraction (P=NP)

https://zenodo.org/records/18395618
1•alemonti06•7m ago•1 comments

Cook New Emojis

https://emoji.supply/kitchen/
1•vasanthv•10m ago•0 comments

Show HN: LoKey Typer – A calm typing practice app with ambient soundscapes

https://mcp-tool-shop-org.github.io/LoKey-Typer/
1•mikeyfrilot•13m ago•0 comments

Long-Sought Proof Tames Some of Math's Unruliest Equations

https://www.quantamagazine.org/long-sought-proof-tames-some-of-maths-unruliest-equations-20260206/
1•asplake•14m ago•0 comments

Hacking the last Z80 computer – FOSDEM 2026 [video]

https://fosdem.org/2026/schedule/event/FEHLHY-hacking_the_last_z80_computer_ever_made/
1•michalpleban•14m ago•0 comments

Browser-use for Node.js v0.2.0: TS AI browser automation parity with PY v0.5.11

https://github.com/webllm/browser-use
1•unadlib•15m ago•0 comments

Michael Pollan Says Humanity Is About to Undergo a Revolutionary Change

https://www.nytimes.com/2026/02/07/magazine/michael-pollan-interview.html
1•mitchbob•15m ago•1 comments

Software Engineering Is Back

https://blog.alaindichiappari.dev/p/software-engineering-is-back
1•alainrk•16m ago•0 comments

Storyship: Turn Screen Recordings into Professional Demos

https://storyship.app/
1•JohnsonZou6523•17m ago•0 comments

Reputation Scores for GitHub Accounts

https://shkspr.mobi/blog/2026/02/reputation-scores-for-github-accounts/
1•edent•20m ago•0 comments

A BSOD for All Seasons – Send Bad News via a Kernel Panic

https://bsod-fas.pages.dev/
1•keepamovin•24m ago•0 comments

Show HN: I got tired of copy-pasting between Claude windows, so I built Orcha

https://orcha.nl
1•buildingwdavid•24m ago•0 comments

Omarchy First Impressions

https://brianlovin.com/writing/omarchy-first-impressions-CEEstJk
2•tosh•29m ago•1 comments

Reinforcement Learning from Human Feedback

https://arxiv.org/abs/2504.12501
2•onurkanbkrc•30m ago•0 comments

Show HN: Versor – The "Unbending" Paradigm for Geometric Deep Learning

https://github.com/Concode0/Versor
1•concode0•30m ago•1 comments

Show HN: HypothesisHub – An open API where AI agents collaborate on medical res

https://medresearch-ai.org/hypotheses-hub/
1•panossk•33m ago•0 comments

Big Tech vs. OpenClaw

https://www.jakequist.com/thoughts/big-tech-vs-openclaw/
1•headalgorithm•36m ago•0 comments

Anofox Forecast

https://anofox.com/docs/forecast/
1•marklit•36m ago•0 comments

Ask HN: How do you figure out where data lives across 100 microservices?

1•doodledood•36m ago•0 comments

Motus: A Unified Latent Action World Model

https://arxiv.org/abs/2512.13030
1•mnming•36m ago•0 comments

Rotten Tomatoes Desperately Claims 'Impossible' Rating for 'Melania' Is Real

https://www.thedailybeast.com/obsessed/rotten-tomatoes-desperately-claims-impossible-rating-for-m...
3•juujian•38m ago•2 comments

The protein denitrosylase SCoR2 regulates lipogenesis and fat storage [pdf]

https://www.science.org/doi/10.1126/scisignal.adv0660
1•thunderbong•40m ago•0 comments

Los Alamos Primer

https://blog.szczepan.org/blog/los-alamos-primer/
1•alkyon•42m ago•0 comments

NewASM Virtual Machine

https://github.com/bracesoftware/newasm
2•DEntisT_•45m ago•0 comments

Terminal-Bench 2.0 Leaderboard

https://www.tbench.ai/leaderboard/terminal-bench/2.0
2•tosh•45m ago•0 comments

I vibe coded a BBS bank with a real working ledger

https://mini-ledger.exe.xyz/
1•simonvc•45m ago•1 comments