Train Your Own LLM from Scratch

https://github.com/angelos-p/llm-from-scratch
72•kristianpaul•2h ago

Comments

iamnotarobotman•1h ago
This looks great for a first introduction to training LLMs, and it looks simple enough to try this locally. Great job!
jvican•1h ago
If you're interested in this resource, I highly recommend checking out Stanford's CS336 class. It covers this whole curriculum in a lot more depth and introduces you to a lot of theoretical aspects (scaling laws, intuitions) and systems thinking (kernel optimization/profiling). For this, you have to do the assignments, of course... https://cs336.stanford.edu/
the_real_cher•1h ago
How does one get the lectures? I don't see an option for any lectures.
eftychis•41m ago
https://github.com/stanford-cs336/lectures
baalimago•1h ago
Train your LM from scratch*

I doubt you have a machine big enough to make it "Large".

nucleardog•44m ago
Hey now! I've got a half terabyte of RAM at my disposal! I mean, it's DDR4 but... it's RAM!

And it's paired with 48 processor cores! I mean, they don't even support AVX512 but they can do math!

I could totally train an LLM! Or at least my family could... might need my kid to pick up and carry on the project.

But in all seriousness... you either missed the point, are being needlessly pedantic, or are... wrong?

This is about learning concepts, and the rest of this is mostly moot.

On the pedantic or wrong notes: what is the documented cut-off for a "large" language model? Because GPT-2 was and is described as a "large" language model. It had 1.5B parameters. You can just about get a consumer GPU capable of training that for about $400 these days.

mips_avatar•30m ago
You can fully train a 1.6B model on a single 3090. That’s a reasonably big model.
hiroakiaizawa•48m ago
Nice. What scale does this realistically reach on a single machine?
NSUserDefaults•20m ago
Been doing it since the day I was born. The beginnings were hard but I’m getting there.
