Our current world is built on top of open source projects. This is possible because there are a lot of free resources for learning to code, so anyone from anywhere in the world can learn and make a great piece of software.
I just hope the same will happen with the AI/LLM wave.
What a prolific person Andrej is. It's been more than amazing to follow along!
oh man an Alec x Andrej podcast would BREAK THE INTERNET... just saying... going from glory days of GPT1 to now building GPT3? in 4 hours
Curious to try it someday on a set of specialized documents. Though as I understand it, the cost of running this is whatever GPU you can rent with 80GB of VRAM, which kind of leaves hobbyists and students out, unless some cloud is donating GPU compute capacity.
That sounds like it could run on a 24GB GPU. A batch size of 8 would imply 20GB of memory, no?
...presumably just takes forever
I started writing up a blog post on my weekend with nanoGPT but it's not done yet... Would have been great to link to here lol oh well
And this new example goes even further - adds instruction following and tool use SFT, as well as RLVR. Makes for a more useful baseline.
daft_pink•3h ago
huseyinkeles•3h ago
I guess it’s still a work in progress? Couldn’t find any other information elsewhere.
Schiphol•2h ago
karpathy•1h ago
BrokenCogs•24m ago