This looks like a great first introduction to training LLMs, and it looks simple enough to try locally. Great job!
jvican•24m ago
If you're interested in this resource, I highly recommend checking out Stanford's CS336 class. It covers the same curriculum in a lot more depth and introduces you to a lot of theoretical aspects (scaling laws, intuitions) and systems thinking (kernel optimization/profiling). To get that out of it, you have to do the assignments, of course... https://cs336.stanford.edu/
the_real_cher•9m ago
How does one get the lectures? I don't see an option for them anywhere.
baalimago•17m ago
Train your LM from scratch*
I doubt you have a machine big enough to make it "Large".
iamnotarobotman•50m ago