Those who follow fine-tunes of LLMs may know that there’s a company called Nous Research has been releasing a series of fine-tuned models called the Hermes, which seem to have great performance.
Since post-training is relatively cheaper than pre-training, “so” I also want to get into post-training and fine-tuning. Given that I'm GPU poor, with only a M4 MBP and some Tinker credits, so I was wondering if you have any advice and/or recommendations for getting into post-training? For instance, do you think this book https://www.manning.com/books/the-rlhf-book is a good place to start? If not, what’s your other recommendations?
I’m also currently reading “Hands-on LLM” and “Build a LLM from scratch” if that helps.
Many thanks for your time!