Avatarl: Training language models from scratch with pure reinforcement learning
https://tokenbender.com/post.html?id=avatarl
9 • Gusarich • 6mo ago