AvataRL: Training language models from scratch with pure reinforcement learning
https://tokenbender.com/post.html?id=avatarl
3 • Gusarich • 2h ago