Does RL Incentivize Reasoning Capacity in LLMs Beyond the Base Model?
https://www.lesswrong.com/posts/s3NaETDujoxj4GbEm/tsinghua-paper-does-rl-really-incentivize-reasoning-capacity
2 • fzliu • 7mo ago