Most frontier labs are converging on a similar capability ceiling, likely because they are all training on roughly the same internet-scale data. If data is the bottleneck, the next real leap is a pre-training problem, not a post-training one; another RLHF variant won't get us there.