The fantasy boosters in full AGI phallus stroking kit:
Two close friends of mine who were math prodigies and went on to do ML very early (mid-2010s) were always talking to me about an algorithm that sounds similar to this:
This is wild!
"when assessed by Claude 3.5 Sonnet’s production-grade RM, our unsupervised assistant policy wins 60% of head-to-head comparisons against the policy trained with the human-supervised RM." So now the models can even post-train the new models better than a human can
This was my first idea as well. Keep training continuously and redeploy clones after each cycle. From a layman's perspective this seems reasonable :thinking:
and finally a realist!!!!!
khalic 3 months ago
There is no sign that LLMs are capable of general reasoning; on the contrary. So hold your horses about that. We have proven they can do basic composition (as a developer, I see proof of this every time I generate some code with an assistant), which is amazing already, but we’re still far from anything like “general intelligence”.
reify•2h ago
June
https://arxiv.org/html/2506.10943v1
https://www.researchgate.net/publication/392629858_Self-Adap...
4 months ago:
https://news.ycombinator.com/item?id=44271284