An LLM as a transformer model can't get very far. Even if all the data in the world were provided to it, it still could not reach human intelligence. AGI is a different path entirely.
Comments
incomingpain•7h ago
What wall? The only plausible wall I have seen is hardware limits.
If I were to provide you a 100-watt GPU with 10 trillion CUDA cores and 5 TB of VRAM at DDR120, DeepSeek or GLM would run at a bajillion tokens per second.
We could train models out to something like 10T parameters and run them at home at fast speeds on this imaginary hardware.
Do you think these models would not be better? Obviously they would be. There's still work to do, of course.