I’ve been exploring the idea of building efficient large language models — ones optimized for memory use and inference speed, especially for real-time and edge deployment.
I’ve come across concepts like Hierarchical Reasoning Models and Tiny Recursive Models, which seem strong on reasoning benchmarks like ARC-AGI, but don’t appear to have been applied to language generation yet.
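For anyone unfamiliar, here's a minimal sketch of the recursive-refinement idea behind Tiny Recursive Models as I understand it: one small network is reused many times to refine a latent state and a candidate answer, rather than stacking depth. The class name, layer sizes, step counts, and zero-initialized states below are my own illustrative assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class TinyRecursiveSketch(nn.Module):
    """Toy illustration (not the published TRM architecture): a single
    tiny MLP applied recursively to refine a latent state z and a
    current answer embedding y, conditioned on the input x."""
    def __init__(self, dim=128):
        super().__init__()
        # one small network reused at every step (hypothetical sizes)
        self.step = nn.Sequential(
            nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, dim)
        )
        self.to_answer = nn.Linear(2 * dim, dim)

    def forward(self, x, n_latent_steps=6, n_cycles=3):
        y = torch.zeros_like(x)  # current answer embedding
        z = torch.zeros_like(x)  # latent "scratchpad" state
        for _ in range(n_cycles):
            for _ in range(n_latent_steps):
                # refine the latent state given input + current answer
                z = z + self.step(torch.cat([x, y, z], dim=-1))
            # update the answer from the refined latent
            y = self.to_answer(torch.cat([y, z], dim=-1))
        return y

# e.g. out = TinyRecursiveSketch()(torch.randn(4, 128))
```

The appeal for efficiency is that the parameter count stays tiny while effective depth comes from iteration, which is why I'm curious whether the same trick transfers to token-by-token language generation.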
I’ve also looked into spiking neural networks, which look promising in theory but still seem to struggle with more complex tasks.
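For context, the standard building block there is the leaky integrate-and-fire neuron: the membrane potential leaks each step, integrates input, and emits a binary spike (then resets) on crossing a threshold. A minimal sketch below; the decay and threshold values are arbitrary choices for illustration:

```python
def lif_neuron(inputs, decay=0.9, threshold=1.0):
    """Leaky integrate-and-fire neuron over a sequence of input
    currents. Returns a binary spike train."""
    v = 0.0
    spikes = []
    for i in inputs:
        v = decay * v + i  # leak, then integrate input current
        if v >= threshold:
            spikes.append(1)
            v = 0.0        # reset after spiking
        else:
            spikes.append(0)
    return spikes

# a constant drive of 0.3/step makes the neuron spike periodically
print(lif_neuron([0.3] * 12))
```

The efficiency draw is that activations are binary events, so dense multiplications can become sparse additions; the hard threshold is also what makes gradient-based training awkward, which I suspect is part of why SNNs still lag on complex tasks.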
Curious whether efficient LLMs along these lines are still an active area of research.
Would love to hear your thoughts and connect with anyone interested in this space!