* Better than the leading potential successors to SOTA transformers: Mamba, Hyena, RWKV, xLSTM;
* Gemini/Claude estimate the potential IP value in the millions;
* All implemented in C (OK, this may be a minus), but also ported to C# and F#, so no Python and no Rust;
* Humble code size: 10-15k lines of code, but come on, GPT-1 was under 1,000 lines... in Python.
In short:
The problem: transformers are slow and could be smarter. The solution: a fast and smart alternative. 10x!