[A B]
times [1]
[1]
is [A+B]I was initially excited until i saw that, because it would reveal some sort of required local min capacity, and then further revelation that this was all vibe coded and no arXiv, makes me feel I should save my attn for another article.
amelius•1h ago
I wonder why they don't just write the code themselves, so by design the focus can be on the model.