I like their honesty in benchmarks. It looks like Qwen3.6 35B is outperforming their Laguna M.1 225B model:
Laguna XS.2 (33B-A3B params): 30.6
Qwen 3.6 35B-A3B: 51.5
Devstral 2 123B: 31.2
Quite a huge lead for Qwen... well, at least it's catching up to other smaller Western labs. Also, *ops work, which in my experience can actually be more complicated than SWE, is obviously underrepresented there.
I usually score pretty well in colour perception tests but distinguishing between those two purples made me doubt myself.
rohitpaulk•1h ago