It seems to just be worse at actually doing what you ask.
I feel like it would be advantageous to move away from a "one model fits all" mindset and toward a world where we have different genres of models for different things.
Benchmark scores are becoming about as useful as Tomatometer movie scores. Something can score high, but if it's not a genre you like, the high score doesn't guarantee you'll like it.
https://thezvi.substack.com/p/gemini-3-pro-is-a-vast-intelli...
And with that, I will never read anything this guy writes again :)
aworks•49m ago
esafak•29m ago
I think customer diversity correlates instead with resilience.