242 Elo points clear of the next-best model, with a 93% win rate against random models (96% against nano banana), while Gemini 3.1 (second best) sits at 67%. That’s quite the leap.
brcmthrowaway•1h ago
How is text-to-image even scored? Seems like a subjective measurement.
be7a•1h ago
Users get two completions for their prompt and rank them. From this you can then use Bradley-Terry to get Elo scores per model.
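For the curious, here is a rough sketch of how that could work. It is not the arena's actual code, just a minimal Bradley-Terry fit using the standard iterative (MM) update. The function names, the 1000-point anchor, and the toy votes are made up for illustration. It assumes the usual 400-point, base-10 Elo scale, and that every model has won at least one vote (otherwise its strength collapses to zero and the log fails).

```python
import math
from collections import defaultdict


def fit_bradley_terry(votes, iters=200):
    """Fit Bradley-Terry strengths from (winner, loser) pairs.

    Assumes every model appears as a winner at least once.
    """
    models = sorted({m for vote in votes for m in vote})
    wins = defaultdict(float)         # total wins per model
    pair_counts = defaultdict(float)  # comparisons per unordered pair
    for winner, loser in votes:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1

    strength = {m: 1.0 for m in models}
    for _ in range(iters):
        updated = {}
        for i in models:
            # MM update: wins_i / sum_j n_ij / (p_i + p_j)
            denom = sum(
                pair_counts[frozenset((i, j))] / (strength[i] + strength[j])
                for j in models
                if j != i
            )
            updated[i] = wins[i] / denom
        # Strengths are only defined up to a scale factor, so pin the
        # geometric mean to 1 after each pass.
        log_mean = sum(math.log(s) for s in updated.values()) / len(updated)
        strength = {m: s / math.exp(log_mean) for m, s in updated.items()}
    return strength


def to_elo(strength, anchor=1000.0):
    """Map strengths to an Elo-like scale (anchor is an arbitrary choice)."""
    return {m: anchor + 400.0 * math.log10(s) for m, s in strength.items()}


def expected_win(elo_gap):
    """Expected win probability for the model that is elo_gap points ahead."""
    return 1.0 / (1.0 + 10.0 ** (-elo_gap / 400.0))


# Toy data: model "a" beats model "b" in 3 of 4 votes.
toy_votes = [("a", "b"), ("a", "b"), ("a", "b"), ("b", "a")]
toy_elo = to_elo(fit_bradley_terry(toy_votes))
toy_gap = toy_elo["a"] - toy_elo["b"]
```

With the toy votes, the fitted gap reproduces the observed 75% win rate, i.e. `expected_win(toy_gap)` comes out at 0.75. Under the same logistic assumption, a 242-point gap works out to roughly an 80% expected head-to-head win rate against the runner-up.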
gpt5•1h ago
https://arena.ai/leaderboard/text-to-image