Did current models overcome the 10:10 bias?
I'm pretty sure that only Gemini made it. The other models did not meet the "each tentacle covered" criterion.
It's also pretty obvious that the models have some built-in system prompt rules that push the final output toward a certain style. They seem very consistent.
It also looks like 4o has the temperature turned way down to ensure maximum adherence, while Midjourney etc. seem to run at a higher temperature: more interesting end results, flourishes, complex materials and backgrounds.
Also, what's with 4o's sepia tones? Post-editing in the generation workflows?
I don't believe any of these just generate the image, though; there are likely several steps in each workflow to present the final images to the user in the best possible light.
The other stuff is text to image (not editing)
If this one were shown in a US work environment, I might privately say something collegial to the person about it not seeming the most work-appropriate.
The title of this article is "image editing showdown", but the subject is actually prompt adherence in image generation.
Midjourney and Flux Dev aren't image editing models. (Midjourney is an aesthetically pleasing image generation model with low prompt adherence.)
Image editing is a task distinct from image generation. Image editing models include Nano Banana (Gemini Flash), Flux Kontext, and a handful of others. gpt-image-1 sort of counts, though it changes the global image pixels such that it isn't 1:1 with the input.
Edit: Dang, can you please fix this? Someone else posted the actual link, and it's far more interesting than the linked article:
https://genai-showdown.specr.net/image-editing
This article is great.
I don't fully understand the iterative methodology though: they allow multiple attempts, which are judged by another multimodal LLM? Won't that judge have limited accuracy itself?
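To make the concern concrete, here is a minimal sketch of what an "attempts judged by another model" loop might look like. This is my reading of the methodology, not the benchmark's actual code; the names `generate_image` and `judge_accepts` are hypothetical, and the real judge would be a multimodal LLM API call, stubbed out here.

```python
def generate_image(prompt: str, seed: int) -> str:
    # Stand-in for a text-to-image call; returns a fake image id.
    return f"img:{prompt}:{seed}"

def judge_accepts(image: str, prompt: str) -> bool:
    # Stand-in for the multimodal-LLM judge. A noisy judge can
    # produce both false passes and false failures, which is the
    # accuracy concern raised above. Here we pretend only the
    # third attempt (seed 2) satisfies the prompt.
    return image.endswith(":2")

def best_of_n(prompt: str, max_attempts: int = 4):
    """Generate up to max_attempts images and return the first one
    the judge accepts, plus whether any attempt passed."""
    last = None
    for seed in range(max_attempts):
        last = generate_image(prompt, seed)
        if judge_accepts(last, prompt):
            return last, True
    return last, False
```

One consequence: with an imperfect judge, allowing more attempts also raises the chance that a false pass gets reported as a success, so the published scores partly reflect judge accuracy, not just model quality.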