That said, I am surprised Seedream 4.0 beat it in these tests.
Still, to my eye, AI-generated images feel a bit off when dealing with real-world photographs.
George's hair, for example, looks over the top, or brushed on.
In the photo of the person sleeping on the ground, the added tree looks plastic or too homogenized.
It's mostly because image model size and required compute for both training and inference have grown faster than self-hosted compute capability for hobbyists. Sure, you can run Flux Kontext locally, but if you have to use a heavily quantized model and wait forever for the generation to actually run, the economics are harder to justify. That's not counting the "you can generate images from ChatGPT for free" factor.
> George's hair, for example, looks over the top, or brushed on.
IMO, the judge was being too generous with the passes for that test. The only one that really passes is Gemini 2.5 Flash Image:
Flux Kontext: In addition to the hair looking too slick, it does not match the VHS-esque color grading of the image.
Qwen-Image-Edit: The hair is too slick and the sharpness/saturation of the face unnecessarily increases.
Seedream 4: Color grading of the entire image changes, which is the case with most of the Seedream 4 edits shown in this post, and why I don't like it.
Some might critique the prompts and say this or that would have done better, but they were the kind of prompt your dad would type in, not knowing how to push the right buttons.
I've been using Nano Banana quite a lot, and I know that it absolutely struggles with exterior architecture and landscaping. Getting it to add or remove things like curbs, walkways, gutters, etc., or asking it to match colors, is almost futile.
joomla199•2h ago