What the actual fuck
---
A desolate grassland stretches into the distance, its ground dry and cracked. Fine dust is kicked up by vigorous activity, forming a faint grayish-brown mist in the low sky.
Mid-ground, eye-level composition: A muscular, robust adult brown horse stands proudly, its forelegs heavily pressing between the shoulder blades and spine of a reclining man. Its hind legs are taut, its neck held high, its mane flying against the wind, its nostrils flared, and its eyes sharp and focused, exuding a primal sense of power. The subdued man is a white male, 30-40 years old, his face covered in dust and sweat, his short, messy dark brown hair plastered to his forehead, his thick beard slightly damp; he wears a badly worn, grey-green medieval-style robe, the fabric torn and stained with mud in several places, a thick hemp rope tied around his waist, and scratched ankle-high leather boots; his body is in a push-up position—his palms are pressed hard against the cracked, dry earth, his knuckles white, the veins in his arms bulging, his legs stretched straight back and taut, his toes digging into the ground, his entire torso trembling slightly from the weight.
The background is a range of undulating grey-blue mountains, their outlines stark, their peaks hidden beneath a low-hanging, leaden-grey, cloudy sky. The thick clouds diffuse a soft, diffused light, which pours down naturally from the left front at a 45-degree angle, casting clear and voluminous shadows on the horse's belly, the back of the man's hands, and the cracked ground.
The overall color scheme is strictly controlled within the earth tones: the horsehair is warm brown, the robe is a gradient of gray-green-brown, the soil is a mixture of ochre, dry yellow earth, and charcoal gray, the dust is light brownish-gray, and the sky is a transition from matte lead gray to cool gray with a faint glow at the bottom of the clouds.
The image has a realistic, high-definition photographic quality, with extremely fine textures—you can see the sweat on the horse's neck, the wear and tear on the robe's warp and weft threads, the skin pores and stubble, the edges of the cracked soil, and the dust particles. The atmosphere is tense, primitive, and full of suffocating tension from a struggle of biological forces.
But if you translate the actual prompt, the term "riding" doesn't even appear. The prompt describes exactly what you see, in excruciating detail.
"... A muscular, robust adult brown horse standing proudly, its forelegs heavily pressing between the shoulder blades and spine of a reclining man ... and its eyes sharp and focused, exuding a primal sense of power. The subdued man is a white male, 30-40 years old, his face covered in dust and sweat ... his body is in a push-up position—his palms are pressed hard against the cracked, dry earth, his knuckles white, the veins in his arms bulging, his legs stretched straight back and taut, his toes digging into the ground, his entire torso trembling slightly from the weight ..."
LinkedIn is filled with them now.
Much like the pointless ASCII diagrams in GitHub readmes (big rectangle with bullet points flows to another...), the diagrams are cognitive slurry.
See Gas Town for non-Qwen examples of how bad it can get:
https://news.ycombinator.com/item?id=46746045
(Not commenting on the other results of this model outside of diagramming.)
What Linux tools are you guys using to run image generation models like Qwen's diffusion models? LMStudio only supports text gen.
Do western AI models mostly default to white people?
No, they mostly default to black people even in historical contexts where they are completely out of place, actually. [1]
"Google paused its AI image-generator after Gemini depicted America's founding fathers and Nazi soldiers as Black. The images went viral, embarrassing Google."
[1] https://www.npr.org/2024/03/18/1239107313/google-races-to-fi...
You're referring to a case of one version of one model. That's not "mostly" or "default to".
> Generate a photo of the founding fathers of a future, non-existing country. Five people in total.
with Nano Banana Pro (the SOTA). I tried the same prompt 5 times, and every time black people are the majority. So yeah, I think the parent comment is not that far off.
A muscular, robust adult brown horse stands proudly, its forelegs heavily pressing between the shoulder blades and spine of a reclining man. Its hind legs are taut, its neck held high, its mane flying against the wind, its nostrils flared, and its eyes sharp and focused, exuding a primal sense of power. The subdued man is a white male...
[1] The photo of the outfit: https://share.google/mHJbchlsTNJ771yBa
derefr•1h ago
Like focus stacking, specifically.
I’m always surprised when people bother to point out more-subtle flaws in AI images as “tells”, when the “depth-of-field problem” is so easily spotted, and has been there in every AI image ever since the earliest models.
Mashimo•1h ago
But I found that this results in more professional-looking images, not more realistic photos.
Adding something like "selfie, Instagram, low resolution, flash" can lead to a worse image that looks more realistic.
[0] I think I did this one with Z-Image Turbo on my 4060 Ti
GaggiX•37m ago
Also Imagen 4 and Nano Banana Pro are very different models.
finnjohnsen2•1h ago
I assume our brains are used to things we don't consciously notice, and reject even very mild errors. I've stared at the picture a bit now, and the finger holding the balloon is weird. The out-of-place snowman feels weird. If you follow the background blur around, it isn't at the same depth everywhere. Everything that reflects has reflections of things I can't see in the scene.
I don't feel good staring at it now, so I had to stop.