frontpage.

Zlob.h: 100% POSIX and glibc compatible globbing lib that is faster and better

https://github.com/dmtrKovalenko/zlob
1•neogoose•2m ago•1 comments

Show HN: Deterministic signal triangulation using a fixed .72% variance constant

https://github.com/mabrucker85-prog/Project_Lance_Core
1•mav5431•3m ago•1 comments

Scientists Discover Levitating Time Crystals You Can Hold, Defy Newton’s 3rd Law

https://phys.org/news/2026-02-scientists-levitating-crystals.html
1•sizzle•3m ago•0 comments

When Michelangelo Met Titian

https://www.wsj.com/arts-culture/books/michelangelo-titian-review-the-renaissances-odd-couple-e34...
1•keiferski•4m ago•0 comments

Solving NYT Pips with DLX

https://github.com/DonoG/NYTPips4Processing
1•impossiblecode•4m ago•1 comments

Baldur's Gate to be turned into TV series – without the game's developers

https://www.bbc.com/news/articles/c24g457y534o
1•vunderba•5m ago•0 comments

Interview with 'Just use a VPS' bro (OpenClaw version) [video]

https://www.youtube.com/watch?v=40SnEd1RWUU
1•dangtony98•10m ago•0 comments

EchoJEPA: Latent Predictive Foundation Model for Echocardiography

https://github.com/bowang-lab/EchoJEPA
1•euvin•18m ago•0 comments

Disabling Go Telemetry

https://go.dev/doc/telemetry
1•1vuio0pswjnm7•20m ago•0 comments

Effective Nihilism

https://www.effectivenihilism.org/
1•abetusk•23m ago•1 comments

The UK government didn't want you to see this report on ecosystem collapse

https://www.theguardian.com/commentisfree/2026/jan/27/uk-government-report-ecosystem-collapse-foi...
2•pabs3•25m ago•0 comments

No 10 blocks report on impact of rainforest collapse on food prices

https://www.thetimes.com/uk/environment/article/no-10-blocks-report-on-impact-of-rainforest-colla...
1•pabs3•25m ago•0 comments

Seedance 2.0 Is Coming

https://seedance-2.app/
1•Jenny249•27m ago•0 comments

Show HN: Fitspire – a simple 5-minute workout app for busy people (iOS)

https://apps.apple.com/us/app/fitspire-5-minute-workout/id6758784938
1•devavinoth12•27m ago•0 comments

Dexterous robotic hands: 2009 – 2014 – 2025

https://old.reddit.com/r/robotics/comments/1qp7z15/dexterous_robotic_hands_2009_2014_2025/
1•gmays•31m ago•0 comments

Interop 2025: A Year of Convergence

https://webkit.org/blog/17808/interop-2025-review/
1•ksec•41m ago•1 comments

JobArena – Human Intuition vs. Artificial Intelligence

https://www.jobarena.ai/
1•84634E1A607A•45m ago•0 comments

Concept Artists Say Generative AI References Only Make Their Jobs Harder

https://thisweekinvideogames.com/feature/concept-artists-in-games-say-generative-ai-references-on...
1•KittenInABox•48m ago•0 comments

Show HN: PaySentry – Open-source control plane for AI agent payments

https://github.com/mkmkkkkk/paysentry
2•mkyang•50m ago•0 comments

Show HN: Moli P2P – An ephemeral, serverless image gallery (Rust and WebRTC)

https://moli-green.is/
2•ShinyaKoyano•1h ago•1 comments

The Crumbling Workflow Moat: Aggregation Theory's Final Chapter

https://twitter.com/nicbstme/status/2019149771706102022
1•SubiculumCode•1h ago•0 comments

Pax Historia – User and AI powered gaming platform

https://www.ycombinator.com/launches/PMu-pax-historia-user-ai-powered-gaming-platform
2•Osiris30•1h ago•0 comments

Show HN: I built a RAG engine to search Singaporean laws

https://github.com/adityaprasad-sudo/Explore-Singapore
3•ambitious_potat•1h ago•4 comments

Scams, Fraud, and Fake Apps: How to Protect Your Money in a Mobile-First Economy

https://blog.afrowallet.co/en_GB/tiers-app/scams-fraud-and-fake-apps-in-africa
1•jonatask•1h ago•0 comments

Porting Doom to My WebAssembly VM

https://irreducible.io/blog/porting-doom-to-wasm/
2•irreducible•1h ago•0 comments

Cognitive Style and Visual Attention in Multimodal Museum Exhibitions

https://www.mdpi.com/2075-5309/15/16/2968
1•rbanffy•1h ago•0 comments

Full-Blown Cross-Assembler in a Bash Script

https://hackaday.com/2026/02/06/full-blown-cross-assembler-in-a-bash-script/
1•grajmanu•1h ago•0 comments

Logic Puzzles: Why the Liar Is the Helpful One

https://blog.szczepan.org/blog/knights-and-knaves/
1•wasabi991011•1h ago•0 comments

Optical Combs Help Radio Telescopes Work Together

https://hackaday.com/2026/02/03/optical-combs-help-radio-telescopes-work-together/
2•toomuchtodo•1h ago•1 comments

Show HN: Myanon – fast, deterministic MySQL dump anonymizer

https://github.com/ppomes/myanon
1•pierrepomes•1h ago•0 comments

GenAI Image Editing Showdown

https://genai-showdown.specr.net/
201•rzk•3mo ago

Comments

isoprophlex•3mo ago
The "editing" showdown is very good. Introduced me to the Seedream model which i didn't know about until now.

I don't fully understand the iterative methodology tho - they allow multiple attempts, which are judged by another multimodal llm? Won't they have limited accuracy in itself?

ACCount37•3mo ago
"LLMs judged by LLMs" is the industry standard. Can't put a human judge in a box and have him evaluate and rate a set of 7600 responses on demand.

Now, are LLM judges flawed? Obviously. But they are more shelf-stable than humans, so it's easier to compare different results. And as long as you use an LLM judge as a performance thermometer and not a direct optimization target, you aren't going to be facing too many issues from that.

If you are using an LLM judge as a direct optimization target though? You'll see some funny things happen. Like GPT-5 prose. Which isn't even the weirdest it gets.
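
A minimal sketch of that judge pattern, assuming the OpenAI Python client and an illustrative multimodal model as the judge (not necessarily what the site actually uses):

  # LLM-as-judge sketch: a multimodal model grades whether a generated image
  # adheres to the original prompt. Model name and rubric are illustrative.
  import base64
  from openai import OpenAI

  client = OpenAI()

  def judge(prompt, image_path):
      with open(image_path, "rb") as f:
          b64 = base64.b64encode(f.read()).decode()
      resp = client.chat.completions.create(
          model="gpt-4o",  # any multimodal judge model
          messages=[{
              "role": "user",
              "content": [
                  {"type": "text",
                   "text": f"Prompt: {prompt!r}\nDoes the image adhere to the prompt? Answer PASS or FAIL."},
                  {"type": "image_url",
                   "image_url": {"url": "data:image/png;base64," + b64}},
              ],
          }],
      )
      return "PASS" in resp.choices[0].message.content.upper()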

vunderba•3mo ago
I tried to make the judgement criteria more clear in the FAQ section - I'll post it here:

What is the metric on which these models are being judged?

  It's hard to define a discrete rubric for grading at an inherently qualitative level. To keep things simple, this test is purely PASS/FAIL - unsuccessful means that the model NEVER managed to generate an image adhering to the prompt. For example, Midjourney 7 did not manage to generate the correct vertical stack of translucent cubes ordered by color in 64 generation attempts. In many cases we attempt a generous interpretation of the prompt - if it gets close enough, we might consider it a pass.

  Put another way: if I were to show the final image to a random stranger on the street, would they be able to guess what the original prompt was? (aka the Pictionary test).

  To paraphrase former Supreme Court Justice Potter Stewart, "I may not be able to define a passing image, but I know it when I see it."

To answer your question, the pass/fail is manually determined according to a set of well-defined criteria which is usually specified alongside the image.
sans_souse•3mo ago
I had to upvote immediately once I got to Alexander the Great on a Hippity Hop
halflife•3mo ago
The horse chimera is much better
adriand•3mo ago
I had completely forgotten about the hippity hop and coming across it here brought back all kinds of childhood memories. Those things were fun!
mrec•3mo ago
They were always called "space hoppers" here in the UK, and always looked like this:

https://en.wikipedia.org/wiki/Space_hopper#/media/File:Space...

croes•3mo ago
What about the classic: an analog watch that shows the time 08:15?

Did current models overcome the 10:10 bias?

echelon•3mo ago
This would be easy to patch the models to fix. Just gather a small amount of training data for these cases, e.g. "change the clock hands to 5:30" with the corresponding edit.

A three-tuple: (original image, text edit instruction, final image).

Easy to patch for editing models, anyway. Maybe not text to image models.
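
A record in such a dataset might look roughly like this (a hypothetical JSONL-style sketch, not any particular lab's actual schema):

  {"original_image": "clock_1010.png",
   "edit_instruction": "change the clock hands to 5:30",
   "edited_image": "clock_0530.png"}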

fart-fart-FART•3mo ago
Waste of money and effort, IMO. There are more or less infinitely many such small things to fix.
echelon•3mo ago
It's a common enough use case that it'll be added at some point.

It probably comes up more than you think. Storyboarding, product placement, model images, etc.

It's not critical in the short term, but it'll wind up on their backlog for sure.

konart•3mo ago
>Cephalopodic Puppet Show

I'm pretty sure that only Gemini made it. The other models did not meet the 'each tentacle covered' criterion.

jedbrooke•3mo ago
for the OpenAI 4o model on the octopus sock puppet prompt, the prompt clearly states that each tentacle should have a sock puppet, whereas the OpenAI 4o image only has 6 puppets with 2 tentacles being puppetless. I’m not sure if we can call that a pass
snowfield•3mo ago
I'd assume that behind the scenes the models generate several passes and only show the user the best one. That would be smart, as it makes their model seem better than the others.

It's also pretty obvious that the models have some built-in system prompt rules that give the final output a certain style. They seem very consistent.

It also looks like 4o has the temperature turned way down to ensure max adherence, while Midjourney etc. seem to have a higher temperature: more interesting end results, flourishes, complex materials and backgrounds.

Also, what's with 4o's sepia tones? Post-editing in the gen workflows?

I don't believe any of these just generate the image though; there are likely several steps in each workflow to present the final images to the user in the absolute best light.
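
The multi-pass pattern being speculated about here - generate several candidates, score them, return only the best - is simple to sketch; generate() and score() below are hypothetical stand-ins for an image-model call and a judge call:

  # Best-of-N sketch: the user only ever sees the highest-scoring candidate.
  def best_of_n(prompt, n=4):
      candidates = [generate(prompt) for _ in range(n)]            # hypothetical image-model call
      return max(candidates, key=lambda img: score(prompt, img))   # hypothetical judge call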

phi-go•3mo ago
There are numbers on how many tries it took. I would also find the individual prompts and images interesting.
simonw•3mo ago
You can run some image models locally if you want to prove to yourself how well they can do with just a single generation from a prompt with no extra steps.

I've done this enough to suspect that most hosted image models don't increase their running costs to try and get better results through additional passes without letting the user know what they are doing.

Many of the LLM-driven models do implement a form of prompt rewriting though (since effectively prompting image models is really hard) - some notes on how DALL-E 3 did that here: https://simonwillison.net/2023/Oct/26/add-a-walrus/
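
For comparison, a single local generation with Hugging Face diffusers really is just one pipeline call - a minimal sketch with an illustrative model name; there are no hidden retries or prompt rewrites unless you add them yourself:

  # One prompt in, one image out: no best-of-N, no prompt rewriting.
  import torch
  from diffusers import StableDiffusionPipeline

  pipe = StableDiffusionPipeline.from_pretrained(
      "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
  ).to("cuda")
  image = pipe("a vertical stack of translucent cubes ordered by color").images[0]
  image.save("single_pass.png")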

thorum•3mo ago
Actual link seems to be: https://genai-showdown.specr.net/image-editing
typpilol•3mo ago
This is the editing link yes. I just got done looking at it from the other link.

The other stuff is text to image (not editing)

neilv•3mo ago
> "A dolphin is using its fluke to discipline a mermaid by paddling it across the backside."

If this one were shown in a US work environment, I might say a collegial something privately to the person, about it not seeming the most work-appropriate.

PieTime•3mo ago
I think I’d probably say that the prompts are telling me more about the author than I think is necessary for these tests… I hope they were at least sampled from responses.
echelon•3mo ago
Please fix the title, or change the link.

The title of this article is "image editing showdown", but the subject is actually prompt adherence in image generation.

Midjourney and Flux Dev aren't image editing models. (Midjourney is an aesthetically pleasing image generation model with low prompt adherence.)

Image editing is a task distinct from image generation. Image editing models include Nano Banana (Gemini Flash), Flux Kontext, and a handful of others. gpt-image-1 sort of counts, though it changes the global image pixels such that it isn't 1:1 with the input.

I expect that as image editing models get better and more "instructive", classical tools like Photoshop and modern hacks like ComfyUI will both fall away to a thin facade over the models themselves. Adobe needs to figure out their future, because Photoshop's days are numbered.

Edit: Dang, can you please fix this? Someone else posted the actual link, and it's far more interesting than the linked article:

https://genai-showdown.specr.net/image-editing

This article is great.

jumploops•3mo ago
Slight nit: it lists “OpenAI 4o” but the model used by ChatGPT is a distinct model labeled “gpt-image-1” iirc

A prompt I'd love to see: a person riding in a kangaroo pouch.

Most of the pure diffusion models haven’t been able to do it in my experience.

Edit: another commenter pointed out the analog clock test; let's add "analog clock showing 3:15" as well (:

ZiiS•3mo ago
The link is to the imagegen test not the editing one. Here 4o was used to preprocess the prompt.
snailmailman•3mo ago
There isn’t a date in the article, but I know I had read this months ago. And sure enough, wayback has the text-to-image page from April.

But the image editing page linked at the top is more recent, and was added sometime in September (and was presumably the intended link). I hadn't read that page yet. It's odd that there are no dates; at first glance one might think the pages were made at the same time.

jonplackett•3mo ago
Yeah this is very old. Although anything older than a week is reasonably old in AI.
foofoo12•3mo ago
> There isn’t a date in the article

SEO guys convinced everyone that articles without dates do better on search engines. I hope both sides of their pillow are hot.

ljlolel•3mo ago
I discovered this independently myself a decade ago since it’s true
master-lincoln•3mo ago
fucking marketing people screw us over on so many levels...
greatgib•3mo ago
GPT-4o shows the huge annoyance of the company/model being a moral judge of your requests, quite often refusing anything negative.

It's like 1984, but corporate-enforced. Now there are tasks that you are not allowed to do despite them being legal.

In the same way, using GPT-5 is now unbearable to me, as it almost always starts every response in a conversation with things like: "Great question", "good observation worthy of an expert", "you're totally right", "you are right to ask the question"...

holoduke•3mo ago
Try some of the Chinese models. Much less restrictive. With some obvious exceptions.
ACCount37•3mo ago
People gave Altman shit for enabling NSFW in ChatGPT, but I see that as a step in the right direction. The right direction being: the one that leads to less corporate censorship.

>In the same way, using GPT-5 is now unbearable to me, as it almost always starts every response in a conversation with things like: "Great question"

User preference data is toxic. Doing RLHF on it gives LLM sycophancy brainrot. And by now, all major LLMs have it.

At least it's not 4o levels of bad - hope they learned that fucking lesson.

Lerc•3mo ago
I have seen a few normally progressive types act quite puritanically conservative over the NSFW ChatGPT thing. It seems there are quite a lot of people who consider things to be uniformly good or bad, and their opinion of the whole colours their opinion of the parts.

OpenAI are in a difficult position when it comes to global standards. It's probably easier to see from outside of the United States, because the degree to which the historical puritanism has influenced everything is remarkable. I remember the release of the Watchmen film and being amazed at how pervasive the preoccupation with a penis was in the media coverage.

kridsdale3•3mo ago
People in the US went ballistic over Mass Effect showing an outline of a butt, in the dark, for 1 second.
thinkingtoilet•3mo ago
Name me one piece of enterprise software that lets you do NSFW things. The way people jump to 1984 with no thought is double plus bad. ChatGPT is a piece of enterprise software. They are trying to sell it to large companies at large prices. This is not a rhetorical question: do you think that if you could generate nude images of celebrities or pictures of extreme violence, corporations would buy it? Having been a director at a Fortune 500 company that bought software, I can tell you with 100% certainty the answer is "no".
RobotToaster•3mo ago
> Name me one piece of enterprise software that lets you do NSFW things.

Photoshop, MS word.

drdeca•3mo ago
Technically? Microsoft Word certainly lets one write smut, and Photoshop certainly allows one to draw pornography? They won’t like, produce NSFW things automatically of course.
ryandrake•3mo ago
Exactly. Programs that don't let you do things based on the content should be thought of as weird/broken.

Imagine if we woke up tomorrow morning and grep refused to process a file because there was "morally objectionable" content in it (objectionable as defined by the authors of grep). We would rightly call that a bug and someone would have a patch ready by noon. Imagine if vi refused to save if you wrote something political. Same thing. Yet, for some reason, we're OK with this behavior from "certain" software?

drdeca•3mo ago
There is more than one way we could generalize the precedent previously set, imo.

None of the templates included with e.g. Word were for smut.

Word allowed you to type in smut, but it didn’t produce smut that wasn’t written by the user. For previous enterprise software, that wasn’t really a relevant question.

So… I don’t think it is obvious that the “Word lets you type in smut” implies “ChatGPT should produce smut if you ask it for smut.”

I guess precedent might imply “if you write some smut and ask it to fix the grammar, it shouldn’t refuse on the basis of what you wrote being smut”?

pants2•3mo ago
Companies like PH use full enterprise stacks from AWS to Oracle. Hell, Cloudflare actively takes flak for running much worse websites like 8chan, the Daily Stormer, etc., and they are as enterprise-focused as it gets.
ipaddr•3mo ago
I can't think of any that restrict it. Sharepoint refusing an NSFW photo or Oracle refusing to store video isn't a thing.
lofaszvanitt•3mo ago
Seemingly they don't have tests to see whether their model gets better or worse in certain areas.
addend•3mo ago
Is there any AI image generator/editor that is good at creating graphics with a transparent background? Nano Banana and some others output a white-and-grey checkered background (fake transparency).
neurostimulant•3mo ago
There is https://leonardo.ai/transparent-png-generator/
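
For a local workaround, one option is to generate on a plain background and then strip it to a real alpha channel afterwards - a minimal sketch using the rembg library (a different tool from the one linked above; file names are illustrative):

  # Turn fake transparency (checkered/plain backdrop) into a real alpha channel.
  from PIL import Image
  from rembg import remove

  cutout = remove(Image.open("generated.png"))  # returns an RGBA image
  cutout.save("transparent.png")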
smerrill25•3mo ago
Hey addend! If you sign up at brandimagegen.com, ChatGPT image models create graphics with transparent backgrounds. You can also just remove backgrounds with my background remover tool.

I am biased on this since I built it and it officially launches on Friday 10/31

tezza•3mo ago
If you're interested in side-by-side analysis of various image gen tools, I review them:

https://generative-ai.review/2025/09/september-2025-image-ge...

indigodaddy•3mo ago
EDIT: looks like I didn't click on the "image editing" tab when I went to the site, so I guess take the rest of my comments below criticizing the terminology with a grain of salt…

"Image editing" is a curious term, as it appears the site/topic is actually all about generating new images. The term, in my mind, should be reserved for actual editing of existing, real images, e.g. "remove the coffee table" from this living room photo after uploading the image. I've found the actual "image generation" models to be bad at this because they introduce too many artifacts that weren't in the original, which makes sense because they are really geared toward creating images out of thin air.

Multimodal models like qwen3-vl-30b-a3b, however, seem to do quite well with editing existing images without trying to constantly add in new things or trying to change the image in ways that you don’t want, as if it’s trying to do the “lets just generate a new image” thing. imagegpt.com is also good for editing existing images, but not sure what model they are using on the backend.

biinjo•3mo ago
I don’t know if you and I are looking at the same site because all I see is existing images being edited with GenAI.

Input: bald man
Prompt: give bald man hair
Output: edited original, now with hair

That looks like editing to me.

Or are we strictly adhering to the ‘generating new images’ definition because these models technically recreate the entire image? It would be like editing a photo in Photoshop. If you hit “Save” you edited the photo. But if you hit “Save As” and create a new file, the photo wasn’t edited but created as a new image?

vunderba•3mo ago
I've actually gotten this comment a couple of times - perhaps I should make the nav bar at the top more prominently displayed.

WRT Qwen3, is it possible that the API/site you were using was passing your "image edit requests" to something like Qwen-Edit [1] under the covers?

To my knowledge, Qwen3-VL (Vision Language) isn't capable of generating/modifying images - it's purely for doing reasoning about images.

[1] https://huggingface.co/Qwen/Qwen-Image-Edit

dangoodmanUT•3mo ago
To me this goes to show how far ahead Google is in the space.

The ability to clearly understand the image being edited, and to make edits that look natural given that understanding, is far beyond any of the other models.

cristaloleg•3mo ago
Where is the famous "Don't draw a green elephant?"

UPD: suggested here https://github.com/scpedicini/genai-showdown-public/discussi...

smerrill25•3mo ago
Hey everyone! If you need a place to test image models against one another, please go to BrandImageGen.com :) Would love to see some signups, as we have a pretty nice free tier :)