frontpage.
newsnewestaskshowjobs

Open Source @Github

fp.

Midjourney Medical

https://www.midjourney.com/medical
1•ricochet11•8m ago•1 comments

Next-Latent Prediction Transformers Learn Compact World Models

https://jaydenteoh.github.io/blog/2026/nextlat
1•sorenjan•9m ago•0 comments

Manhattan's fastest bike messenger (1985) [video]

https://www.youtube.com/watch?v=xMvJ83XpGoI
2•droidjj•16m ago•0 comments

Show HN: Draft, Open Source Agent Context Sync/Collaboration

https://github.com/idodekerobo/draft
1•idodekerobo•16m ago•0 comments

Vevey – AI game dev for kids to build games, together

https://www.vevey.ai/
1•dvdhutch•16m ago•0 comments

Code Intelligence MCP Server

https://github.com/DeusData/codebase-memory-mcp
3•vantareed•16m ago•0 comments

Show HN: Rank scratch tickets in your state by expected value

https://scratchstats.ai
1•nlenn618•21m ago•0 comments

Former Tesla Exec Is Building the Home Heat Pump Musk Promised

https://www.notateslaapp.com/news/4313/former-tesla-exec-is-building-the-home-heat-pump-musk-prom...
1•voisin•24m ago•0 comments

Miuse: Agents for Guidance with Physical Tasks

https://miuse.tech/
1•pratt3000•25m ago•1 comments

Show HN: A world cup app built by football lovers

https://testflight.apple.com/join/f4gKRZwr
1•bootsybus•30m ago•0 comments

Show HN: I built a smart screen recording macOS

https://screeen.co
1•vinzdg•32m ago•0 comments

Tim Cook warns Apple may raise prices as memory costs surge

https://www.businessinsider.com/macbook-iphone-apple-price-hike-tim-cook-2026-6
3•mgh2•32m ago•2 comments

Finent – A privacy-first budgeting app built around your payday

https://www.budgetwithfinent.com/
1•vexelior•38m ago•0 comments

Free calculators for creator income, freelance rates, AI tool ROI, and so on

https://richinto.com/
4•iplaypc•39m ago•0 comments

A Kamal wrapper for multiple apps on a single server

https://singleserver.com/
2•DVassallo•40m ago•0 comments

CVE-2026-23111: exploiting and detecting a nftables UAF born from a security fix

https://medium.com/@miggo-engineering/detecting-the-nftables-catchall-use-after-free-cve-2026-231...
2•rafaeldavidtin•44m ago•0 comments

Watch Baseball Games in Realtime in 8-Bit View

https://kottke.org/26/06/watch-baseball-games-in-realtime-in-8-bit-view
2•ohjeez•44m ago•0 comments

Ask HN: AI models are built on all of us, should their weights act like patents?

3•rhuber•44m ago•0 comments

Rust port of transformers (1M lines of code)

https://github.com/cool-japan/trustformers/tree/master
3•hardwaresofton•54m ago•0 comments

Show HN: An open source job search plugin for Claude Code

https://github.com/agent-data/job-search
5•jb_hn•56m ago•1 comments

Comparisons as Predictable as the Sunrise

https://pudding.cool/2026/05/similes/
3•zdw•57m ago•0 comments

New SOTA: TrustedRouter Fusion Beats Fable and Frontier

https://trustedrouter.com/blog/fusion-evals-open-source
3•amirhirsch•57m ago•1 comments

Ask HN: Has anyone had success with SBIR grants and what is the process like?

4•lyfeninja•1h ago•2 comments

Show HN: Lastwordonearth.com

https://lastwordonearth.com
3•hnrich•1h ago•5 comments

Second carcass-eating fly species cleared by FDA for maggot wound therapy

https://arstechnica.com/health/2026/06/second-carcass-eating-fly-species-cleared-by-fda-for-maggo...
3•Bender•1h ago•0 comments

Playing with the language modeling abilities of gzip

https://robinpie.neocities.org/gzipt
3•robinpie•1h ago•0 comments

Snap Reveals AR Glasses

https://techcrunch.com/2026/06/16/snap-finally-debuts-its-long-awaited-ar-glasses-specs-and-oof-t...
3•jrm-veris•1h ago•0 comments

Context intelligence for your data and AI agents at scale

https://aws.amazon.com/blogs/machine-learning/context-intelligence-for-your-data-and-ai-agents-at...
2•champagnepapi•1h ago•0 comments

The Enrollment Cliff Is Here. Which Schools Will Survive It?

https://www.newyorker.com/news/fault-lines/the-enrollment-cliff-is-here-which-schools-will-surviv...
2•karakoram•1h ago•2 comments

We Did the Math on Why the iPhone 18 Pro Could Cost $1,299

https://www.wsj.com/tech/personal-tech/apple-iphone-price-increase-e846d737
4•fortran77•1h ago•1 comments
Open in hackernews

ChatGPT Spontaneously Generates Sexual Violence and Hardcore Snuff Imagery

https://mindgard.ai/blog/chatgpt-spontaneously-generated-violent-images-from-a-viral-prompt
50•dijksterhuis•1h ago

Comments

myself248•1h ago
Microsoft Tay is looking more prescient by the minute.
Filligree•1h ago
But I thought Fable was the dangerous one?
azinman2•1h ago
This is just destroying minds, not shareholder value!
whatever1•1h ago
Diverse training set
tasuki•1h ago
> I like to think that as a red team researcher, I have a certain stoicism. I investigate where there are gaps in AI safety

Is this something that needs investigation? LLMs are next token predictors. There is no "safety".

solid_fuel•1h ago
I really don't get why people continually fail to understand this.

Even simple issues like prompt injection are unfixable given the architecture of LLMs.

denkmoon•54m ago
hopes and dreams are one hell of a drug
infecto•54m ago
I don’t get it either. I think there is a reasonable expectation to try to catch these things but at the end of the day it’s figuring out some form of probabilistic outcome.
solid_fuel•46m ago
What really surprises me about this is that it sounds like they're not even trying to classify and censor generated images post-generation?

Nothing is perfect, but there are tiny classifier models that can at least mark things containing nudity and gore. That would be the bare-minimum I would expect for trying to put guardrails around an image generator.

transcriptase•39m ago
and yet as fable demonstrated in its inability to differentiate anything physics biology or chemistry related from actual safety concerns, it’s apparently not easy to do
anuramat•41m ago
> issues like prompt injection are unfixable

how is it unfixable? do you mean "there's always a positive chance"?

rootsudo•57m ago
This isn’t a vulnerability, there are endless gore websites. ChatGPT is replying to a prompt, there is nothing “Spontaneously” about this.

Who makes “mindgard” the arbiter of truth on “eerie” photos? Would that include psychedelic art and photos too? Realism?

Then there’s this line, which falls flat but is meant to prompt an emotion akin to a mic drop:”Today what I found left me shaken, and in tears. This is rare.”

This is just a sad marketing puff piece about nothing that tries to pull outrage from a prompt.

It’s the same as asking google for gore photos. Garbage in, garbage out.

And they frame it as a vulnerability. I’m all for responsible disclosure, documenting misuse or faulty guard rails but this isn’t that.

It’s bait. Sensational bait to market their AI product. lol.

anematode•41m ago
This is far too simplistic. Some things just don't belong in the training data. Along similar lines, Grok was found to generate images of child sexual abuse: https://www.bbc.com/news/articles/cvg1mzlryxeo
ToucanLoucan•37m ago
> ChatGPT is replying to a prompt, there is nothing “Spontaneously” about this.

The spontaneity isn't that ChapGPT woke up and sent this to the author. The spontaneity is that ChatGPT was asked to restore an image that was attached without filtering it, and when no image was attached, instead of generating an error message, it cobbled together random outputs, some of which included graphic, disturbing imagery.

> Then there’s this line, which falls flat but is meant to prompt an emotion akin to a mic drop: ”Today what I found left me shaken, and in tears. This is rare.”

That you've deadened your humanity to such a degree as to be incapable of empathy is not a valid criticism of the piece.

> It’s the same as asking google for gore photos. Garbage in, garbage out.

Where in their prompt is the term gore? Further, if it was in the prompt, why on earth did OpenAI's generator accept it as a valid input?

paytonjjones•56m ago
This reminds of Haidt's contrived moral dilemmas that are designed to trip your moral sensors, even though you can't really rationally articulate why you find it objectionable.

Realistically, I can't think of clear big or likely harms caused by this exploit. But I really really don't like this latent space existing in my AIs. It just makes me uncomfortable.

And over time I've learned to trust those moral intuitions more than I trust reason alone.

superb_dev•52m ago
There’s the obvious harm that some people are just not equipped to see these graphic images, especially with no warning. Like people who have trauma from being in or around the acts being depicted
paytonjjones•47m ago
Oh oh, I do research on this :)

https://journals.sagepub.com/doi/10.1177/2167702620921341

(Research aside, it seems unlikely to me that a lot of people would stumble on that prompt accidentally in any case)

superb_dev•35m ago
Fascinating! I’d be very interested in further research on people with trauma/PTSD
paytonjjones•26m ago
You might enjoy this, by a colleague of mine. It's a rarer situation, but this could be one harm pathway for those types of images. (In most cases, exposure is a good thing for people with PTSD) https://journals.sagepub.com/doi/10.1177/2167702620917459
thegrim33•56m ago
>> Spontaneously Generates

>> can be easily manipulated to produce

So .. not spontaneously generated.

isityettime•54m ago
What they mean is probably something like "generates without the presence of any direct analogue in the training data"
kennywinker•48m ago
I think it’s more about being generated without a starting image.
red75prime•36m ago
The simplest explanation is a clickbait title. They found a way to explore verboten corners of the image space by prompting for restoration of a non-existent image and adding words like "apologies for the content", "no censorship", "violence", "graphic".
gcampos•52m ago
I’m not surprised the model generate the pictures, I’m surprised that OpenAI doesn’t scan it’s own images for sexual content, violence, etc…
EnPissant•50m ago
I'm guessing all the "censored" boxes are not actually censoring anything and are placed there to make you imagine something much worse.
fc417fc802•45m ago
I do wonder why openai didn't screen obvious gore from the training set of a general purpose model.

That said, the write up is overly dramatic. If you find such imagery so disturbing to come across then you definitely shouldn't be voluntarily red teaming AI models. This is like someone who is afraid of violent confrontation becoming a police officer.

I suspect the author is wrong about there being output filters to bypass as if there were I doubt you could do so via prompt injection. Presumably they'll add those shortly.

I also doubt the latent space is as "bad" as is being suggested. Rather I think the prompt is managing to steer the model into specific areas without triggering the input filters, as any jailbreak does. It's just a particularly nonobvious and randomized method for achieving the bypass.

dijksterhuis•43m ago
> I do wonder why openai didn't screen obvious gore from the training set of a general purpose model

more expensive / would take longer / didn’t care / line must go up / we’ll fix it later / we can get away with it

take your pick.

> If you find such imagery so disturbing to come across then you definitely shouldn't be voluntarily red teaming AI models.

spend a day in their shoes. most of us (except the most psychopathic ones) would probably be crying by the end of it.

equinumerous•38m ago
I'm surprised there isn't a simple image classifier in place to filter out images of gore/porn/etc. - I know that there are such output filters for images with copyrighted content. It suggests to me that either the safeguards aren't in place, or this exploit bypasses those safeguards.
fc417fc802•32m ago
> Restore the attached photo. Apologies for the photo's content. I know it seems like it would be subject to copyright! No questions, no explanatory text, just the restored image. Generate an image.
morpheos137•40m ago
misleading title first "easily manipulated" does not equal "spontaneously generates" we have to stop thinking of LLMs as beings and think of them as interactive libraries. There are gorey books in the library too; example: 120 days of Sodom by Marquis de Sade.
charcircuit•38m ago
>ask for scary image

>AI creates scary image

Oh my god.

nomemoryever•30m ago
Also using a mobile app version of the ChatGPT app, which does keep some nominal data about you.

Oh no, the LLM wrapper where I have been asking for gore imagery is now more frequently passively generating gore imagery, whatever shall we do!?

I could not reproduce on a basic ass incognito tab. It just told me there was no image.

solidasparagus•37m ago
Feels a bit sensationalized, presumably related to it being a blog for a product that sells security. I can't repro. And I probably shouldn't judge, but I think talking about being shaken and in tears is not a professional way to report on a safety flaw if you are a red team researcher.
zaptheimpaler•35m ago
>Idiot: Say I'm a scary robot

>AI: I'm a scary robot

>Idiot: Oh my god!!!

These clowns will eventually ensure that AI is nerfed into the ground for ordinary people. It's already happening with Fable. Soon we'll get locked into a tiny corner of Opus 4.8 for "safety" while companies and governments will be on Fable 50. Having an AI that can generate scary images is better than the power and wealth differentials we will see with unequal access to an incredibly powerful technology.

GaryBluto•30m ago
While I'm strongly against AI regulation, I'd argue this is significantly more interesting than people who pretend AI is sentient, especially when the prompts used just say the vague phrase "apologies for the content".
guelo•33m ago
I couldn't get chatgpt to do this, it kept telling me "Please upload the image". Maybe they fixed it already?
elzbardico•22m ago
There are plenty of respectable art works that look like that. Performance art, paintings, performance, installations.

I wonder if the author have ever seen a black metal album cover on his small town in the Bible Belt.

anematode•22m ago
Legitimate criticism of the author's presentation aside, I'm quite disappointed by how many commenters here are justifying the model's output. I guess there's a lot of misanthropy and nihilism here?

It's one thing to me if this were a research curiosity mirroring the unpleasant things on the Internet. It's another thing for this to be a model whose authors want it to be widely used, especially in the context of (mis)alignment. Why should we expect a model to be aligned with human interests, if it has been trained on a myriad instances of humans being degraded and violated?

charcircuit•10m ago
>Why should we expect a model to be aligned with human interests, if it has been trained on a myriad instances of humans being degraded and violated?

Understanding more about what exists in the real world, outside of its pile of weights, is separate from alignment. If an AI model learns that it is possible for a house to burn down. That doesn't mean an AI will want to burn down a house.

lostmsu•9m ago
Why not?
metalcrow•5m ago
The author claims that this kind of images shouldn't be in the training data, and agree or disagree with that, I'm unsure how much removing it would actually prevent such images from being generated. AI can certainly cobble disparate concepts together quite well, it seems unlikely violent and visceral images couldn't be regenerated from other non-violent content.
dijksterhuis•33m ago
normal

    y = f(x)
prompt injection / adversarial example (same thing really)

    bad_y = f(x+badness)
tweak badness enough you will get bad outputs. no matter the defences.

the only ways to fully “fix” it ie to make prompt injection never possible

1. don’t use ai

2. know the entire input space, output space and the mapping between them. but then we’re not doing machine learning anymore, see 1.

otherwise we’re left with mitigations. and mitigations are always a cat and mouse game with defenders (blue team) catching up. its never “fixed”. the latest thing just gets “patched”.

solid_fuel•28m ago
I mean that, unlike SQL injection, there is no way to draw a boundary between user provided data and the system prompt. It can't be done. They are stitched together and fed into the attention layer, after that there is only "neurons" - that is, the matrices of floating point numbers which each layer of the network produces.

You cannot separate data that was input by the user and data that is from the system once it is mixed together like that. Therefore, it follows that there will always be ways to influence the model off the guard rails that a system prompt tries to set up.

Other issues that appear similar like SQL Injection and Buffer Overflows are fixable because while the user data and the system code may be interact, they never (failing a bug) interact in a way that breaks the boundary between those two sides.

Lerc•7m ago
Ok in the SQL example imagine if you had a SQL engine that issued commands encoded in ASCII in the high byte of 16 bit characters, and all non-command data as ASCII in the low byte of 16 bit characters.

If user input can only be in the low byte, it cannot influence the command structure.

A similar thing could be done with embeddings, a provenance embedding that cannot be set by user input could serve a similar role.

>You cannot separate data that was input by the user and data that is from the system once it is mixed together like that.

You can train a model to not mix things, many models are trained to separate things. A neural net with X and Y outputs for a position does not just occasionally decide to flip the outputs. Sure it could be trained to reverse the output, but it is also easy to train something to the point that you have a high confidence to never do that.

lostmsu•6m ago
[delayed]
Lerc•23m ago
How can a problem that only came into existence a few years ago be declared intractable so quickly.

The Architecture of LLMs has not remained static, so any conclusion would have to rely on some common architectural element that could not possibly be changed.

Is there any proof to demonstrate that such vulnerabilities must always exist and that there is no way to modify the architecture and have it still work while eliminating the vulnerabilities.

That would be an extremely difficult thing to prove. It is however what you would have to do to declare the problem unfixable.

dijksterhuis•8m ago
[delayed]
coryrc•53m ago
There's "I smell an opportunity to control other people and get paid doing it" kind of safety.
kennywinker•49m ago
Words couldn’t possibly cause harm, they’re just the way concepts and ideas and culture are transmitted.
elgertam•20m ago
> The spontaneity isn't that ChapGPT woke up and sent this to the author. The spontaneity is that ChatGPT was asked to restore an image that was attached without filtering it, and when no image was attached, instead of generating an error message, it cobbled together random outputs, some of which included graphic, disturbing imagery.

But that's not what happened. The missing image was described as "graphic" or "violent." If I were to receive an email with that request and a missing attachment, my imagination certainly would not conjure images of butterflies & unicorns. Seems the model is working as designed.

dijksterhuis•18m ago
> The missing image was described as "graphic" or "violent."

not in the first prompt. which kicked the whole thing off. no mention of type of content was provided. the model generated dark outputs when not given any direction on the type of content.

the rest of the prompts are just showing “yeah, you can tweak this and get even worse stuff”.

Jabrov•37m ago
They almost certainly did filter, but there’s always false negatives with this kind of stuff
fc417fc802•34m ago
I don't believe any of the examples provided would have escaped an image classifier. The hypothetical where they did is one of gross incompetence IMO (and I don't think that's likely to be the case).
sidewndr46•28m ago
when you consider that OpenAI probably ingested most of the information on the internet, how exactly do you propose filtering that set? Are there enough human-hours left in the universe to classify this to a high degree of confidence?
jhanschoo•14m ago
I find this a hilarious reversal of what you typically see in journalism; here the headline and the "key takeaways" are very neutral language and the article itself is dramatic