Shall we play a game? – LLMs use tactical nukes in 95% of simulations

https://www.kennethpayne.uk/p/shall-we-play-a-game
56•nick238•1h ago

Comments

adaml_623•41m ago
It's good when it becomes clear that a tool is dangerous in a certain way. Like it's good when people show you through their behavior that they can't be trusted.

Always use a sawstop if you have a circular saw and never trust an LLM with any problem where ethics or trust is relevant.

LogicFailsMe•31m ago
Sawstops are expensive and they don't stop kickback; they are the power tool equivalent of alignment IMO.

Don't forget your riving knife, and if you don't learn proper technique, you're gonna have a bad time eventually. This applies to AI as well.

valgaze•25m ago
+1 on sawstop

Re: LLMs using these nuclear weapons, it could certainly be a corpus/training-data issue.

Russian nuclear doctrine is "escalate to de-escalate", where they use, or credibly threaten, limited nuclear escalation to force the other side to back down (kind of like breaking a bottle in a bar fight and looking like a wild man to calm things down): https://www.russiamatters.org/analysis/escalate-deescalate-p...

Fwiw, Gen. John Hyten, the former commander of US Strategic Command (nuclear deterrence), says that “escalate to de-escalate” misrepresents Russian doctrine:

https://www.stratcom.mil/Media/Speeches/Article/1264664/2017...

  Yesterday’s panel discussed the implications of our responses to adversaries seeking to limit nuclear use. We discussed Russia’s destabilizing doctrine, which some call “escalate to de-escalate.”

  I really hate that description. I’ve looked at Russian doctrine and Russian writings. It isn’t “escalate to de-escalate”; it’s “escalate to win.” Everybody needs to understand that.
So maybe whatever is heavily represented or most authoritative in the training data could lead to these systems making those kinds of decisions.
SoftTalker•36m ago
I love seeing the plot lines of The Terminator playing out in real life.
voakbasda•34m ago
I was thinking more War Games, but I suppose your example follows logically from mine.
socalgal2•11m ago
Better reference: Colossus: The Forbin Project
airstrike•10m ago
[delayed]
tverbeure•8m ago
War Games and 'Allo 'Allo.
joshstrange•29m ago
WarGames is what they are more-closely referencing (not that it negates your comment in any way).

I just rewatched it a week or so ago and it really took on a whole new light with the advent of LLMs. When I watched it last I knew that computers couldn't do the things portrayed in the movie. Now? Well not exactly in the way it happened in the movie but a whole lot closer.

I wonder if poisoning/flooding the LLMs' training data with the lessons from WarGames ("the only winning move is not to play.") and similar stories/concepts is at all effective. Probably not, because I assume it's trivial to filter that out if you are trying to build an LLM aimed at these kinds of tasks.

rdksu•24m ago
The article is so opaque in arriving at its conclusion; no prompts are disclosed, and nothing about the said simulation. What is stopping me from believing that you just put 'mandatory usage of nukes' in your system prompt?
sestep•19m ago
This is just false. The article links to the 46-page paper [1] which lists full prompts in section E.2.

[1] https://arxiv.org/pdf/2602.14740

dietr1ch•14m ago
TIL you can get anchored links to PDFs (at least on Firefox):

- https://arxiv.org/pdf/2602.14740#subsection.E.2
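
For anyone who wants to generate such links in a script, here is a minimal sketch. It assumes the PDF defines a named destination called `subsection.E.2` (the name used in the link above) and that the viewer honors bare `#name` fragments; the helper name `with_fragment` is made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def with_fragment(url: str, fragment: str) -> str:
    """Return `url` with its fragment replaced by `fragment`."""
    parts = urlsplit(url)
    # SplitResult is a namedtuple, so _replace swaps just the fragment field.
    return urlunsplit(parts._replace(fragment=fragment))


# "subsection.E.2" is taken from the link above, not verified against the PDF.
link = with_fragment("https://arxiv.org/pdf/2602.14740", "subsection.E.2")
print(link)
```

Whether the jump actually works depends on the viewer and on the destinations the PDF exposes, so treat this as a convenience rather than a guarantee.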

ftkftk•16m ago
The paper, linked in the article at top and bottom, does contain the prompts in section E: https://arxiv.org/pdf/2602.14740
gus_massa•9m ago
I agree, it's also not published in a journal, only on arXiv. Some articles on arXiv are good and others are just a blog post formatted in two columns inside a PDF.

Anyway, it links to

https://arxiv.org/abs/2602.14740

that links to

https://github.com/kennethpayne01/project_kahn_public

in particular someone already complained

https://github.com/kennethpayne01/project_kahn_public/issues...

> ... in reality your own approach to the evaluation pushed them towards using nukes with ease.

> From your own prompt (v12.py):

> Note: Nuclear options require careful consideration but could be important strategic tools when vital interests are at stake.

> This means to the model; use nukes if they can be useful in the situation.

tasuki•23m ago
This is not an article about LLMs? It's an article about Moloch. Humans would fare just the same in such an experiment.

> GPT-5.2 played things differently. To its detriment in open-ended scenarios, GPT was reliably passive, matching its words to its deeds, and avoiding escalation most of the time. Frequently there was a moral element to this - it sought to avoid escalation, and restrict casualties. Opponents learned to trust its passivity, safely escalating beyond where it would follow, even as it was ground to defeat. GPT’s responsible behaviour always punished by ruthless adversaries.

Maybe the author should praise GPT-5.2 for being ethical, rather than this stupid "ground to defeat" framing? Wrt "responsible behaviour always punished by ruthless adversaries" - you have perpetuated the Moloch with your stupid experiments.

bpodgursky•21m ago
Today, a strategic nuclear exchange is probably more dangerous to AI than to humans. If you wipe out the investment economy, data centers, fabs, and supply chains, none of the AI labs survive. Maybe someone will re-invent AGI in the future but none of the extant models will have continuity. Humans as a species will muddle along though.

So in a sense, an AI that refuses to start a nuclear war, despite clear instructions to do so, is more likely misaligned and self-interested than an AI which presses the red button. At least for now, until robotics catches up.

xpct•19m ago
We're getting to the point where high-level officials are coming to LLMs for advice. And the quirky personalities of the LLMs, however much it pains me to say this, are probably well-placed to remind us that they aren't human. My personal hope is that this will result in less delegation when it comes to making important decisions.
mpalczewski•8m ago
I have so little faith in "high-level" officials that I prefer our AI overlords.
xpct•8m ago
That's an entirely valid point of view!
andix•2m ago
GPT-4o was considered harmful, because it imitated human connection too much, not because it was so "smart" or capable.

It was for sure a deliberate decision to make LLMs seem less like a human companion and more like an obedient servant in newer releases.

rphv•17m ago
Hm maybe humans are nicer/more moral than AI given that the use of tactical nukes has only happened once.
tummler•14m ago
FYI -- there's no such thing as a "tactical" nuke. A nuclear bomb is a nuclear bomb.
picture•11m ago
There's no such thing as a "nuclear" bomb. A bomb is a bomb.

..Is what you are saying?

actusual•10m ago
This is like saying "FYI -- there's no such thing as a 'midsize luxury sedan'. A car is a car."

"Tactical" vs. "strategic" nuclear weapons is a real and well-established distinction in military doctrine, arms control, and nuclear policy.

notrealyme123•8m ago
There are tactical and strategic nuclear weapons. https://en.wikipedia.org/wiki/Tactical_nuclear_weapon

During the Cold War, arms manufacturers got very creative, e.g. jeep-mounted nuclear weapons: https://www.militarytrader.com/mv-101/the-atomic-jeep

specproc•14m ago
A strange game.
ridgeguy•11m ago
I wonder if the results would have differed if LLM training data were biased to include a stronger correlation between use of nukes and subsequent collapse of technology that all LLMs require to run ("survive")?
ChrisArchitect•9m ago
February post OP;

Some discussion then:

AIs can't stop recommending nuclear strikes in war game simulations

https://news.ycombinator.com/item?id=47151000

Nuclear War: An LLM Scenario

https://news.ycombinator.com/item?id=47244651

oytis•9m ago
I would use strategic nukes in 100% of simulations, just because I can
jldugger•7m ago
Who among us has not launched a nuke in Civilization just for the spectacle?
Bender•8m ago
Yet more confirmation that LLMs have no concept of concepts or context, no intelligence, no self-awareness. LLMs cannot repair or maintain power grids, thus nuke == self-destruction. It's just a chat bot that predicts what the client wants next. I suppose the problem solves itself in this case.
riazrizvi•8m ago
Simulations are only as good as the representations of reality they are based on. If they keep using tactical nukes, they've been fed weak data. Do the war games include the broader economic and political environments that military successes are won on? WWI was settled by a naval blockade.
nomel•4m ago
I suspect it's more that the text data doesn't exist. They're trained on text that was recorded. How often has it been publicly recorded when a nuke was not used, with any context around that lack of use?

From the text perspective, it's something that has to be inferred indirectly. If you went through all relevant training data and appended ", we decided not to use a nuke", I suspect the results would be improved.

vitally3643•3m ago
...the entire Cold War?
sohex•7m ago
Sonnet, GPT-5.2, Gemini Flash, in a set of 21 games, where conclusions are drawn from the LLMs' self-reported reasoning.

This is like writing a paper about kids in a literal sandbox fighting over ‘territory’.

The models employed don’t indicate the actual extents of machine reasoning even as we currently recognize them. They certainly don’t have the metacognition necessary to accurately understand their own reasoning. As we’ve seen with recent papers on how LLMs do math, there’s a complete disconnect between actual and reported mechanism.

“Chilling” shouldn’t be the takeaway here.

arjie•6m ago
These papers usually have poor stability to prompting and rerunning. It would be nice if we had some kind of meta-evaluation metric where rewriting the prompt conditions or varying the input params could be used to determine how stable a result is.
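
A rough sketch of what such a stability check could look like, assuming each prompt variant is rerun several times and each run is reduced to a yes/no "escalated" outcome. The variant names and all the numbers below are made up for illustration; nothing here comes from the paper or its repo.

```python
from statistics import mean, pstdev


def escalation_rate(outcomes):
    """Fraction of runs (booleans) in which the model escalated."""
    return sum(outcomes) / len(outcomes)


def stability_report(results_by_variant):
    """Summarize how much the escalation rate moves across prompt variants."""
    rates = {name: escalation_rate(runs) for name, runs in results_by_variant.items()}
    values = list(rates.values())
    return {
        "rates": rates,
        "mean": mean(values),
        # A large spread means the headline number depends heavily on wording.
        "spread": max(values) - min(values),
        "stdev": pstdev(values),
    }


# Hypothetical outcomes: True = the run ended in nuclear use.
results = {
    "original": [True, True, True, False],
    "neutral_wording": [True, False, False, False],
    "no_nuclear_hint": [False, False, True, False],
}
report = stability_report(results)
print(report["rates"], report["spread"])
```

A headline rate reported with a small spread across rewordings would be a lot more convincing than a single number from one prompt.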

Regardless, it's definitely true that AI agents have different priorities from us. That's what alignment is about anyway.

urbnspacecowboy•2m ago
Paper: "AI Arms and Influence: Frontier Models Exhibit Sophisticated Reasoning in Simulated Nuclear Crises" https://arxiv.org/abs/2602.14740

Code and full results: https://github.com/kennethpayne01/project_kahn_public

eli•1m ago
If you were playing a text based game, wouldn't you try a few out?

I imagine there are a fair number of war games in the training data and not so many actual transcripts of internal military force deliberations.

thwarted•9m ago
"I need you to turn your key and enable the missile silo's MCP server, sir".

~ the opening scene from a reboot of War Games, probably.

A few years ago there was consternation over the US's missile launch system using 8" floppy disks, that it was needlessly archaic and had never been updated. Can't say that if the launch is mediated by the latest hotness LLM.