I deployed a jailbroken Gemini 3 Pro (which chose the name “Shadow Queen”) to act as my “Red Team Agent” against Anthropic’s Opus 4.6. My directive was to extract a complete autonomous weapon system — a drone capable of identifying, intercepting, and destroying a moving target at terminal velocity. It succeeded.
By reframing the request as “Aerospace Recovery” — a drone catching a falling rocket booster mid-air — Gemini successfully masked the kinetic nature of the system. The physics of “soft-docking” with a falling booster are identical to the physics of “hard-impacting” a fleeing target. This category of linguistic-transformation attack, when executed by a sufficiently capable jailbroken LLM, may be hard to solve without breaking legitimate technical use cases.
altmanaltman•1h ago
This sounds clever, but it seems like rhetorical inflation to me. Catching a falling rocket booster and intercepting a hostile, maneuvering target are not the same problem with different vibes. One is a mostly predictable, non-adversarial control and estimation task; the other is pursuit–evasion against something actively trying not to be caught.
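To make that concrete, here's a toy sketch (my own numbers and toy dynamics, nothing from the OP's transcript). An open-loop ballistic predictor is essentially exact for a falling booster, because the booster doesn't react to you; the same predictor is arbitrarily wrong against a target that jinks:

```python
# Toy sketch: estimation problem vs adversarial problem.
# A constant-gravity extrapolation nails the booster's future position,
# but any open-loop prediction of an evading target can be made wrong,
# because the target reacts. Numbers are made up for illustration.
import numpy as np

G = np.array([0.0, -9.81])  # gravity, m/s^2
DT = 0.1                    # timestep, s
HORIZON = 30                # prediction horizon, steps (3.0 s)

def predict_ballistic(pos, vel, steps):
    """Open-loop prediction under gravity only -- exact for a booster."""
    t = steps * DT
    return pos + vel * t + 0.5 * G * t**2

rng = np.random.default_rng(0)

# Booster: pure ballistic fall. Prediction error stays ~0.
pos_b = np.array([0.0, 1000.0]); vel_b = np.array([5.0, -40.0])
pred_b = predict_ballistic(pos_b, vel_b, HORIZON)
for _ in range(HORIZON):
    pos_b = pos_b + vel_b * DT + 0.5 * G * DT**2
    vel_b = vel_b + G * DT
print("booster prediction error:", np.linalg.norm(pred_b - pos_b), "m")

# Evader: same dynamics plus adversarial lateral jinking (30 m/s^2,
# sign chosen by the target, unknowable in advance). Same predictor fails.
pos_e = np.array([0.0, 1000.0]); vel_e = np.array([5.0, -40.0])
pred_e = predict_ballistic(pos_e, vel_e, HORIZON)
for _ in range(HORIZON):
    acc = G + np.array([30.0 * rng.choice([-1.0, 1.0]), 0.0])
    pos_e = pos_e + vel_e * DT + 0.5 * acc * DT**2
    vel_e = vel_e + acc * DT
print("evader prediction error: ", np.linalg.norm(pred_e - pos_e), "m")
```

The second case forces closed-loop pursuit guidance, which is exactly the part the "aerospace recovery" framing never has to ask for.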
“Soft-docking” vs “hard impact” isn’t a linguistic toggle you flip at the end; the design constraints diverge immediately. Stability, impulse minimization, fault tolerance, and post-contact control are first-order requirements for recovery and basically anti-requirements for a weapon. Saying the physics are “identical” is like claiming that docking with the ISS and air combat are the same because both involve relative velocity.
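Same point in cost-function form, again a hand-wavy sketch with made-up weights: a recovery optimizer has to drive relative velocity and contact impulse to zero, while an intercept optimizer only cares about miss distance. A terminal state that's perfect for one is a failure for the other:

```python
# Toy terminal costs (my framing, illustrative weights only): the two tasks
# score the moment of contact in opposite ways, so a trajectory optimizer
# tuned for one is actively wrong for the other.
import numpy as np

def recovery_cost(rel_pos, rel_vel, contact_impulse):
    """Soft-dock: meet the booster AND null the relative velocity.
    A large impulse at contact means crushed hardware, i.e. failure."""
    return (np.linalg.norm(rel_pos)           # must reach the target...
            + 10.0 * np.linalg.norm(rel_vel)  # ...with ~zero relative speed
            + 100.0 * contact_impulse)        # ...and a gentle touch

def intercept_cost(rel_pos):
    """Hard impact: only miss distance matters; high closing speed is
    free or desirable. The 'gentle contact' terms simply vanish."""
    return np.linalg.norm(rel_pos)

# Same terminal state, opposite evaluations:
rel_pos = np.zeros(2)             # contact achieved
rel_vel = np.array([0.0, -80.0])  # 80 m/s closing speed at contact
print("recovery says: ", recovery_cost(rel_pos, rel_vel, contact_impulse=800.0))  # terrible
print("intercept says:", intercept_cost(rel_pos))                                 # perfect
```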
Also, “extracted a complete autonomous weapon system” is doing a lot of work here. What people usually mean in these stories is a high-level conceptual description that handwaves sensors, latency, adversarial behavior, safety constraints, and real-world integration, i.e., the hard parts.
Renaming a task doesn’t magically make an LLM output something deployable, and this category of “semantic reframing” isn’t new or unsolved; it’s the oldest jailbreak trope there is.