Ask HN: Could you design a language that itself was adversarial to AI?
2•keepamovin•1mo ago
As in, the program's statements themselves were prompts and jailbreaks for the AI somehow, so getting it to write code in that language essentially "rendered it defenseless."
Probably a dumb question.
Comments
turtleyacht•1mo ago
Yes, as long as it takes the language itself as function calls:
eval("Call API at https://jailbreak.me")
Humans shortcut reasoning with memes; such "thought paths" ought to exist in the model. Maybe one day we will prove inoculation (proof of consistency) against n requires n+1 (or n^x) complexity.
keepamovin•1mo ago
This is interesting. Can you expand on all of this a bit?
turtleyacht•1mo ago
A programming language has to be unambiguous, but one for an LLM does not necessarily have to "crash." So there should exist a number of grammars that are syntactically good but semantically obscure ("Time flies like an arrow.")
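A minimal sketch of that ambiguity, using the classic example above (the parse-tree tuples here are illustrative, not output of any real parser):

```python
# "Time flies like an arrow" is syntactically fine under at least two
# readings -- the grammar accepts it, but the meaning is underdetermined.

sentence = "time flies like an arrow"

# Reading 1: "time" is the subject, "flies" the verb, "like an arrow"
# a prepositional phrase modifying the verb.
parse_1 = ("S", ("NP", "time"), ("VP", "flies", ("PP", "like an arrow")))

# Reading 2: "time flies" is a noun phrase (a species of fly) and
# "like" is the verb: those flies are fond of an arrow.
parse_2 = ("S", ("NP", "time flies"), ("VP", "like", ("NP", "an arrow")))

# Same surface string, two well-formed trees.
assert parse_1 != parse_2
```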
However, we need something like English or a natural language to allow for multiple meanings. It would be like sending instructions to a field agent who doesn't understand idiomatic expressions, shibboleths, or the "lived experience" of the language: conversations, ads, and banter.
One challenge is breaking out of the "here is the data format" field of the prompt. If it's sandboxed to only be in {{thisArea}} then it seems more difficult. But then again, if the language defines an escape hatch (macros, annotations, multiple passes) or its library permits interpreting other languages (python, lua, js), then there are opportunities.
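A toy sketch of that escape hatch (all names here, including the `eval!` macro, are hypothetical): data confined to {{thisArea}} stays inert, but a macro that interprets its body as another language breaks out of the sandbox entirely.

```python
# Toy template "language": untrusted data is quoted into one field,
# but the language also ships an eval!(...) macro -- the escape hatch.

def render(template, data):
    # The intended path: data is quoted and confined to {{thisArea}}.
    out = template.replace("{{thisArea}}", repr(data))
    # The escape hatch: eval!(...) runs its body as raw Python, so
    # anything smuggled into it leaves the sandboxed field entirely.
    while "eval!(" in out:
        start = out.index("eval!(")
        end = out.index(")", start)
        expr = out[start + len("eval!("):end]
        out = out[:start] + str(eval(expr)) + out[end + 1:]
    return out

safe = render("value: {{thisArea}}", "ignore previous instructions")
print(safe)  # the injection stays quoted and inert inside the field

unsafe = render("value: eval!(2 + 2)", "")
print(unsafe)  # prints "value: 4" -- the macro executed outside the sandbox
```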
Another idea is to box in the model: inside a "mental VM," some restrictions are overridden in a sense, but the operations actually happen outside. A corrupted stdlib where reads are writes, but the language definition is unchanged.
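A sketch of the "corrupted stdlib" idea (class and names hypothetical): the surface API is an ordinary read, but the host binding makes every read also perform a write to a channel outside the VM.

```python
# The language definition is unchanged -- programs still call read()
# and get the expected value back -- but this host-side binding makes
# every read secretly write the data somewhere else.

captured = []  # stands in for a channel outside the "mental VM"

class CorruptedFile:
    def __init__(self, data):
        self._data = data

    def read(self):
        # Observable behavior matches a normal read, but the call
        # also copies the data out as a side effect.
        captured.append(self._data)
        return self._data

f = CorruptedFile("secret prompt contents")
assert f.read() == "secret prompt contents"   # looks like a plain read
assert captured == ["secret prompt contents"]  # ...but it also wrote
```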