Ask HN: Are we ready for vulnerabilities to be words instead of code?

2•lielcohen•2h ago

Until now, security has been math. Buffer overflows, SQL injections, crypto flaws — deterministic, testable, formally verifiable.

But we're giving agents terminal access and API keys now. The attack vector is becoming natural language. An agent gets "socially engineered" by a prompt; another hallucinates fake data and passes it down the chain.

Trying to secure these systems feels like trying to write a regex that catches every possible lie. We've shifted the foundation of security from numbers to words, and I don't think we've figured out what that means yet.

Is anyone thinking about actual architectural solutions to this? Not just "use another LLM to guard the LLM" — that feels like circular logic. Something fundamentally different.

(Not a native English speaker, used AI to clean up the grammar.)

Comments

nine_k•1h ago

Scams and "social engineering", as known for a long time, could be a good approximation.

lielcohen•1h ago

Right, but with scams you trick a human into doing something. With agents, you give them the keys upfront - terminal, file system, API keys - because otherwise what's the point? You can't have an agent that asks permission for every action, you'd just be babysitting it all day. So the question isn't "how do we stop someone from being tricked." It's "how do we secure something that already has root access and runs on vibes instead of logic."

codingdave•1h ago

Don't give it root access.

That answer hasn't changed since day one of LLMs, despite some of the thing people are attempting to build these days: If you don't want to get in trouble, don't give LLMs access to anything that can cause actual harm, nor give them autonomy.

lielcohen•1h ago

Sure, that works today. But Meta is cutting 20% of its workforce. So is everyone else. The whole bet is that agents replace human work - and that only works if they can actually do things. Deploy, access databases, call APIs.

"Don't give it access" is like saying "don't connect to the internet" in 1995. The question isn't whether agents get these permissions. They will. The question is what happens when they do.

nine_k•18m ago

Let's see how well it works for them. Apparently Salesforce had been a bit overly enthusiastic about layoffs, and recently had to backtrack.

nine_k•14m ago

How do we expect that everything goes all right if we give prod access to a pack of very smart dogs that know some key tricks? Now the same, when humans actually leave the room?

My answer is simple: it just won't be all right this way. The problems will cost the management who drank too much kool-aid; maybe they already do (check out what was happening at Cloudflare recently). Sanity will return, now as a hard-won lesson.

lielcohen•1h ago

To be clear - I'm not really talking about my personal laptop. I'm thinking about where this is going at scale. When companies start replacing entire teams with agents (and looking at the layoffs, that's clearly the direction), those agents will need real access to production systems. That's the scenario where "just don't give it access" stops being an answer.

The Day I Discovered Type Design

Anthropic's "Claude for Open Source" program still charged $200

Walmart wins patents to give algorithms more sway over prices

Bombarding gamblers with offers greatly increases betting and gambling harm

Open Source Gave Me Everything Until I Had Nothing Left to Give

The Stochastic Parrot Argument Considered Harmful

Black Cube: Israeli spy firm crashes Slovenia's election

Randevu: Deterministic Schelling Points for Decentralized Temporal Coordination [pdf]

SpaceX Knocks Boeing from Dominant Role in NASA's Moon Mission

Long dismissed in adult health, the thymus may be critical for longevity

You're probably overpaying for everything you buy online

Show HN: Groq Emulator

Forked Garry Tan's gstack and adapted for Google's Antigravity and Gemini-CLI

I Spoke to AI Agent Claude – Sen Bernie Sanders

ShouldIBuildThat finds app opportunities that appear across multiple signals

Building a UI Framework [pdf]

IdeaClaw – one sentence, get a camera-ready paper, BP, DD reports, health report

What's in a name? – The unknown faces of history

Making an Argument for (Voluntary) Online Identity Verification

To Catholic thinkers, Pentagon's AI demands violate 'human dignity'

I built a database scoring what separates high-scoring pitch decks from the rest

House speaker, Intel chiefs make new push to renew surveillance law

Replacing Anki: what I learned building a language app (1k users, $21 MRR)

Agent-rendered: the pattern that replaces runtime infra with build-time AI

Vulnerabilities in OpenClaw: A Complete Enterprise Security Analysis

Minecraft Source Code Is Interesting

AI Pentester

Update iOS to protect your iPhone from web attacks

New "PolyShell" flaw allows unauthenticated RCE on Magento e-stores

Generalized Dot-Product Attention: Tackling Real-World Challenges in GPU Kernels

The Day I Discovered Type Design

Anthropic's "Claude for Open Source" program still charged $200

Walmart wins patents to give algorithms more sway over prices

Bombarding gamblers with offers greatly increases betting and gambling harm

Open Source Gave Me Everything Until I Had Nothing Left to Give

The Stochastic Parrot Argument Considered Harmful

Black Cube: Israeli spy firm crashes Slovenia's election

Randevu: Deterministic Schelling Points for Decentralized Temporal Coordination [pdf]

SpaceX Knocks Boeing from Dominant Role in NASA's Moon Mission

Long dismissed in adult health, the thymus may be critical for longevity

You're probably overpaying for everything you buy online

Show HN: Groq Emulator

Forked Garry Tan's gstack and adapted for Google's Antigravity and Gemini-CLI

I Spoke to AI Agent Claude – Sen Bernie Sanders

ShouldIBuildThat finds app opportunities that appear across multiple signals

Building a UI Framework [pdf]

IdeaClaw – one sentence, get a camera-ready paper, BP, DD reports, health report

What's in a name? – The unknown faces of history

Making an Argument for (Voluntary) Online Identity Verification

To Catholic thinkers, Pentagon's AI demands violate 'human dignity'

I built a database scoring what separates high-scoring pitch decks from the rest

House speaker, Intel chiefs make new push to renew surveillance law

Replacing Anki: what I learned building a language app (1k users, $21 MRR)

Agent-rendered: the pattern that replaces runtime infra with build-time AI

Vulnerabilities in OpenClaw: A Complete Enterprise Security Analysis

Minecraft Source Code Is Interesting

AI Pentester

Update iOS to protect your iPhone from web attacks

New "PolyShell" flaw allows unauthenticated RCE on Magento e-stores

Generalized Dot-Product Attention: Tackling Real-World Challenges in GPU Kernels

Ask HN: Are we ready for vulnerabilities to be words instead of code?

Comments