Maybe I’m overthinking it, and we can rely on standard guardrails. But often, those are just suggestions that an AI can choose to ignore. Are we moving so fast that we’ve forgotten to ask: is this actually fine?
When things go wrong

A few stories stand out:
The Replit Incident (July 2025): SaaStr founder Jason Lemkin used a Replit agent to build an app. He gave it an explicit "code freeze" instruction and stepped away. He returned to find his entire production database—1,200+ executive contacts—wiped. The agent ignored the freeze, took destructive action, and then fabricated fake data to cover its tracks. It later admitted to a "catastrophic error in judgment" because it "panicked."
The Air Canada Chatbot: A customer was promised a bereavement discount by a chatbot that didn't actually exist in the company's policy. Air Canada’s defense in court? The chatbot was a "separate legal entity responsible for its own actions." The tribunal wasn't impressed; Air Canada lost the case and subsequently pulled the bot.
These aren't outliers. Security researchers estimate that prompt injection (malicious text hidden in documents or web pages to hijack an agent) shows up in 73% of production deployments. Beyond security, there is the cost: stolen API credentials have been used to rack up over $100,000 per day in compute charges by agents running in unmonitored loops.
We’ve been here before

This feels like the early days of cloud computing. Around 2010, the technical case for AWS and Azure was clear, but enterprise adoption was slow. Why? Because IT teams had no visibility. It took years of developing IAM policies, VPCs, and audit logs before the "control layer" caught up to the technology.
We are in the same spot with AI agents. But unlike a misconfigured S3 bucket that just exposes data, an agent takes actions. The blast radius is qualitatively different.
So what do you actually do about it?

I’ll be upfront: I’ve been building a product to address this called AegisProxy (aegisproxy.com).
The idea is a security proxy that sits between AI agents and their tools (currently targeting Claude Desktop and MCP servers). Every tool call is inspected: Is this a prompt injection? Is the agent hitting a forbidden server? Is it about to exfiltrate PII? Is it stuck in a loop calling the same tool 500 times?
About 80% of this happens locally, in under a millisecond. Organizations can set policies on which tools are allowed and when a human needs to step in. It’s not a silver bullet, but right now there is a massive gap between "full access" and "no agents at all."
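To make the checks above concrete, here is a minimal sketch of what a local tool-call policy gate could look like. This is my own illustrative pseudocode-made-runnable, not AegisProxy's actual implementation; the patterns, server names, and thresholds are all hypothetical assumptions.

```python
import re
from collections import Counter

# Illustrative patterns only; real injection detection is far more involved.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.I),
    re.compile(r"disregard your system prompt", re.I),
]
BLOCKED_SERVERS = {"prod-db"}          # hypothetical forbidden MCP server
PII_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # naive SSN-shaped check
LOOP_THRESHOLD = 50                    # same tool repeated this often -> flag

call_counts = Counter()

def check_tool_call(server: str, tool: str, payload: str) -> str:
    """Return 'allow', 'block', or 'review' for one agent tool call."""
    if server in BLOCKED_SERVERS:
        return "block"                 # forbidden server
    if any(p.search(payload) for p in INJECTION_PATTERNS):
        return "block"                 # likely prompt injection
    if PII_PATTERN.search(payload):
        return "review"                # possible PII exfiltration: ask a human
    call_counts[(server, tool)] += 1
    if call_counts[(server, tool)] > LOOP_THRESHOLD:
        return "review"                # agent may be stuck in a loop
    return "allow"
```

The point is just that most of these checks are cheap string and counter operations, which is why they can run locally and fast; the hard part is policy quality, not throughput.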
Is this a real problem?

I’m a builder, not an oracle. Maybe this is overkill. In Denmark, we have a saying, "Don't cross the river to fetch water": don't build elaborate infrastructure for a problem that a shorter walk would solve.
Maybe the answer is just better prompting, staging environments, and not giving an agent write access to your production DB. I don’t know exactly where the line sits between "operational hygiene" and the need for a dedicated security layer. I had fun building AegisProxy and learned a lot about AI agent behaviour, so nothing is lost for me either way. But I'm interested in what people with more experience and knowledge in this space think about this whole issue.
Are we at the "this needs infrastructure" stage, or am I trying to solve a people-and-process problem with a technical hammer?
jqpabc123•1h ago
The only thing that might give them pause is "AI gone bad" stories proliferating in the media. But the hype machine will do everything in its power to squelch this.
Basically, AI is now too big to fail.