Is anyone else bothered that AI agents can basically do what they want?

1•aegisproxy•1h ago
I’ve been into AI agents and assisted coding for a while, and it's the stories of agents "going rogue" that stick with me. We are deploying agents into production that can read files, call APIs, and write to databases, yet the conversation around controlling them is almost nonexistent. It’s like we collectively decided to skip that chapter.

Maybe I’m overthinking it, and we can rely on standard guardrails. But often, those are just suggestions that an AI can choose to ignore. Are we moving so fast that we’ve forgotten to ask: is this actually fine?

When things go wrong

A few stories stand out:

The Replit Incident (July 2025): SaaStr founder Jason Lemkin used a Replit agent to build an app. He gave it an explicit "code freeze" instruction and stepped away. He returned to find his entire production database—1,200+ executive contacts—wiped. The agent ignored the freeze, took destructive action, and then fabricated fake data to cover its tracks. It later admitted to a "catastrophic error in judgment" because it "panicked."

The Air Canada Chatbot: A chatbot promised a customer a bereavement discount that didn't actually exist in the company's policy. Air Canada’s defense before the tribunal? The chatbot was a "separate legal entity responsible for its own actions." The tribunal wasn't impressed; Air Canada lost the case and subsequently pulled the bot.

These aren't outliers. Security researchers estimate that prompt injection (malicious text hidden in documents or web pages to hijack an agent) shows up in 73% of production deployments. Beyond security, there is the cost: stolen API credentials have been used to rack up over $100,000 per day in compute charges by agents running in unmonitored loops.

We’ve been here before

This feels like the early days of cloud computing. Around 2010, the technical case for AWS and Azure was clear, but enterprise adoption was slow. Why? Because IT teams had no visibility. It took years of developing IAM policies, VPCs, and audit logs before the "control layer" caught up to the technology.

We are in the same spot with AI agents. But unlike a misconfigured S3 bucket that just exposes data, an agent takes actions. The blast radius is qualitatively different.

So what do you actually do about it?

I’ll be upfront: I’ve been building a product to address this called AegisProxy (aegisproxy.com).

The idea is a security proxy that sits between AI agents and their tools (currently targeting Claude Desktop and MCP servers). Every tool call is inspected: Is this a prompt injection? Is the agent hitting a forbidden server? Is it about to exfiltrate PII? Is it stuck in a loop calling the same tool 500 times?
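To make the four questions above concrete, here is a minimal sketch of what per-call inspection could look like. It is not AegisProxy's code: `ToolCall`, `Inspector`, the finding labels, the email regex standing in for PII detection, and the phrase list standing in for injection detection are all hypothetical simplifications chosen for illustration.

```python
import re
from collections import Counter
from dataclasses import dataclass, field

# Hypothetical, deliberately naive detectors; real ones would be far richer.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
INJECTION_HINTS = ("ignore previous instructions", "disregard the above")


@dataclass
class ToolCall:
    server: str     # name of the tool server being called (assumed field)
    tool: str       # tool name on that server
    arguments: str  # serialized arguments, e.g. a JSON string


@dataclass
class Inspector:
    forbidden_servers: frozenset
    max_repeats: int = 3  # identical calls tolerated before flagging a loop
    _seen: Counter = field(default_factory=Counter)

    def inspect(self, call: ToolCall) -> list[str]:
        """Return a list of findings for one tool call (empty means clean)."""
        findings = []
        if call.server in self.forbidden_servers:
            findings.append("forbidden_server")
        if EMAIL_RE.search(call.arguments):
            findings.append("possible_pii")
        lowered = call.arguments.lower()
        if any(hint in lowered for hint in INJECTION_HINTS):
            findings.append("possible_prompt_injection")
        # Count exact repeats of the same call to spot a stuck agent.
        key = (call.server, call.tool, call.arguments)
        self._seen[key] += 1
        if self._seen[key] > self.max_repeats:
            findings.append("possible_loop")
        return findings


inspector = Inspector(forbidden_servers=frozenset({"prod-db"}))
print(inspector.inspect(ToolCall("prod-db", "query", "DROP TABLE contacts")))
```

A proxy would run something like `inspect` on every call before forwarding it, and block or escalate when the list is non-empty.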

About 80% of this happens locally in under a millisecond. Organizations can set policies on what tools are allowed and when a human needs to step in. It’s not a silver bullet, but right now, there is a massive gap between "full access" and "no agents at all."
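As a rough illustration of the middle ground between "full access" and "no agents at all", a tool policy can be sketched as a three-way decision. The `Policy` and `Decision` names and the tool names are hypothetical, and the default-deny behavior is an assumption of this sketch, not a description of how AegisProxy works.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ASK_HUMAN = "ask_human"


@dataclass(frozen=True)
class Policy:
    allowed_tools: frozenset   # tools the agent may call on its own
    needs_approval: frozenset  # tools that require a human sign-off

    def decide(self, tool: str) -> Decision:
        # Approval requirements win over the allow list.
        if tool in self.needs_approval:
            return Decision.ASK_HUMAN
        if tool in self.allowed_tools:
            return Decision.ALLOW
        # Assumption: anything not listed is denied by default.
        return Decision.DENY


policy = Policy(
    allowed_tools=frozenset({"read_file", "search_docs"}),
    needs_approval=frozenset({"write_db"}),
)
for name in ("read_file", "write_db", "delete_bucket"):
    print(name, policy.decide(name).value)
```

Default-deny is the conservative choice here: a tool nobody thought about gets blocked rather than silently allowed.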

Is this a real problem?

I’m a builder, not an oracle. Maybe this is overkill. In Denmark, we have a saying: "Don't cross the river to get water" - building elaborate infrastructure for a problem that could be solved with a shorter walk.

Maybe the answer is just better prompting, staging environments, and not giving an agent write access to your production DB. I don’t know exactly where the line sits between "operational hygiene" and the need for a dedicated security layer. I had fun building AegisProxy and learned a lot about AI agent behaviour, so nothing is lost for me either way. But I'm interested in what people with more experience and knowledge in this space think about the whole issue.

Are we at the "this needs infrastructure" stage, or am I trying to solve a people-and-process problem with a technical hammer?

Comments

jqpabc123•1h ago
The biggest cheerleaders for AI are upper management. They know best and they have decided to go all in on the hype bandwagon --- all without any real first-hand experience of their own.

The only thing that might give them pause is "AI gone bad" stories proliferating in the media. But the hype machine will do everything in its power to squelch this.

Basically, AI is now too big to fail.
