There seems to be an ongoing trend (and it matches my gut feeling) of companies moving from chatbots to AI agents that can actually execute actions: calling APIs, modifying databases, making purchases, etc. I'm curious: if you're running these in production, how are you handling the security layer beyond prompt injection defenses?
Questions:
- What stops your agent from executing unintended actions (deleting records, unauthorized transactions)?
- Have you actually encountered a situation where an agent went rogue and you lost money or data?
- Are current tools (IAM policies, approval workflows, monitoring) enough, or is there a gap? (The kind of gating I have in mind is sketched right after this list.)
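
To make that last bullet concrete, here's a rough Python sketch of the sort of gate I mean. All the names (tool names, `SAFE_ACTIONS`, `human_approves`) are hypothetical and not tied to any real framework: read-only actions pass through an allowlist, anything destructive has to clear an explicit human approval step, and everything else is blocked.

```python
from dataclasses import dataclass
from typing import Callable

# Actions the agent may run without a human in the loop (hypothetical names).
SAFE_ACTIONS = {"search_orders", "read_record"}

# Actions that always require an approval step before execution.
REQUIRES_APPROVAL = {"delete_record", "issue_refund"}


@dataclass
class ToolCall:
    name: str
    args: dict


def human_approves(call: ToolCall) -> bool:
    """Stand-in for a real approval workflow (Slack ping, ticket queue, etc.)."""
    answer = input(f"Approve {call.name}({call.args})? [y/N] ")
    return answer.strip().lower() == "y"


def execute(call: ToolCall, tools: dict[str, Callable[..., object]]) -> object:
    """Policy gate sitting between the agent's chosen action and the real tool."""
    if call.name in SAFE_ACTIONS:
        return tools[call.name](**call.args)
    if call.name in REQUIRES_APPROVAL and human_approves(call):
        return tools[call.name](**call.args)
    raise PermissionError(f"Blocked agent action: {call.name}")


if __name__ == "__main__":
    # Toy tool implementations just to make the gate runnable.
    tools = {
        "read_record": lambda record_id: {"id": record_id, "status": "ok"},
        "delete_record": lambda record_id: f"deleted {record_id}",
    }
    print(execute(ToolCall("read_record", {"record_id": 42}), tools))
    print(execute(ToolCall("delete_record", {"record_id": 42}), tools))
```

Is this roughly what people are doing in production, or are real setups leaning more on IAM scoping and after-the-fact monitoring than on per-call approval?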
Trying to figure out if this is a real problem worth solving or if existing approaches are working fine.