I built Cipher to fix that. It's an AI agent that reasons like an attacker — maps the target, finds vulnerabilities, chains them into exploits, and proves they're real. Every finding ships with a reproducible Python script. If the script doesn't break your system, we don't report it.
How it works: Cipher defines security invariants ("User A can't access User B's data"), then multiple agents attack in parallel to try to violate them. A separate judge agent attempts to disprove every finding; if the judge can't reproduce the exploit three times, the finding is discarded. You never see it.
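To make the judge step concrete, here's a minimal sketch of that reproduction gate. This is illustrative toy code, not Cipher's actual implementation; the names (`Finding`, `judge`, `REQUIRED_REPRODUCTIONS`) are assumptions of mine:

```python
# Toy sketch of a judge loop: a finding survives only if its
# reproduction script succeeds on every one of N independent runs.
from dataclasses import dataclass
from typing import Callable

REQUIRED_REPRODUCTIONS = 3  # assumed constant, mirroring "3 times" above

@dataclass
class Finding:
    invariant: str                  # e.g. "User A can't access User B's data"
    reproduce: Callable[[], bool]   # exploit script: True = invariant violated

def judge(finding: Finding) -> bool:
    """Accept a finding only if the exploit reproduces every time."""
    return all(finding.reproduce() for _ in range(REQUIRED_REPRODUCTIONS))

# A flaky "exploit" that only works sometimes never reaches the report.
flaky_runs = iter([True, False, True])
flaky = Finding("User A can't access User B's data",
                lambda: next(flaky_runs))
assert judge(flaky) is False
```

The design point is that the burden of proof sits on the exploit, not the scanner: anything nondeterministic or environment-dependent gets filtered before it reaches the report.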
$999 per assessment. Results in ~2 hours. Unlimited retesting.
Honest limitations: complex multi-step auth flows (e.g., SSO with MFA) still require manual setup, such as providing JWT credentials. We're working on it.
I'll run Cipher free for the first 15 HN readers who want to try it. Drop your email or sign up at https://apxlabs.ai/. Happy to answer any questions about the approach.
tonetegeatinst•1h ago
I'm currently studying security in college, and most of my time is spent working on a good system card and premade prompts for certain situations, like using Nmap or Burp Suite.
gauravbsinghal•1h ago
What matters more than the model:
1. Architecture over prompts. Cipher isn't one agent with a great prompt — it's multiple agents with distinct roles (recon, attack, verification) that coordinate. The "judge" agent that tries to disprove findings is more important than the attacker agent.
2. Tool use over reasoning. The model doesn't "know" how to pentest — it reasons about what tool to use next based on what it's learned so far. We give it real tools (not simulated ones) and let it chain them.
3. Invariant-based testing over checklist-based. Instead of "try SQLi on every input," Cipher defines security properties ("User A can't access User B's data") and tries to violate them. This catches logic bugs that no scanner finds.
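A tiny example of what point 3 means in practice. This is a toy I wrote to illustrate the idea, not Cipher's code: the target API, the bug, and all the names are invented:

```python
# Invariant-based testing: state a security property, then search for
# a counterexample, instead of running a fixed checklist of payloads.

# A deliberately buggy toy API: it never checks who is asking (an IDOR).
DB = {"alice": {"ssn": "111"}, "bob": {"ssn": "222"}}

def get_record(caller: str, target: str):
    return DB.get(target)           # BUG: ignores `caller` entirely

def invariant_holds(caller: str, target: str) -> bool:
    """Property: cross-user reads must return nothing."""
    if caller == target:
        return True
    return get_record(caller, target) is None

# The "attack" is a search for inputs that falsify the invariant.
violations = [(a, b) for a in DB for b in DB if not invariant_holds(a, b)]
assert violations  # each violation is a concrete, replayable exploit
```

Notice there's no SQLi payload list anywhere — the property itself tells you what a finding looks like, which is why this style catches authorization logic bugs that signature-based scanners miss.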
Since you're studying security — the best thing you can do is get really good at manual pentesting first. Understanding why an attack chain works is what lets you build agents that reason about it. The prompts matter less than the mental model you encode into the system's architecture.
Happy to chat more — feel free to DM or join our Discord.