We have been developing our own system (1) for several years, and it's built by engineers, not Claude. Take a look — maybe it could be helpful for your case.
No one knows you also caused that problem.
Fascinating! Our team has been blending static code analysis and AI for a while and think it's a clever approach for the security use case the Anthropic team's targeting here.
I don't yet have access to Claude Code Security, but I think that line of reasoning misses the point. Maybe even the real benefit.
Just like architectural thinking is still important when developing software with AI, creative security assessments will probably always be a key component of security evaluation.
But you don't need highly paid security engineers to tell you that you forgot to sanitize input, or you're using a vulnerable component, or to identify any of the myriad issues we currently use "dumb" scanners for.
My hope is that tools like this can help automate away the "busywork" of security. We'll see how well it really works.
The impact question is really around scale; a few weeks ago Anthropic claimed 500 "high-severity" vulnerabilities discovered by Opus 4.6 (https://red.anthropic.com/2026/zero-days/). There's been some skepticism about whether they are truly high severity, but it's a much larger number than what BigSleep found (~20) and Aardvark hasn't released public numbers.
As someone who founded a company in the space (Semgrep), I really appreciated that the DARPA AIxCC competition required players using LLMs for vulnerability discovery to disclose $cost/vuln and the confusion matrix of false positives along with it. It's clear that LLMs are super valuable for vulnerability discovery, but without that information it's difficult to know which foundation model is really leading.
What we've found is that giving LLM security agents access to good tools (Semgrep, CodeQL, etc.) makes them significantly better esp. when it comes to false positives. We think the future is more "virtual security engineer" agents using tools with humans acting as the appsec manager. Would be very interested to hear from other people on HN who have been trying this approach!
100% agree - I spun out an internal tool I've been using to close the loop with website audits (more focus on website sec + perf + seo etc. rather than appsec) in agents and the results so far have been remarkable:
Human written rules with an agent step that dynamically updates config to squash false positives (with human verification) and find new issues in a loop.
upghost•1h ago
Padme: You're scanning for vulnerabilities so you can fix them, Anakin?
Anakin: ...
Padme: You're scanning for vulnerabilities so you can FIX THEM, right, Annie?
czbond•1h ago
I hope Anthropic can place alerts for their team to look for accounts with abnormal usage pre-emptively.
tptacek•1h ago
czbond•1h ago
This is different than someones "npm audit" suggesting issues with packages in a build and updating to new revisions. Also different than iterating deeply on source code for a project (eg: nginx web server).
tptacek•1h ago
ukuina•11m ago
nikcub•2m ago