Anthropic's open-source framework for AI-powered vulnerability discovery

https://github.com/anthropics/defending-code-reference-harness
114•binyu•1h ago

Comments

lanyard-textile•1h ago
>This repo is not maintained and is not accepting contributions.

Hm :)

spacebacon•59m ago
This one is and should be adapted to every frozen model ASAP.

https://github.com/space-bacon/SRT

Significantly improve every frozen model overnight. LFG.

Hamuko•28m ago
Why isn't Claude maintaining it?
skeledrew•10m ago
They're pretty much saying the efficacy of the tool can be tested by anyone to determine if it's worth purchasing the more polished and up-to-date commercial offering.
trilogic•1h ago
https://github.com/Mainframework/Anthropic-Cybersecurity-Ski...

Be aware: the .py/s will not pass the antivirus but basically they do the job.

bigmattystyles•1h ago
I wonder how this sort of product is going over at Coverity and others like it. Proper SAST vendors I mean. Is it an existential threat?
rms2ds•9m ago
If I had to guess, they'll eventually just add it into their own product and hike the prices up to cover tokens lol.
simonw•1h ago
I wonder how much this thing costs to run.

https://github.com/anthropics/defending-code-reference-harne... says:

> As a rough guideline, expect ~10K uncached input tokens/min and ~2K output tokens/min per agent. You can scale parallelism up to your account's ITPM limit (roughly 10 agents per 100K ITPM).

My guess would be hundreds of dollars with Opus and thousands of dollars with Mythos.
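To make that guess checkable, here is a minimal back-of-the-envelope sketch built only on the README figures quoted above (~10K uncached input tokens/min and ~2K output tokens/min per agent, roughly 10 agents per 100K ITPM). The function names and the per-million-token prices passed in are hypothetical placeholders, not real pricing, and caching is ignored; plug in your own numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRates:
    # Per-agent throughput taken from the README quote above.
    input_tokens_per_min: int = 10_000
    output_tokens_per_min: int = 2_000


def max_agents(itpm_limit: int, rates: AgentRates = AgentRates()) -> int:
    """Agents that fit under an input-tokens-per-minute (ITPM) limit."""
    return itpm_limit // rates.input_tokens_per_min


def estimate_cost_usd(
    agents: int,
    minutes: float,
    usd_per_million_input: float,   # assumption: supply your own price
    usd_per_million_output: float,  # assumption: supply your own price
    rates: AgentRates = AgentRates(),
) -> float:
    """Rough cost of running `agents` in parallel for `minutes`."""
    input_tokens = agents * minutes * rates.input_tokens_per_min
    output_tokens = agents * minutes * rates.output_tokens_per_min
    return (
        input_tokens / 1_000_000 * usd_per_million_input
        + output_tokens / 1_000_000 * usd_per_million_output
    )


if __name__ == "__main__":
    agents = max_agents(itpm_limit=100_000)
    # Placeholder prices purely for illustration.
    cost = estimate_cost_usd(agents, minutes=60,
                             usd_per_million_input=10.0,
                             usd_per_million_output=50.0)
    print(f"{agents} agents for 1 hour: about ${cost:.2f}")
```

The cost scales linearly with agents, minutes, and price, so the total depends mostly on how long a scan of a given codebase takes, which the README quote does not say.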

Analemma_•1h ago
I mean, you don't need to run it all the time, right? You do it once over your entire existing codebase to start and then once over the diff in your CI/CD pipeline when you make a new change. I'm sure it's not literally that simple but I doubt these need to churn 24/7/365 either.
xerxes249•1h ago
In the Mythos blog post they revealed that they run the model something like 1,000 times on the same codebase, maybe with a slightly different prompt or temperature each time. That suggests it will just be pay-to-win: if the 'attacker' spends more money/tokens than the 'defender', the defender will eventually be outclassed.
jazz9k•1h ago
Companies don't make production pushes yearly. For many, it's two-week sprints, and that's one project.

This doesn't make any sense cost-wise. It would be cheaper to just hire a security engineer.

vb-8448•55m ago
richardbarosky•1h ago
To be sure, security is an amazing AI/LLM use case. A huge swath of the work is pattern matching known security issues against stuff that's very precise to analyze -- programming language text.

Something that stands out is that for the strongest use cases, AI companies will prefer to sell the technique as a service rather than its raw output. For use cases where the output is less valuable, tokens are sold. If AI tokens were so magical in creating new value in developing software applications generally, they wouldn't be selling tokens directly. They'd hoard the tokens and use them to dominate SaaS software in any industry they want.

The same way as someone selling an expensive course in the stock market is signaling that they have more to gain by selling the course rather than taking their knowledge and making money in the stock market directly.

dgellow•52m ago
> The same way as someone selling an expensive course in the stock market is signaling that they have more to gain by selling the course rather than

Or they want to diversify

> If AI tokens were so magical in creating new value in developing software applications generally, they wouldn't be selling tokens directly.

That requires building and selling a whole product they have little experience with, competing with their own customers. Not a great place for an AI vendor still trying to establish itself. It’s a lot of distraction when you already have a lot to deal with in the existing business. And strategically not too valuable.

skybrian•46m ago
Maybe, but an alternative argument is that building an ecosystem is more valuable in the long run.

We started out with many companies forbidding their employees to use remote LLMs on their source code because of security concerns. Now many companies are starting to believe that they must analyze all their source code with remote LLMs because of security concerns. When trusting Anthropic becomes normalized, that means they can sell more services that require access to the source code.

tptacek•43m ago
The thing about things like this is that they're shop jigs. You can buy a crosscut sled if you really want to, but most woodworkers just make their own.

It was a different situation 2 years ago, when there was significant cost to building your own harness (but then: you probably weren't doing AI vuln research 2 years ago). Today, I think your best bet is to look at something like this for ideas, and then just ask for your own, to fit your own work style, with your own interface, your own notion of target and effort specification, and your own alerting.

redfloatplane•9m ago
"Shop jigs" is a great way to put it. I think a lot of software has gone from being made for general use to extremely individualised use. Before the Age of AI, it took so much human effort to write something that solved your problem that you might often go the extra mile so that others could re-use it. Now, it takes almost no effort, so the software stays ungeneralised. Some of the incentive has changed, I think. Most of the time I no longer share the things I've been building[0] because, for one thing they simply couldn't possibly have any benefit for others, and if they need something like it, they can build exactly the thing they want instead of having to extend or modify my thing. Like a jig!

0: https://redfloatplane.lol/blog/17-why-share/ (and related posts, I guess)

bartoszcki•30m ago
> Anthropic engineers on average ship 8x as much code per quarter

Are they making 8x more features or the same amount just with more code?

extr•27m ago
Interesting it's in python!
zoobab•19m ago
Open source crap to connect to an LLM blob.
zoobab•18m ago
'open source' crap to connect to their LLM blob.
wslh•8m ago
Looking forward to trying this tomorrow (it's late here). Has anyone run it on a real codebase yet? Curious about setup friction, cost, and signal/noise.
You are supposed to run it on the full codebase before any single PR gets merged.
nikcub•1h ago
It's becoming apparent that it requires more tokens to secure code than it does to write it

May even be an order of magnitude more

Mtinie•56m ago
In all seriousness, wasn’t that always the case? Writing bad code is relatively cheap.

Ensuring code isn’t bad is the expensive part.

bflesch•45m ago
It's weird because why can't they train the AI to simply output secure code?

The basic security flaws with regards to input validation and overflows should never ever be output by an AI. For "security flaws due to bad design" I'll cut them slack until AGI is achieved.

simonw•23m ago
> It's weird because why can't they train the AI to simply output secure code?

The most interesting security bugs have causes that are spread across large codebases, or networks of dependencies.

Training the AI to "output secure code" won't work if it doesn't also have access to the source code of every dependency that it's using... and even then, given current model speeds and prices most developers won't want to wait for an hour on every edit they make while the LLM reasons through all of the dependencies.

tptacek•41m ago
For now, maybe, yes? But the most important targets of this kind of work aren't AI outputs; it's legacy code, particularly (but not exclusively) old memory-unsafe code. In those situations the figure of merit isn't the token cost of recreating the target code; it's the cost of finding the same bugs with humans or preexisting tools.

Those costs can be extremely high.

binyu•7m ago
Claude workflows in ultra code mode work in a very similar fashion and consume a moderate amount of the session usage, depending on the complexity of the task. With the API it would probably get expensive quickly, though.
energy123•44m ago
They can only do that if they're a monopoly, which they're not
DrewADesign•39m ago
> They can only do that if they're a monopoly, which they're not

Why do you say that? I reckon lots and lots of companies that aren’t monopolies sell software. Having competition, even stiff competition, isn’t anathema to running a business.

energy123•32m ago
You said "They wouldn't be selling tokens directly ... They'd hoard them"

But they can't do that because they aren't monopolies.

Kiro•30m ago
> They'd hoard the tokens and use them to dominate SaaS software in any industry they want.

I don't understand this argument. I've run and sold a semi-successful SaaS. The exhausting and frustrating parts are all the things an LLM cannot help you with. Coding the product is not the bottleneck or what grants you success.

richardbarosky•20m ago
> Coding the product is not the bottleneck or what grants you success.

Agree, and I think that's the core of my point.

Not that it's irrational or doesn't make sense to sell tokens for purposes of software dev, but that if tokens were a true game changer for success in software dev, they wouldn't be leading with token sales, the same way they're not leading with token sales for security stuff -- looks like it's all about Claude Security(TM).

zuzululu•12m ago
Good point, but I do think LLMs help with those frustrating parts while not being able to outright solve them.
Melatonic•21m ago
Surprised we haven't gotten an integrated "MetaSploit" AI update where it calls and messages a ton of people in a company and, once it starts to find someone possibly vulnerable, lets a human red teamer take over or guide it more by hand.
hyperpape•13m ago
> If AI tokens were so magical in creating new value in developing software applications generally, they wouldn't be selling tokens directly. They'd hoard the tokens and use them to dominate SaaS software in any industry they want.

This doesn't follow at all. Anthropic's revenue is growing 10x year over year selling tokens. Their tokens can be super magical, let them enter established industries and displace incumbents, and get 100% annual growth in those industries, and they would still be better off prioritizing selling tokens, because it's a great business.

What your argument shows is that there are limits. Their tokens are not quite powerful enough to make infinite money instantly in every area of software. Admittedly, that does seem true.
