samfundev•9h ago
Glad to see that they brought in humans to validate and patch vulnerabilities, though I really wish they had linked to the actual patches. Here's what I could find:
https://cgit.ghostscript.com/cgi-bin/cgit.cgi/ghostpdl.git/c...
https://github.com/OpenSC/OpenSC/pull/3554
https://github.com/dloebl/cgif/pull/84
shoo•4m ago
Yeah, having a layer of human experts to sanity-check findings and weed out hallucinated false positives seems like an important part of this process:
> To ensure that Claude hadn’t hallucinated bugs (i.e., invented problems that don’t exist, a problem that increasingly is placing an undue burden on open source developers), we validated every bug extensively before reporting it. [...] for our initial round of findings, our own security researchers validated each vulnerability and wrote patches by hand. As the volume of findings grew, we brought in external (human) security researchers to help with validation and patch development.
Based on the experiences shared by curl's maintainers over the last year, I'd suggest the "growing risk of LLM-discovered [security issues]" is primarily maintainers being buried under a deluge of low-effort, LLM-hallucinated false-positive security reports whose submitters copy-paste LLM output without validation.
tznoer•1h ago
Grepping for strcat() is at the "forefront of cybersecurity"? The other one, which applied a GitHub comment to a different location, doesn't look too difficult either.
Everything that comes out of Anthropic is just noise, but their marketing team is unparalleled.
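For context, the kind of pattern a strcat() grep turns up looks roughly like this (a hypothetical sketch, not code from any of the linked patches), along with the usual bounded fix:

    /* Hypothetical example of the bug class, not taken from the linked patches. */
    #include <stdio.h>
    #include <string.h>

    /* Unbounded: overflows 'out' whenever dir + "/" + name exceeds its size. */
    void build_path_bad(char *out, const char *dir, const char *name) {
        strcpy(out, dir);
        strcat(out, "/");
        strcat(out, name);
    }

    /* Bounded: snprintf truncates instead of writing past the end of 'out'. */
    void build_path_safer(char *out, size_t outsz, const char *dir, const char *name) {
        snprintf(out, outsz, "%s/%s", dir, name);
    }

    int main(void) {
        char buf[16];
        build_path_safer(buf, sizeof buf, "/very/long/directory", "name.gif");
        puts(buf);   /* truncated, but no out-of-bounds write */
        return 0;
    }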
octoberfranklin•16m ago
This reads like an advertisement for Anthropic, not a technical article.
cyanydeez•7m ago
Is there a Polymarket on the first billion-dollar AI company to go to $0 through its own insecure model deployment?