history suggests otherwise
> The fact that 12 previously unknown vulnerabilities could still be found there, including issues dating back to 1998, suggests that manual review faces significant limits, even in mature, heavily audited codebases.
no, the code is simply beyond horrible to read, not to mention diabolically bad
if you've never tried it, have a go, but bring plenty of eyebleach
"I don't quite know how to put this, but our entire field is bad at what we do, and if you rely on us everyone will die"
"They say they've fixed it with something called <del>blockchain</del> AI"
"Bury it in the desert. Wear gloves"
If someone meant to engineer a codebase to hide subtle bugs which might be remotely exploitable, leak state, behave unexpectedly at runtime, or all of the above, the code would look like this.
https://youtu.be/WFMYeMNCcSY&t=1024
Teaser: "It's like throw a rock, you're gonna hit something... I pointed people in the wrong direction, and they still found a bug".
I wonder who could possibly be incentivized to make the cryptography package used by most of the world's computers and communications networks full of subtly exploitable, hard-to-find bugs. Surely everyone would want such a key piece of technology to be airtight and easy to debug.
But also: surely a technology developed in a highly adversarial environment would be easy to maintain and keep understandable. You definitely would have no reason to play whack-a-mole with random issues as they arise.
https://cryptography.io/en/latest/statements/state-of-openss...
Recently discussed: https://news.ycombinator.com/item?id=46624352
> Finally, taking an OpenSSL public API and attempting to trace the implementation to see how it is implemented has become an exercise in self-flagellation. Being able to read the source to understand how something works is important both as part of self-improvement in software engineering, but also because as sophisticated consumers there are inevitably things about how an implementation works that aren’t documented, and reading the source gives you ground truth. The number of indirect calls, optional paths, #ifdef, and other obstacles to comprehension is astounding. We cannot overstate the extent to which just reading the OpenSSL source code has become miserable — in a way that both wasn’t true previously, and isn’t true in LibreSSL, BoringSSL, or AWS-LC.
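To see what that looks like, here's a hedged sketch (invented names, not actual OpenSSL code) of the dispatch style the quote describes, where the call you read is rarely the code that runs:

    /* Hedged sketch -- invented names, not actual OpenSSL code. */
    typedef struct method_table {
        int (*do_thing)(void *ctx);               /* resolved at runtime */
    } method_table;

    static int do_thing_sw(void *ctx) { (void)ctx; return 1; }

    #ifdef USE_ACCEL
    static int cpu_has_accel(void) { return 1; }  /* stand-in feature probe */
    static int do_thing_hw(void *ctx) { (void)ctx; return 2; }
    #endif

    static const method_table *get_table(void) {
    #ifdef USE_ACCEL
        static const method_table hw = { do_thing_hw };
        if (cpu_has_accel())                      /* optional path */
            return &hw;
    #endif
        static const method_table sw = { do_thing_sw };
        return &sw;
    }

    int public_api_thing(void *ctx) {
        /* indirect call: grep alone can't tell you what actually runs */
        return get_table()->do_thing(ctx);
    }

Now stack several layers of this and you get the self-flagellation being described.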
Also,
> OpenSSL’s CI is exceptionally flaky, and the OpenSSL project has grown to tolerate this flakiness, which masks serious bugs. OpenSSL 3.0.4 contained a critical buffer overflow in the RSA implementation on AVX-512-capable CPUs. This bug was actually caught by CI — but because the crash only occurred when the CI runner happened to have an AVX-512 CPU (not all did), the failures were apparently dismissed as flakiness. Three years later, the project still merges code with failing tests: the day we prepared our conference slides, five of ten recent commits had failing CI checks, and the day before we delivered the talk, every single commit had failing cross-compilation builds.
Even bugs caught by CI get ignored and end up in releases.
What’s eye-bleachy about this beyond regular C/C++?
For context, I’m fluent in C#/JavaScript/Ruby and generally understand structs and pointers, although I’m not confident writing performant code with them.
Part of OpenSSL's incomprehensibility is that it is plain C, not C++, so there are no constructors or destructors to manage object lifetimes. Because allocation and initialization aren't built into the language, the code is filled with BLAH_grunk_new and QVQ_hurrr_init, and the "new" and "init" semantics vary between modules because it's all ad hoc. Sometimes callees even deallocate their arguments.
The only reason it needs module prefixes like BLAH and QVQ and DERP is, again, that it is not C++ and lacks namespaces. To readers this is just visual noise. Sometimes two modules export a function with the same base name and a compatible signature, so it's possible to accidentally call the wrong one.
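A minimal sketch of that pattern, with invented names (these are not real OpenSSL APIs):

    #include <stdlib.h>

    typedef struct { int state; } BLAH_CTX;
    typedef struct { int state; } QVQ_CTX;

    /* Module 1: "_new" allocates AND initializes. */
    BLAH_CTX *BLAH_grunk_new(void) {
        BLAH_CTX *ctx = malloc(sizeof(*ctx));
        if (ctx != NULL)
            ctx->state = 0;
        return ctx;
    }

    /* Module 2: caller allocates, "_init" only initializes -- same
       concept, different convention, and here 1 means success. */
    int QVQ_hurrr_init(QVQ_CTX *ctx) {
        ctx->state = 0;
        return 1;
    }

    /* And some callees free their argument, so ownership rules have
       to be learned per function. */
    int QVQ_hurrr_consume(QVQ_CTX *ctx) {
        free(ctx);              /* caller's pointer now dangles */
        return 0;
    }

Multiply those per-module conventions across hundreds of modules and the reading burden adds up.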
Would be interesting to see if any of the bugs found here also exist there.
The methodology for developing and maintaining codebases like OpenSSL has changed!
> no, the code is simply beyond horrible to read, not to mention diabolically bad
OpenSSL? Parts of it definitely are, yes. It's better since they re-styled it. The old SSLeay code was truly, truly awful.
As for all the slop the Curl team has been putting up with, I suppose a fool with a tool is still a fool.
It's also leading people to submit hallucinations as security vulns in open source. I've had to deal with some of them.
I suspect this year we are going to see a _lot_ more of this.
While it's good these bugs are being found and closed, the problem is twofold:

1) It takes time to get the patches through distribution.

2) The vast majority of projects are not well equipped to handle complex security bugs in a "reasonable" time frame.
2 is a killer. There's so much abandonware out there, either as full apps/servers or libraries. These can't ever really be patched. Previously they weren't really worth spending effort on - an exploit might only have a few thousand targets of questionable value.
Now you can spin up potentially thousands of exploits against thousands of long tail services. In aggregate this is millions of targets.
And even if the abandonware case didn't exist, it's going to be difficult to patch systems quickly enough. Imagine an adversary that can drip-feed zero-days against targets.

Not really sure how this can be solved. I guess you'd hope the good guys can push some sort of mega-patch across software quicker than the bad actors can exploit it.

But really, as the npm debacle showed, the industry is not in a good place when it comes to timely, secure software delivery, even without millions of potential new zero-days flying around.
Let’s see them do this on projects with a better historical track record.
Without humans, AI does nothing. Currently, at least.
I checked the stack overflow that was marked High, and Fil-C prevents that one.
One of the out-of-bounds writes is also definitely prevented.
It's not clear if Fil-C protects you against all of the others (Fil-C won't prevent denial of service, and that's what some of these are; Fil-C also won't help you if you accidentally didn't encrypt something, which is what another one of these bugs is about).
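For anyone who hasn't seen Fil-C: it's a memory-safe implementation of C, so an unchecked write like this hypothetical sketch (not the actual OpenSSL bug) traps at runtime instead of silently corrupting the stack:

    #include <string.h>

    void parse_field(const unsigned char *in, size_t len) {
        unsigned char buf[16];
        /* BUG: no bounds check. A conventional compiler will happily
           smash the stack when len > 16; under Fil-C the program
           halts at the violation instead of corrupting memory. */
        memcpy(buf, in, len);
    }

Which is also why it can't help with the denial-of-service cases: the program still stops, and logic bugs like forgetting to encrypt aren't memory errors at all.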
The one about forgetting to encrypt some bytes is marked Low severity because it's an API they say you're unlikely to use. Seems kinda believable, but also... terrifying? What if someone is calling the AES-NI codepath directly for reasons?
Here's the data about that one:
"Issue summary: When using the low-level OCB API directly with AES-NI or other hardware-accelerated code paths, inputs whose length is not a multiple of 16 bytes can leave the final partial block unencrypted and unauthenticated.
Impact summary: The trailing 1-15 bytes of a message may be exposed in cleartext on encryption and are not covered by the authentication tag, allowing an attacker to read or tamper with those bytes without detection."
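To make the failure mode concrete, here's a hedged sketch of the bug class (invented function names, not the actual OpenSSL OCB code):

    #include <stddef.h>
    #include <string.h>

    /* Stand-ins for the real cipher routines. */
    static void accel_encrypt_blocks(const unsigned char *in,
                                     unsigned char *out, size_t nblocks) {
        memcpy(out, in, nblocks * 16);  /* placeholder for AES-NI path */
    }
    static void encrypt_partial(const unsigned char *in,
                                unsigned char *out, size_t len) {
        memcpy(out, in, len);           /* placeholder for tail handling */
    }

    void ocb_encrypt_sketch(const unsigned char *in,
                            unsigned char *out, size_t len) {
        size_t full = len / 16, tail = len % 16;

        accel_encrypt_blocks(in, out, full);

        /* The advisory's bug is equivalent to skipping this branch on
           the accelerated path: the trailing 1-15 bytes stay in
           cleartext and never enter the authentication tag. */
        if (tail != 0)
            encrypt_partial(in + full * 16, out + full * 16, tail);
    }

Nothing misbehaves for block-aligned inputs, which is one way a bug like this can hide for years.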
Better software is out there.
I like to recommend that project because it has a very transparent approach to vulnerabilities, and in my opinion it's written a lot more sanely than OpenSSL, which mostly avoids standard C library features because it implements everything from scratch, the way a kernel does.

But yeah, anyway, WolfSSL comes from the embedded world, in case that's your thing.
It doesn't look like they had one AI run for 20 minutes and then 30 humans sifting through the results for weeks.
It does, though, look like they were running their AI over the codebase for an extended period of time (not per run, but multiple runs over the course of a year).
> Does it matter?
Hell yes, false reports are the bane of the bug bounty industry.
I think it would be helpful, from a research point of view, to know what sort of noise their AI tool is generating. But because they appear to be trying to sell the service, they don't want you to know how many dev-months you'll lose chasing issues that amount to nothing.
OpenSSL: Stack buffer overflow in CMS AuthEnvelopedData parsing
I was part of a body that funded work to include some things in the code. The way you take something like X509 and incorporate a new ASN.1 structure, to be validated against conformance requirements (so not just signing blindly over the bitstream, but understanding the ASN.1 and validating that it has certain properties about what it says, like not overlapping assertions of numeric ranges encoded in it), is to invoke callouts from deep down in the parser, perform tasks, and then return state. You basically have to do about a five-layer-deep callout and return. It's a massive wedding cake of dependency on itself; it personifies the xkcd diagram of "...depends on <small thing>" risks.
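Roughly the shape being described, as a hypothetical sketch (invented names, far simpler than the real X509/ASN.1 code):

    typedef struct validation_state {
        int ok;
        long range_lo, range_hi;    /* last numeric range accepted */
    } validation_state;

    /* The leaf callout, invoked from deep inside the parser: reject a
       range assertion that overlaps one already seen. */
    static int check_numeric_range(validation_state *st, long lo, long hi) {
        if (lo <= st->range_hi && hi >= st->range_lo) {
            st->ok = 0;
            return 0;
        }
        st->range_lo = lo;
        st->range_hi = hi;
        return 1;
    }

    /* Each parsing layer forwards to the one below and threads the
       state object back up -- the real thing is ~5 layers deep. */
    static int parse_extension(validation_state *st)   { return check_numeric_range(st, 10, 20); }
    static int parse_tbs_cert(validation_state *st)    { return parse_extension(st); }
    static int parse_certificate(validation_state *st) { return parse_tbs_cert(st); }

    int validate_x509_sketch(void) {
        validation_state st = { 1, 1, 5 };
        return parse_certificate(&st) && st.ok;
    }

Every layer has to thread that state through faithfully, and a mistake at any depth surfaces as a validation bug far from where you'd think to look.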
I'm not surprised people continue to find flaws. I would like to understand whether this approach also found flaws in e.g. libsodium or other more modern cryptography, in the OpenBSD-maintained LibreSSL code, or in Peter Gutmann's code.
OpenSSL is a large target.
> This doesn't mean that AI can replace human expertise. The OpenSSL maintainers' deep knowledge of the codebase was essential for validating findings and developing robust fixes. But it does change the SLA of security. When autonomous discovery is paired with responsible disclosure, it collapses the time-to-remediation for the entire ecosystem.
More evidence that "coding elegance" is irrelevant to a product's success, which bodes well for AI-generated code.
This sounds like a great approach. Kudos!