"As of last week, this issue is now closed, and every single hash and HMAC algorithm exposed by default in Python is now provided by HACL*, the verified cryptographic library."
it's describing coverage.
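From the calling code's point of view nothing changes; here's a minimal sketch of the stdlib hashlib/hmac calls that, per the quoted post, are now backed by HACL*-generated C (standard library only, no new APIs):

```python
import hashlib
import hmac

# Ordinary stdlib usage; the backing C implementation is what changed,
# not the Python-facing API.
digest = hashlib.sha3_256(b"hello world").hexdigest()

key = b"super-secret-key"
tag = hmac.new(key, b"message to authenticate", digestmod=hashlib.sha256).hexdigest()

# Constant-time comparison when verifying a received tag.
received_tag = tag
assert hmac.compare_digest(tag, received_tag)

print(digest, tag)
```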
Tangent: generally I’m more inclined to believe quality is improved when someone claims 1000s of lines reduced rather than 1000s of lines written.
But the problem may come from the headline, which is somewhat clickbaity. HN forbids changing it, and then part of the discussion focuses on the literal content of the headline, which is, as you rightly hint, not the best summary of what's interesting here.
In mathematics, the proofs tend to be resilient to minor errors and omissions, because the important part is that the ideas are correct.
In applied cryptography, errors and omissions are foundations for exploits.
Verifying that those 15,000 lines do what they do doesn't give me much more confidence than thorough unit testing would.
If this proof is formal, then it isn't comparable to unit testing. That is why writing formal proofs is such a PITA: you actually have to specify every little detail, or it doesn't work at all.
> Verifying that those 15,000 lines do what they do doesn't give me much more confidence than thorough unit testing would.
It actually does. Programs written in statically typed languages (with reasonably strong type systems) empirically have fewer errors than ones written in dynamically typed languages. Formal verification as done by F* is like static typing on (a lot of) steroids. And HACL* has unit tests, among other things.
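As a loose analogy (my sketch, not from the thread): a static checker such as mypy rejects the bad call below before the program ever runs, while a purely dynamic approach only fails when that line actually executes; formal verification extends the same idea all the way to full functional specifications.

```python
import hashlib

def digest_length(algorithm: str) -> int:
    """Return the digest size in bytes for a named hash algorithm."""
    return hashlib.new(algorithm).digest_size

print(digest_length("sha3_256"))  # fine: 32

# A static checker flags the next call at analysis time because 256 is an
# int, not a str. Without the checker you only learn about the mistake when
# this line is actually executed.
try:
    digest_length(256)  # wrong on purpose
except (TypeError, ValueError):
    print("caught only at runtime without a static checker")
```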
But let's at least say that it's an abused, exploited, perverted and molested measurement before we call it a poor one. It's not all that bad when you have context.
> All the code in this repository is released under an Apache 2.0 license. The generated C code from HACL* is also released under an MIT license.
You have an unusual definition of "depends on Microsoft". Anyone worried about depending on Microsoft should be able to maintain 15k lines of C that are already formally verified. Python already vendored the code so who cares who wrote that code?
Yup but this doesn't scale anywhere near as well for the attackers.
And these are just the attempts we know about. What makes you think that they haven't succeeded and we just don't know about it?
It's also very likely that even if they can attack crypto in ways we don't know about, they can at best reduce the time it takes to crack a given key. Choosing extra-long keys and changing them frequently should protect you from lots of attacks.
Why do you say that? Very often the vulnerabilities are not in the mathematics but in the implementation.
Even if Signal works as intended, as was pointed out during the recent scandal, it only protects messages in transit. The application on your phone is not especially secure, and anyone who can access your phone can access your Signal messages (and thus any of your correspondents' messages that they share with you).
> that's a pretty strong lead that it's very hard if not impossible to crack signal, at least for the Chinese.
The NSA does not recommend Signal for classified communication.
The NSA thinks use of Signal is in the best interest of the US government, as the NSA perceives those interests (every institution has its own bias). It could be that Signal is the least insecure of the easily available options, or that the NSA believes that only they can crack Signal.
I recommend this talk: https://www.youtube.com/watch?v=v8Pma5Bdvoo It gives a good example of how practical attacks and the intentional weakening of crypto work.
And especially for common algorithms like AES and RSA, you can easily compare the outputs of different implementations; if one is different, that one is suspect. And especially for open-source crypto like the OP's, the implementation can easily be verified.
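A quick sanity check in that spirit: compare the stdlib digest (HACL*-backed in recent CPython) against a published FIPS 180 test vector and against an independent implementation. The pyca/cryptography import here is an assumption on my part (a third-party package wrapping OpenSSL):

```python
import hashlib

# Classic FIPS 180 test vector: SHA-256("abc")
KNOWN_SHA256_ABC = "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"

stdlib_digest = hashlib.sha256(b"abc").hexdigest()
assert stdlib_digest == KNOWN_SHA256_ABC

# Cross-check against an independent implementation (OpenSSL via pyca/cryptography);
# requires `pip install cryptography`.
from cryptography.hazmat.primitives import hashes

h = hashes.Hash(hashes.SHA256())
h.update(b"abc")
openssl_digest = h.finalize().hex()
assert openssl_digest == stdlib_digest

print("both implementations agree:", stdlib_digest)
```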
Device compromise is outside the threat model for Signal or any software for that matter.
> Why do you say that? Very often the vulnerabilities are not in the mathematics but in the implementation.
This is why we use primitives that are well understood and have straightforward implementations. I'm no expert, but you can look at the design of, say, Salsa20 or Curve25519: they were designed for straightforward constant-time implementation. Yes, the NIST suite has implementation subtleties such as constant-time implementations, but those ward off issues that are of pretty low concern / infeasible to attack (I've yet to see a timing channel or cache side channel exploited in the wild, except for the Xbox 360 MAC verification). Also, the CFRG has been publishing guidance, so we know how to properly implement KEMs, (hybrid) encryption, PAKEs, hash-to-curve, signatures, ECDH, ECDSA, etc. Compound this with a lot of footgun-free crypto libraries such as Go crypto, libsodium, Monocypher, BoringSSL, Tink, etc., and these days you'd be hard pressed to make a misstep cryptography-wise.
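To make the "footgun-free" point concrete, here is a minimal sketch using PyNaCl, the libsodium binding (a third-party dependency, chosen only as an illustration): nonce handling, authentication, and constant-time verification are all handled by the library.

```python
import nacl.secret
import nacl.utils

# Key generation: the library tells you the right size; no cipher, mode, or
# padding choices to get wrong.
key = nacl.utils.random(nacl.secret.SecretBox.KEY_SIZE)
box = nacl.secret.SecretBox(key)

# encrypt() picks a random nonce and appends an authentication tag for you.
ciphertext = box.encrypt(b"attack at dawn")

# decrypt() verifies the tag (raising on tampering) before returning plaintext.
assert box.decrypt(ciphertext) == b"attack at dawn"
```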
In my opinion, the NSA's advantage is not that it has a virtually unlimited budget; it's that they have a better vantage point for carrying out multi-user attacks. So do the likes of FAANG, Fastly, Cloudflare, etc.
I agree about Signal - that's what they say, iirc. Some software does take it into account. The point here is that security depends on much more than the mathematics of cryptography; Signal is just an example.
Cryptographic libraries are not always documented as well as they should be. In the case of something like Python, it is not always easy to map library documentation to the resultant Python.
Yes, afaik they also have a history of not revealing exploits they discover. With a budget in the tens of billions and tens of thousands of employees, they have the resources to discover quite a bit.
I have observed two effects.
They are constantly trying to make it illegal for the common man to use cryptography. Constantly. Cryptography is subversive. It has the power to defeat judges, armies, nations. Authorities don't want normal people to have access to this. It will increasingly become a tool of corporations and governments.
Authorities are bypassing cryptography instead by attacking the end points. It's easier to crack other parts of the system and exfiltrate data after it's been decrypted than to break the cryptography. Easier to hack a phone and get the stored messages than to intercept end-to-end encrypted ciphertext and decrypt it.
Here's the (recently closed!) issue for pyca/cryptography regarding this feature, although I can't find an equivalent issue for the python stdlib: https://github.com/pyca/cryptography/issues/9185
Edit: Found a relevant issue, "closed as not planned": https://github.com/python/cpython/issues/82198
I worry a little when I read that things are verified and that it was hard.
And during the process they discovered that the original library did not handle allocation failures?
What is the big deal about Python here? It's just another wrapped C library.
It would be cool if Ruby did something similar for stdlib/openssl, but it could also be done in a gem.
I suppose you mean "just another top-notch library available in the Python ecosystem"? :)
This isn’t a language war.
One should be able to trust the proofs of correctness and never look at (or maintain) the generated C directly.
> Verified Low-Level Programming Embedded in F∗
> We present Low∗, a language for low-level programming and verification, and its application to high-assurance optimized cryptographic libraries. Low∗ is a shallow embedding of a small, sequential, well-behaved subset of C in F∗, a dependently-typed variant of ML aimed at program verification. Departing from ML, Low∗ does not involve any garbage collection or implicit heap allocation; instead, it has a structured memory model à la CompCert, and it provides the control required for writing efficient low-level security-critical code.
> By virtue of typing, any Low∗ program is memory safe. In addition, the programmer can make full use of the verification power of F∗ to write high-level specifications and verify the functional correctness of Low∗ code using a combination of SMT automation and sophisticated manual proofs. At extraction time, specifications and proofs are erased, and the remaining code enjoys a predictable translation to C. We prove that this translation preserves semantics and side-channel resistance.
After some digging, it looks like the answer is 3.14 [0], so we won't be seeing this until October [1].
One could argue that this is a security fix (just read the first sentence of the blog post) and should be included in all the currently supported versions (>=3.9) of Python [2].
[0] https://github.com/python/cpython/blob/main/Doc/whatsnew/3.1...
So, even if this comes out in Python 3.14, any non-trivial project will have to wait till Oct 2026 (or Oct 2027) to be able to use it.
So what we get instead is a mini Python 2/3 situation on every single release.
Even a patch version upgrade from 3.12.3 to 3.12.4 broke a lot of packages.
Sometimes I wish there was a GitHub with an entry exam. "A library you use has a bug, and you find a 3-month-old bug report for your exact issue. Do you 1) add a comment with "me too", 2) express your frustration that this issue hasn't been fixed yet, 3) demand the maintainers fix it as soon as possible because it's causing issues for you, or 4) do nothing?"
Only half joking.
Also, you can upvote issues as well / leave reactions without leaving comments. It ensures a better signal:noise ratio in the actual discussion.
There's some bozo asking "any news? I cant downgrade because another lib requirement" just two days after the maintainer wrote several paragraphs explaining how difficult it is to make it work with Python 3.13. This adds no value for anyone and is just noise. Anyone interested in actual useful information (workarounds, pointers on how to help) has to wade through a firehose of useless nonsense to get at anything remotely useful. Any serious discussion the maintainers want to have is constantly interrupted by the seagulls from Finding Nemo: "fix? fix? fix? fix? fix?"
Never mind the demanding nature of some people in that thread.
Just upvote. That's why this entire feature was added.
They will not move to ~~mocking up~~ sketching a wireframe of something.
https://github.blog/news-insights/product-news/add-reactions...
However, the barrier to entry for a lot of issue reporting is pretty high, requiring triplicate documentation of everything. It's like an overreaction to the sort of people who need to be told to unplug something for 30 seconds to continue the support call.
And that's if they even "allow" issues.
I was posting in issues fairly frequently 1-3 years ago. I'm sure I'd be sheepish about some of those posts.
5) Do 1, 2, and 3
6) Submit a tested pull request with a fix
Truth is, developers don't want to end up maintaining other people's code anyway. It's gotten to the point where I roll my eyes whenever I see people saying "patches welcome". Chances are they're not as welcome as you would think. "Show us the code", and when you do... nothing. Submitting a pull request is no guarantee that they'll accept it. Hell, it's no guarantee they'll even give you the time of day. You can put in a whole lot of effort and still end up getting refused or even straight up ignored.
Even if it's an unambiguous improvement on the status quo. They'll commit code that literally returns "not implemented" directly to master and actually ship it out to users. Your pull request will certainly be held to much higher standards though. And that's if they engage with you in an attempt to get your patches merged. They might just decide to ignore your patch and redo all the work themselves, which is exactly what all the complainers wanted them to do in the first place! If they're really nice they'll even credit you in the commit message!
If you're gonna put in effort into someone else's project, just fork it straight up and make it your own project. Don't waste time with upstream. Don't wait for their blessing. Just start making your own version better for you. Don't even submit it back, they can pull it themselves if they want to.
> Nobody ever said no to genuine high quality bug reports that were coupled with good fixes.
Uhuh.
> ask how receptive they are
Doesn't really help much. It's entirely possible for people to tell you it's fine then change their minds after you actually do it and send the code in. Seriously hope you never have to experience anything of the sort.
Always fork the project and customize it for yourself. Don't argue with others, don't try to convince them, make your version work better for you. It doesn't matter if they want it or not. You don't even have to publish it.
I do agree with the "fork and make it better for you", but I also think it's common courtesy to throw up issues for the bugs or missing features you encounter, as well as pull requests. If for no other reason than as "prototypes" or "quick fixes" that others can cherry-pick into their own forks. They may get rejected, they may get closed, but you still don't have to sit there arguing about it since you have your own version. From a slightly Kantian angle, you have your working version and you've now fulfilled your social duty by offering solutions to problems. You've got no need to campaign for the solutions if they get rejected.
It's virtuous and you get to ignore the GitHub flame wars. There's really no downside beyond taking 5 minutes to be nice and at least put your solution out there. It also fulfills your reciprocal legal duty under the LGPL and such.
That way, people with the same issue who come into the issue thread through a search aren't reduced to combing through all the forks to see if one happens to fix the issue. Then instead of spamming up the thread with "me too, this is soo frustrating, fix asap pls", they can spam it up with "merge this fork into the main project asap pls" :-)
I'm using the license exactly as intended. Upstream developers literally don't matter. Free software is not about developers, it's about users.
Free software licenses say the user gets the source code. They're about empowering the user, in this case me, with the ability use and modify the software as I see fit. If I customize something for my personal use, then I'm the only user of the fork and license terms are fulfilled. People would only get to demand source code from me if I started distributing compiled executables to other users.
> You've got no need to campaign for the solutions if they get rejected.
I used to feel that need. Caused me significant anxiety. I thought upstreaming patches was the Right Thing to do. Mercifully I've been cured of this misconception.
> There's really no downside beyond taking 5 minutes to be nice and at least put your solution out there.
I have no problem with that. My point is getting involved with upstream is often more trouble than it's worth. Let them do their thing. Just pull from them and rebase your changes. Enjoy life.
People should think twice before trying to cram their unsolicited code down other people's throats. Even when they ask for the code, chances are deep down they don't actually want to deal with it.
I once tried to retrofit a library system into GNU bash of all things. Let's just say it didn't go well.
Also, the same ideas applied to dating rather than to code are what we call "incels/redpillers" and so on, so you might want to calm down a bit I think.
Relevant to whom? Everything I've done is extremely relevant to me. I wouldn't have done those things if I didn't care about them.
> Also, the same ideas applied to dating rather than to code are what we call "incels/redpillers" and so on
I'd say it's more like MGTOW.
I've certainly made people angry when I rejected their buggy "improvements".
They haven't.
I see your point though:
> it could be that they weren't as good as you think they were
Who knows. Maybe. Asked myself that question many times. Asked others.
It doesn't matter anymore. Now I only wonder if they are good enough for me.
This seems to be the issue https://github.com/explosion/spaCy/issues/13658#issuecomment...
And you depend on opinionated libraries that break with newer versions. Why? Well, because f you, that's why! Because our library is not just a tool, it's a lifestyle.
Though it seems that Pydantic 1.x does support 3.13: https://docs.pydantic.dev/1.10/changelog/#v11020-2025-01-07
To be honest I know so many people who use Pydantic and so many people who seem to get stuck because of Pydantic 2. I’m glad I have minimal exposure to that lib, personally.
I suppose the biggest issue really is that type-annotation usage by libraries is intractable.
I dislike using something like docker or conda as I don't want to install/use a separate distro just to be able to use a different version of Python. My setup is working well so far.
spaCy should make Cython optional
hard fork Cython to not use stringized annotations
stay on Python 3.12 forever and then skip to 3.15
It is like having a crowd of people trying to outdo each other on how much self-harm they can induce.
See: Lemmings[0]
[0] https://en.wikipedia.org/wiki/Lemmings_(National_Lampoon)
In no world is this an “own goal”. God forbid they take on a big task for the betterment of the future language.
God forbid.
Backward incompatible changes are an own goal because they either (depending on your view) make the software worse, or make it better and then make those improvements unavailable to users.
How?
> As evidenced by the recent SHA3 buffer overflow, cryptographic primitives are tricky to implement correctly. There might be issues with memory management, exceeding lengths, incorrect buffer management, or worse, incorrect implementations in corner cases.
This is a proactive fix for zero days that may be lurking in the wild.
The fact that you often won't find out until runtime makes it a real pain.
It's a mess.
(Then the hardware would have to be validated as well, if the ISA is complex.)
The point is, cryptography should be mostly if not all assembly. Look at the dav1d AV1 decoder to get an idea.
Provable correctness is important in many different application contexts, especially cryptography. The benefit for Python (and for any other user of the library in question) is fewer bugs and thus greater security assurance.
If you look at the original bug that spawned this work (https://github.com/python/cpython/issues/99108), the poster reported:
"""As evidenced by the recent SHA3 buffer overflow, cryptographic primitives are tricky to implement correctly. There might be issues with memory management, exceeding lengths, incorrect buffer management, or worse, incorrect implementations in corner cases.
The HACL* project https://github.com/hacl-star/hacl-star provides verified implementations of cryptographic primitives. These implementations are mathematically shown to be:
https://github.com/python/cpython/issues/99108#issuecomment-...
(You could also make the argument that if previous implementations that were replaced were insecure that their relative performance is ultimately irrelevant, but I'm ignoring that since it hinges on a known unknown.)
Bear in mind that performance is tricky -- see my comment about NEON: https://github.com/python/cpython/pull/119316#issuecomment-2...
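If you want to gauge the impact on your own hardware, a rough micro-benchmark along these lines works (my sketch; buffer size and round count are arbitrary):

```python
import hashlib
import time

def throughput(name: str, size: int = 1 << 20, rounds: int = 200) -> float:
    """Return MB/s for hashing `rounds` buffers of `size` bytes."""
    buf = b"\x00" * size
    h = hashlib.new(name)
    start = time.perf_counter()
    for _ in range(rounds):
        h.update(buf)
    elapsed = time.perf_counter() - start
    return size * rounds / elapsed / 1e6

for algo in ("sha256", "sha3_256", "blake2b"):
    print(f"{algo}: {throughput(algo):.1f} MB/s")
```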
Also, pyca/cryptography uses OpenSSL. OpenSSL is fine, but has the same "problem" as the previous python stdlib implementation. (Personally I think it's an acceptable risk. If anything, swapping in 15,000 lines of new code is the greater risk, even if it's "verified")