For well-behaving/stable/consistent setups I fully agree though, go.mod is both sufficient and better, and those other cases can probably just key off both instead. I think I've seen go.mod change without go.sum changes (change an unused transitive dependency into a direct dependency), which can lead to your build needing something that wasn't cached because it was pruned in the previous version.
Which it sounds like they have (planned), which seems like a good improvement.
iirc (I do not have a test setup at the moment to verify) it does affect your dependency resolution, and therefore your build, though its code does not exist in your binary. I know this because those AWS / K8S / Google Cloud libraries cause MASSIVE problems with their constant breaking changes without major version changes, and their importing other libraries that also have frequent breaking changes without major version changes, even if the dependency those unused subpackages include (and therefore raise the minimum version) is only used by some other module that needs a lower version (iirc). It's quite a headache sometimes, and could be rather easily solved if you could set upper bounds and not just lower. Or if those giga-projects would stop doing such obviously bad things.
The version-affecting behavior is kinda unavoidable afaict. If they didn't include those unused version constraints, `go build ...` or just importing a new package within existing modules could cause your build to fail, forcing you to rerun version resolution. That'd probably just lead people to feel like "go is broken UGH", and go leans awfully hard towards avoiding that kind of thing. Mostly for the better, but not quite always / not in all ways.
For example, take the S3 library, github.com/aws/aws-sdk-go-v2/service/s3. If you have an s3.Client and you look at e.g. the ListObjectsV2 method, you might have no idea that there is a ListObjectsV2Paginator which makes it much easier to use, because nowhere in the method docs is it mentioned. Indeed, most operations that paginate have more ergonomic paginators, but none of them tell you this.
But that isn't even the worst of it. Say you want to download or upload a file to S3. If you haven't worked with AWS for other languages, you might think that you just do GetObject and PutObject. And yes, for small files, that's generally fine. But for large files you want to use resumable downloads and multipart uploads. So you look and lo, there is no simple way to do this in the AWS SDK for Go. But actually, there is! It's in a totally unrelated and unlinked package, called github.com/aws/aws-sdk-go-v2/feature/s3/manager.
Now you're getting some religion, so you ask "what are the other so-called 'feature' packages?" and you try to browse pkg.go.dev at the github.com/aws/aws-sdk-go-v2/feature level but nope, that's not a Go module so there's nothing to see there. In fact, you can't even browse the other s3 features, never mind find out what other services have features. Fortunately, you can browse their GitHub repo at least: https://github.com/aws/aws-sdk-go-v2/tree/main/feature
It's quite clear that they use poorly thought-out cross-language codegen for this, which partly explains the lack of ergonomics, but also shows that they don't much care whether you use their stuff properly.
AWS seems to optimise for an SDK that can be completely generated but not for an SDK that tells you what you want to know.
Maybe that was a transitory stage and it’s straightened out now. I certainly hope so.
Not to say it is a bad project. Not at all.
I'd personally consider myself comfortable in C/C++. I've built Wayland compositors, H264 backends for live-streaming, and built Chromium occasionally for testing. Despite being a die-hard Firefox (zen) user - I still have not been able to compile Firefox! To be fair, this was pre-Firefox quantum days so I hope their SVM and build tools have improved.
Go modules relies too heavily on dependencies being good citizens, which is a very naive approach to dependency management.
[1] Before you ask, I'm not reading the full diff on something like x/sys. Mostly on third-party dependencies where I find it harder to judge the reliability of the maintainers.
https://github.com/FiloSottile/mostly-harmless/tree/main/dep...
The example.com/mod2 go.mod does not in fact affect version resolution, because it's not even fetched. However, it affects the example.com/mod1 go.mod, and the example.com/mod1 go.mod affects version resolution.
This doesn't help with the problem you are describing, but it still has value from a security point of view, because example.com/mod2 truly doesn't matter except to the extent that was already checked into example.com/mod1, which you do need to trust.
If you try to "go build" or "go test" something in example.com/mod2, you actually do get an error since Go 1.17, as if it was not in your dependency tree at all. You need to "go get" it like any new dependency.
No, it does not. Minimum version selection means that the libraries will at least be that version, but it could be substituted for a later version if a transitive dependency asks for such.
That I'm reading this blog post at all suggests there is a "market" for a single checksum/version manifest, which data is currently housed in go.sum. This is sad, but, Hyrum's Law and all that.
My understanding is that the point of a lockfile is that you don't need to do that.
No?
All dependencies - direct and indirect - are listed in your go.mod. Your module - as is - depends on nothing else. And those exact versions will be used to build it, if yours is the main module.
If your module is used as a dependency of another module, then yes, your module may be built with a newer version of those dependencies. But that version will be listed in that module's go.mod.
There's no way to use different versions without them being listed in some go.mod.
go.sum only maps between versions and hashes, and may contain hashes for multiple versions of modules.
If you wanted to verify the contents of a dependency, you would want to check go.sum. That's what it is there for, after all. So if you wanted to fetch the dependencies, then you would want to use it to verify hashes.
If all you care about is the versions of dependencies, you really can (and should) trust go.mod alone. You can do this because there are multiple overlapping mechanisms that all ensure that a tag is immutable once it is used:
- The Go CLI tools will of course use the go.sum file to validate that a given published version of a module can never change (at least since it was depended on, but it also is complementary with the features below as well, so it can be even better than that.)
- Let's say you rm the go.sum file. That's OK. They also default to using the Go Sum DB to verify that a given published version of a module can never change. So if a module has ever been `go get`'d by a client with the Sum DB enabled and it's publicly accessible, then it should be added to the Sum DB, and future changes to tags will cause it to be rejected.
- And even then, the module proxy is used by default too, so as soon as a published version is used by anyone, it will wind up in the proxy as long as it's under a suitable license. Which means that even if you go and overwrite a tag, almost nobody will ever actually see this.
The downside is obviously all of this centralized infrastructure that is depended on, but I think it winds up being the best tradeoff; none of it is a hard dependency, even for the "dependencies should be immutable" aspect thanks to go.sum files. Instead it mostly helps dependency resolution remain fast and reproducible. Most language ecosystems have a hard dependency on centralized infrastructure, whether it is a centralized package manager service like NPM or a centralized repository on GitHub, whereas the centralized infrastructure with Go is strictly complementary and you can even use alternative instances if you want.
But digression aside, because of that, you can trust the version numbers in go.mod.
You're right, but also TFA says "There is truly no use case for ever parsing it outside of cmd/go". Since cmd/go verifies the contents of your dependencies, the point generally stands. If you don't trust cmd/go to verify a dependency, then you have a valid exception to the rule.
I know the current attitude is to just blindly trust 3rd party libraries (current and all future versions) and all of their dependencies, but I just can't accept that. This is just unsustainable.
I guess I'm old or something.
Only that if the corresponding version does get used, and the hash doesn't match, you get an error. But you can have multiple versions of the same dep in your go.sum - or none at all - and this has no bearing on what version gets picked when you build your module.
The version that does get picked is the one in go.mod of the main module, period; go.sum, if it exists, assists hash verification.
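To make the "multiple versions in go.sum" point concrete, here is a minimal sketch that groups the lines of a go.sum-style file by module. The module path `example.com/x` and the `h1:` values are made-up placeholders, and `sumVersions` is a hypothetical helper, not anything cmd/go exposes. It is only meant to show that go.sum is a set of (version, hash) records and says nothing about which version is selected.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// exampleSum mimics the shape of go.sum lines: "<module> <version>[/go.mod] <hash>".
// The hashes are placeholders, not real checksums.
const exampleSum = `
example.com/x v1.0.0 h1:AAAA=
example.com/x v1.0.0/go.mod h1:BBBB=
example.com/x v1.0.1 h1:CCCC=
example.com/x v1.0.1/go.mod h1:DDDD=
`

// sumVersions returns, per module path, the distinct versions that have at
// least one hash line, in the order they first appear.
func sumVersions(goSum string) map[string][]string {
	out := map[string][]string{}
	seen := map[string]bool{}
	sc := bufio.NewScanner(strings.NewReader(goSum))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 3 {
			continue // skip blank or malformed lines
		}
		path := fields[0]
		version := strings.TrimSuffix(fields[1], "/go.mod")
		key := path + "@" + version
		if !seen[key] {
			seen[key] = true
			out[path] = append(out[path], version)
		}
	}
	return out
}

func main() {
	for path, versions := range sumVersions(exampleSum) {
		// Two versions are listed for the same module; nothing here says
		// which one a build would use. That information lives in go.mod.
		fmt.Println(path, versions)
	}
}
```

Both v1.0.0 and v1.0.1 show up for the same module, which is why reading go.sum as a list of "the versions in use" gives the wrong answer.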
Yes, if you want a lockfile in the npm sense, you need both.
But a Go module does not get built with new transitive dependencies (as was claimed) unless they're listed in some go.mod; go.sum is irrelevant for that.
It doesn't happen only later at build time.
For example:
- `go get x@v1.0.0` => Your go.mod contains `x v1.0.0`
- `go get y@v1.0.0` with y having x v1.0.1 as dep => Your go.mod is already updated with the resolved minimum selected version: `x v1.0.1`
This requires using Go commands to manage the go.mod file. If you edit it in a text editor then a final `go mod tidy` will help.
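The x/y example above can be sketched as a toy version of minimum version selection: walk every requirement reachable from the main module and keep the highest version requested for each path. This is a simplified illustration, not cmd/go's implementation (it ignores graph pruning, replace/exclude directives, and pre-release versions), and the names `modVer`, `buildList`, `mainModule`, and `reqs` are invented for the sketch.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// modVer is one node in the requirement graph: a module path at a version.
type modVer struct{ Path, Version string }

// parse turns "v1.2.3" into numeric parts. Toy parser: no pre-release handling.
func parse(v string) [3]int {
	var out [3]int
	for i, p := range strings.SplitN(strings.TrimPrefix(v, "v"), ".", 3) {
		out[i], _ = strconv.Atoi(p)
	}
	return out
}

// less reports whether version a is older than version b.
func less(a, b string) bool {
	pa, pb := parse(a), parse(b)
	for i := range pa {
		if pa[i] != pb[i] {
			return pa[i] < pb[i]
		}
	}
	return false
}

// Hypothetical requirement graph matching the example in the thread.
var (
	mainModule = modVer{"example.com/main", ""}
	reqs       = map[modVer][]modVer{
		mainModule:      {{"x", "v1.0.0"}, {"y", "v1.0.0"}},
		{"y", "v1.0.0"}: {{"x", "v1.0.1"}},
		{"x", "v1.0.2"}: {}, // published, but nobody requires it
	}
)

// buildList visits everything reachable from root and keeps, per module
// path, the highest version that any visited go.mod asks for.
func buildList(root modVer, reqs map[modVer][]modVer) map[string]string {
	selected := map[string]string{}
	seen := map[modVer]bool{}
	var visit func(m modVer)
	visit = func(m modVer) {
		if seen[m] {
			return
		}
		seen[m] = true
		if cur, ok := selected[m.Path]; !ok || less(cur, m.Version) {
			selected[m.Path] = m.Version
		}
		for _, r := range reqs[m] {
			visit(r)
		}
	}
	for _, r := range reqs[root] {
		visit(r)
	}
	return selected
}

func main() {
	list := buildList(mainModule, reqs)
	paths := make([]string, 0, len(list))
	for p := range list {
		paths = append(paths, p)
	}
	sort.Strings(paths)
	for _, p := range paths {
		fmt.Println(p, list[p])
	}
}
```

With this graph the sketch prints `x v1.0.1` and `y v1.0.0`: x is raised to the minimum that y asks for, and the newer v1.0.2 is never considered because no go.mod in the graph requires it.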
You run that when you've made manual changes (to go.mod or to your Go code), or when you want to slim down your go.sum to the bare minimum needed for the current go.mod.
And that's one common way to update a dependency: you can edit your go.mod manually. But there are also commands to update dependencies one by one.
Which means if you wanted to update one version, it might bump up the requirements on its dependencies, and that's all the changes you see from running go mod tidy afterwards.
Manually constructing an inconsistent dependency graph will not work.
As explained in the post, if a transitive dependency asks for a later version than you have in go.mod, that’s an error if -mod is readonly (the default for non-get non-tidy commands).
I encourage you to experiment with it!
This is exactly how the “stricter” commands of other package managers work with lockfiles.
If go.sum has "no observable effect on builds", you don't know what you're building and go can download and run unverified code.
I'm not a go developer and must be misunderstanding something...
I think it's coz not EVERY language's lockfile comes with checksum
So, Go's go.mod is functionally equivalent to a Ruby Gem lockfile (which doesn't have checksums), but you need go.sum as well to be equivalent to npm's (which does come with checksums)
Author just compared it to languages where lockfile means just version lock
Thanks for that link.
Based on reading through that whole discussion there just now and my understanding of the different ecosystems, my conclusion is that certainly people there are telling Filippo Valsorda that he is misunderstanding how things work in other languages, but then AFAICT Filippo or others chime in to explain how he is in fact not misunderstanding.
This subthread to me was a seemingly prototypical exchange there:
https://lobste.rs/s/exv2eq/go_sum_is_not_lockfile#c_d26oq4
Someone in that subthread tells Filippo (FiloSottile) that he is misunderstanding cargo behavior, but Filippo then reiterates which behavior he is talking about (add vs. install), Filippo does a simple test to illustrate his point, and some others seem to agree that he is correct in what he originally said.
That said, YMMV, and that overall discussion does certainly seem to have some confusion and people seemingly talking past each other (e.g., some people mixing up "dependents" vs. "dependencies", etc.).
I don't get this impression. Rather, as you say, I get the impression that people are talking past each other, a property which also extends to the author, and the overall failure to reach a mutual understanding of terms only contributes to muddying the waters all around. Here's a direct example that's still in the OP:
"The lockfile (e.g. uv.lock, package-lock.json, Cargo.lock) is a relatively recent innovation in some ecosystems, and it lists the actual versions used in the most recent build. It is not really human-readable, and is ignored by dependents, allowing the rapid spread of supply-chain attacks."
At the end there, what the author is talking about has nothing to do with lockfiles specifically, let alone when they are applied or ignored, but rather to do with the difference between minimum-version selection (which Go uses) and max-compatible-version selection.
Here's another one:
"In other ecosystems, package resolution time going down below 1s is celebrated"
This is repeating the mistaken claims that Russ Cox made years ago when he designed Go's current packaging system. Package resolution in e.g. Cargo is almost too fast to measure, even on large dependency trees.
From my understanding
it's stored forever in the proxy cache and your new tag will never be fetched by users who go through the language's centralized infrastructure (i.e. proxy).
go can also validate the checksums (go.sum) against the language's central infrastructure that associates version->checksums.
i.e. if you cut a release, realize you made a mistake and try to fix it quietly, no user will ever see it if even one user saw the previous version (and that one user is probably you, as you probably fetched it through the proxy to see the mistake)
This is mistaken. The Go module proxy doesn't make any guarantee that it will permanently store the checksum for any given module. From the outside, we would expect that their policy is to only ever delete checksums for modules that haven't been fetched in a long time. But in general, you should not base your security model on the notion that these checksums are stored permanently.
Incorrect. Checksums are stored forever, in a Merkle Tree, meaning if the proxy were to ever delete a checksum, it would be detected (and yes, people like me are checking - https://sourcespotter.com/sumdb).
Like any code host, the proxy does not guarantee that the code for a module will be available forever, since code may have to be removed for legal reasons.
But you absolutely can rely on the checksum being preserved and thus you can be sure you'll never be given different code for a particular version.
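As a rough illustration of why a deleted or altered checksum would be detectable, here is a minimal Merkle tree root computation in the style of RFC 6962 (leaf and interior hashes with distinct prefixes, split at the largest power of two below the record count). This is a sketch of the hashing idea only, under the assumption that the records are simple go.sum-like lines. The real checksum database also involves signed tree heads, tiles, and inclusion/consistency proofs, none of which is shown, and the record contents below are placeholders.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// leafHash hashes one record with the 0x00 leaf prefix.
func leafHash(data []byte) [32]byte {
	return sha256.Sum256(append([]byte{0x00}, data...))
}

// nodeHash combines two subtree hashes with the 0x01 interior-node prefix.
func nodeHash(l, r [32]byte) [32]byte {
	b := append([]byte{0x01}, l[:]...)
	b = append(b, r[:]...)
	return sha256.Sum256(b)
}

// root computes the tree hash over the records in order.
func root(records [][]byte) [32]byte {
	switch len(records) {
	case 0:
		return sha256.Sum256(nil) // hash of the empty tree
	case 1:
		return leafHash(records[0])
	}
	// k is the largest power of two strictly smaller than len(records).
	k := 1
	for k*2 < len(records) {
		k *= 2
	}
	return nodeHash(root(records[:k]), root(records[k:]))
}

// Placeholder records; real log entries carry actual module hashes.
var records = [][]byte{
	[]byte("example.com/x v1.0.0 h1:AAAA="),
	[]byte("example.com/x v1.0.1 h1:CCCC="),
	[]byte("example.com/y v1.0.0 h1:EEEE="),
}

func main() {
	honest := root(records)

	// Swap the hash recorded for one version and recompute.
	tampered := [][]byte{
		records[0],
		[]byte("example.com/x v1.0.1 h1:ZZZZ="),
		records[2],
	}
	fmt.Printf("honest root:   %x\n", honest)
	fmt.Printf("tampered root: %x\n", root(tampered))
	fmt.Println("roots differ:", honest != root(tampered))
}
```

Because every record feeds into the root, anyone who has remembered an earlier root can notice when a record is changed or dropped, which is the property the monitoring described above relies on.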
For the question “is the data in the checksum database immutable” you can trust people like the parent, who double checks what Google is doing.
For the question “is it the same data that can be downloaded directly from the repos” you can skip the proxy to download dependencies, then do it again with the proxy, and compare.
So I'd say you don't need to trust Google at all in this case.
If Google were to present you with a different view of the Merkle Tree with different checksums in it, they'd have to forever show you, and only you, that view. If they accidentally show someone else that view, or show you the real view, the go command would detect it. This will eventually be strengthened further with witnessing[2], which will ensure that everyone's view of the log is the same. In the meantime, you / your coworker can upload your view of the log (in $GOPATH/pkg/sumdb/sum.golang.org/latest) to Source Spotter and it will tell you if it's consistent with its view:
$ curl --data-binary "@$(go env GOPATH)/pkg/sumdb/sum.golang.org/latest" https://gossip.api.sourcespotter.com/sum.golang.org
consistent: this STH is consistent with other STHs that we've seen from sum.golang.org
[1] https://research.swtch.com/tlog
How long the Google-run module cache (aka module proxy or module mirror) at https://proxy.golang.org caches the contents of modules is, I think, slightly nuanced.
That page includes:
> Whenever possible, the mirror aims to cache content in order to avoid breaking builds for people that depend on your package
But that page also discusses how modules might need to be removed for legal reasons or if a module does not have a known Open Source license:
> proxy.golang.org does not save all modules forever. There are a number of reasons for this, but one reason is if proxy.golang.org is not able to detect a suitable license. In this case, only a temporarily cached copy of the module will be made available, and may become unavailable if it is removed from the original source and becomes outdated.
If interested, there's a good overview of how it all works in one of the older official announcement blog posts (in particular, the "Module Index", "Module Authentication", "Module Mirrors" sections there):
While go.mod does not allow for explicit version ranges, the versions given are the minimum versions. In other words, the versions given are the lower bound of the compatibility range.
Go also strictly follows semantic versioning. Thus the implicit exclusive upper bound is the next major version. This assumes that all minor and patch releases are backwards compatible and not breaking.
Dependency resolution in Go uses minimum version selection. That means the minimum requirements of all dependencies are evaluated and the highest minimums are selected. In principle, this minimum version selection should be time invariant since the oldest versions of the compatible dependencies are used.
While the minimum versions specified in go.mod are not necessarily the version of the dependencies used, they can be resolved to the versions used irrespective of time or later versions of dependencies being released.
Other languages do not use minimum version selection. Their package resolution often tries to retrieve the latest compatible dependency. Thus a lock file is needed.
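The time-invariance argument can be illustrated with a small comparison. The sketch below resolves a single module two ways: `highestMinimum` takes the highest of the stated minimums (the rule described above), while `latestCompatible` is a deliberately simplified stand-in for "pick the newest published release within the same major version". Both function names and all version data are invented for the example; real resolvers in other ecosystems are considerably more involved.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parse turns "v1.2.3" into numeric parts. Toy parser: no pre-release handling.
func parse(v string) [3]int {
	var out [3]int
	for i, p := range strings.SplitN(strings.TrimPrefix(v, "v"), ".", 3) {
		out[i], _ = strconv.Atoi(p)
	}
	return out
}

// compare returns -1, 0, or 1 when a is older than, equal to, or newer than b.
func compare(a, b string) int {
	pa, pb := parse(a), parse(b)
	for i := range pa {
		if pa[i] < pb[i] {
			return -1
		}
		if pa[i] > pb[i] {
			return 1
		}
	}
	return 0
}

// highestMinimum picks the highest version among the stated minimums.
// The result depends only on what the go.mod files say.
func highestMinimum(minimums []string) string {
	best := ""
	for _, v := range minimums {
		if best == "" || compare(v, best) > 0 {
			best = v
		}
	}
	return best
}

// latestCompatible picks the newest published version that is at least
// floor and shares its major version. The result depends on what has
// been published so far.
func latestCompatible(published []string, floor string) string {
	best := ""
	for _, v := range published {
		if parse(v)[0] != parse(floor)[0] || compare(v, floor) < 0 {
			continue
		}
		if best == "" || compare(v, best) > 0 {
			best = v
		}
	}
	return best
}

func main() {
	minimums := []string{"v1.0.0", "v1.0.1"} // what the dependents ask for
	published := []string{"v1.0.0", "v1.0.1"}

	fmt.Println("before new release:",
		highestMinimum(minimums), latestCompatible(published, highestMinimum(minimums)))

	// Upstream publishes a new minor release; no manifest changes.
	published = append(published, "v1.2.0")

	fmt.Println("after new release: ",
		highestMinimum(minimums), latestCompatible(published, highestMinimum(minimums)))
}
```

Before the new release both strategies agree on v1.0.1. Afterwards the highest-minimum result stays at v1.0.1 while the latest-compatible result moves to v1.2.0, which is the drift a lockfile exists to pin down.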
Python packages in particular do not follow semantic versioning. Thus ranges are critical in a pyproject.toml.
In summary, the "manifests" files that the original author describes are configuration files. In some languages, or more accurately their package management schemes, they can also be lock files, true manifests, due to version semantics. If those semantics are absent, then lock files are necessary for compatibility.
This has not been true since Go 1.17 with the default -mod=readonly, which is why go.mod is a reliable lockfile.
and checking in lockfile changes