The basic issue with this criticism of Cargo is that the fact that updating dependencies can cause breakage never actually goes away. The package management tool either makes this problem easy, or it makes it hard. The approach Cargo chooses is to make "cargo update" more likely to work by allowing library authors to specify incompatibilities. This can go wrong in edge cases. But that's inherent to the problem itself. Minimal version selection only seems to work because it doesn't actually solve the problem of keeping dependencies up to date. The moment you want to do that, you're stuck with "go get -u", which is just a worse version of "SAT solving" because it doesn't know how to detect and avoid incompatibilities.
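To make that concrete, this is roughly what specifying an incompatibility looks like on the Cargo side (crate names and version numbers here are made up for illustration; the requirement syntax is Cargo's):

```toml
[dependencies]
# Default "caret" requirement: any semver-compatible 1.x release >= 1.2.0.
serde = "1.2"

# A known-broken release can be excluded with a comparison range,
# steering the resolver around it without waiting for a fix upstream.
http = ">=0.2.0, <0.2.5"
```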
(As an aside, I don't like the term "SAT solving" for the problem of package version selection. By focusing on an implementation detail, it makes the problem seem scarier than it actually is. Register allocation is NP hard too, but nobody calls the register allocator "the SAT solver".)
If I understand right, Go modules do support multiple incompatible versions being used at once, via major-version upgrades.
What they aren’t supporting is the case where a dependency incorrectly advertises an incompatible release as compatible, and other dependencies work around it by testing against those versions and explicitly marking them as incompatible; that matters when dependencies share a common dependency with different requirements.
Instead it looks like they are trying to solve this problem outside of the dependency manager, by putting a lot of emphasis on preserving backwards compatibility within major versions as the correct Go practice (and probably also hoping that edge cases get fixed quickly in a collaborative way, since everyone is on the same system).
How well this approach will work in practice seems to depend more on the behavior of library authors than on any technical factor; without knowing that, it seems hard to say that either is obviously superior.
I'd describe it instead as: Cargo gives you a way out if your dependencies accidentally break semver, while Go's package manager doesn't. That may be a sensible decision on Go's part, but it doesn't solve the problem of dependency incompatibility. In fact, Go explicitly doesn't solve it.
That’s fair. Of course there’s a literal way out in that the root project can override whatever it wants, but there’s no way for library authors to automatically propagate that sort of fix.
So Cargo allows independent dependencies to solve the problem of broken semver in another dependency (as long as a viable version exists), whereas in Go it would need to be fixed in the original dependency or the root module.
I still think which approach “solves” the problem will depend on user behavior - it’s easy to imagine forcing broken semver to be fixed at the source working better in some cases than relying on incompatible version lists being updated correctly across many libraries. But Cargo certainly provides more tools to solve the problem.
Go solves it. The way out if things break is to use `go get` instead of `go get -u`, and then upgrade selectively.
Think of `go get` as stable and `go get -u` as unstable. If you're working on unstable, then things may break and you should report bugs and/or contribute patches.
Upgrading a library dependency in your module file and committing the change is how you change stable for your library. Downstream users have their own ideas of stable that will be behind yours, until they upgrade.
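For example, the go.mod file is that record of "stable" (module paths here are hypothetical):

```
module example.com/myapp

require (
    example.com/somelib v1.3.0
    example.com/otherlib/v2 v2.1.0
)
```

Bumping v1.3.0 to v1.4.0 and committing is the upgrade; downstream users keep building against the versions recorded in their own module files until they upgrade.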
By "solves it" I mean "the system provides a tool that automatically upgrades all your packages to (1) their most recent versions without (2) breaking anything". Go get, without the -u, doesn't provide (1), since it only updates to minimum versions. Go get -u doesn't reliably provide (2), since there is no way to specify incompatibilities.
> (As an aside, I don't like the term "SAT solving" for the problem of package version selection. By focusing on an implementation detail, it makes the problem seem scarier than it actually is. Register allocation is NP hard too, but nobody calls the register allocator "the SAT solver".)
I don't quite get your criticism here. Russ's Minimal Version Selection algorithm is a package version selection algorithm that does not require SAT solving. Your wording sounds like Russ wrote something like “Package version selection is inherently SAT, so we won't do that ever!”, but he did exactly the opposite by introducing MVS.

Maybe you meant that you don't like the term for “Cargo-like package version selection”, but that's a completely different story.
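For the record, the core of MVS fits in a toy sketch. In this Go version the requirement graph is made up and plain string comparison stands in for real semver ordering; the point is that it's a single deterministic graph walk, not a search:

```go
package main

import "fmt"

type Module struct{ Path, Version string }

// reqs maps each module version to the minimum versions it requires.
// The graph is invented for illustration.
var reqs = map[Module][]Module{
	{"app", "v1.0.0"}: {{"a", "v1.1.0"}, {"b", "v1.0.0"}},
	{"a", "v1.1.0"}:   {{"c", "v1.2.0"}},
	{"b", "v1.0.0"}:   {{"c", "v1.4.0"}},
}

// buildList walks the requirement graph once and keeps, for each module
// path, the highest minimum version anyone asked for. No backtracking:
// the answer is unique and the walk always terminates.
func buildList(root Module) map[string]string {
	selected := map[string]string{root.Path: root.Version}
	visited := map[Module]bool{}
	var visit func(Module)
	visit = func(m Module) {
		if visited[m] {
			return
		}
		visited[m] = true
		for _, dep := range reqs[m] {
			// Lexical comparison stands in for semver comparison here.
			if dep.Version > selected[dep.Path] {
				selected[dep.Path] = dep.Version
			}
			visit(dep)
		}
	}
	visit(root)
	return selected
}

func main() {
	fmt.Println(buildList(Module{"app", "v1.0.0"}))
	// map[a:v1.1.0 app:v1.0.0 b:v1.0.0 c:v1.4.0]
	// c resolves to v1.4.0: the maximum of the minimums v1.2.0 and v1.4.0.
}
```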
> I don't quite get your criticism here. Russ's Minimal Version Selection algorithm is a package version selection algorithm that does not require SAT solving.
That algorithm also doesn't handle updates though, which is quite a big limitation.
I think calling it a SAT solver is appropriate here. A lot of package managers were "organically grown", i.e. created in response to a problem as it arose. Not much thought was given to the underlying problem, and in the long term users suffer, with e.g. dependencies taking forever to install and strange deadlocks. Recognizing it as a SAT problem from the start means we can easily throw Z3 or similar at it instead of manually adding heuristics every time a user files a GitHub issue. (See Bundler and other pre-2017/2018 package management tools, for example.) Also, it's not always a SAT problem: for npm and Java classloaders it's just tree traversal, because conflicting sub-dependencies of a dependency can co-exist.
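(Sketch of the npm case, with made-up package names; modern npm hoists where it can, but nests on conflict, so c@1 and c@2 never need to be reconciled:)

```
node_modules/
  a/              # depends on c@^1
    node_modules/
      c/          # c 1.x
  b/              # depends on c@^2
    node_modules/
      c/          # c 2.x
```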
That said, the Go team (and to a lesser extent Python's too) dropped the ball here. None of the challenges of dependency management were unknown or unsolved. Anyone who has used e.g. Maven should be more than familiar with them. (An 11-section blog post is indeed nice though, good for onboarding new programmers I suppose.) If developer experience had been a first-class priority, this could have been solved from day 0.
(My personal favorite is NPM since conflicting dependencies can co-exist at the expense of greater disk usage, but memory is cheap these days. No one with an electron app on their desktop has any right to complain about the memory usage of plain text source code.)
See, that's the reason why calling it a SAT solver makes the problem sound scarier than it is. Using Z3--or even MiniSAT--is massive overkill. The naive exponential algorithm is fine, and it leads to more maintainable and understandable code. Rust's build performance has nothing to do with dependency solving. I've never heard of a package manager in which core dependency solving was a speed problem; dep was slow because of I/O, not SAT solving.
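To illustrate, the "naive exponential algorithm" really is about this much code. In this Go sketch the package names, version lists, and conflict table are all invented; a real resolver works over semver ranges, but the search skeleton is the same:

```go
package main

import "fmt"

// available lists the published versions of each hypothetical package,
// newest first, so the solver prefers newer versions.
var available = map[string][]string{
	"log":  {"2.0.0", "1.1.0"},
	"http": {"1.2.0", "1.0.0"},
}

// conflicts marks pairs of package versions known not to work together.
var conflicts = map[[2]string]bool{
	{"log@2.0.0", "http@1.0.0"}: true,
}

func compatible(choice map[string]string, pkg, ver string) bool {
	for p, v := range choice {
		a, b := pkg+"@"+ver, p+"@"+v
		if conflicts[[2]string{a, b}] || conflicts[[2]string{b, a}] {
			return false
		}
	}
	return true
}

// solve tries versions in order and backtracks on conflict. Worst case
// exponential in the number of packages, but real dependency graphs
// rarely get anywhere near that.
func solve(pkgs []string, choice map[string]string) bool {
	if len(pkgs) == 0 {
		return true
	}
	pkg := pkgs[0]
	for _, ver := range available[pkg] {
		if compatible(choice, pkg, ver) {
			choice[pkg] = ver
			if solve(pkgs[1:], choice) {
				return true
			}
			delete(choice, pkg) // backtrack
		}
	}
	return false
}

func main() {
	choice := map[string]string{}
	if solve([]string{"log", "http"}, choice) {
		fmt.Println(choice) // map[http:1.2.0 log:2.0.0]
	}
}
```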
> I've never heard of a package manager in which core dependency solving was a speed problem
Natalie Weizenbaum and I wrote the package manager for Dart. My naïve backtracking version solver was fine ~98% of the time. 2% of the time it was catastrophically, exponentially slow. Natalie wrote a much better one to solve exactly that:
If I correctly understood the post you linked, it looks like there is a big difference between Rust and Dart when it comes to dependency management: Rust allows different major versions of a crate to appear in the dependency tree, while it seems (from the first example given by the post) that Dart doesn't. That difference would dramatically change the difficulty of dependency resolution between the two languages, because there will be many more unsatisfiable cases in Dart than in Rust, leading to much more work to find a working set. (In the given example, Cargo would just pick icon:1.0.0 and menu:1.5.0 -> dropdown:2.3.0 -> icon:2.0.0, and call it a day.)
The exponential search issue could arise if people used explicit upper bounds on crate minor or patch versions, but since crates doing that are few and far between, there's little chance of encountering such a degenerate exponential case in practice.
I/O, plus the absurd lengths you need to go to in order to (mostly) safely use Git as an external binary, plus no package repository to globally cache things like package names and versions. Yeah. "SAT solving" was an incredibly tiny amount of the runtime, like you'd expect for a CPU-bound algorithm.
It's a straw man that keeps getting dragged out, but anyone who has used and understood this kind of dependency manager knows it's meaningless. You can definitely argue about its impact on community behavior, but it functions just fine, handles a ton of real-world issues reasonably, and it's nowhere near slow enough to care about.
Define “fine”. In my experience, Go with modules[1] solves a project's dependencies exponentially more quickly than dep. And that is especially noticeable on big projects with three or more dozens of dependencies. I don't mean any disrespect to the people behind dep, and the whole debacle was a massive miscommunication disaster, but go mod is just quicker.
[1] While we're at it, I'm still low-key mad that the Go authors called modules packages and packages modules. Gah!
I complained about the term “modules” for the bigger code unit on Reddit, but RSC said they were both vague so it didn’t matter. I do still think it should have been “bundles” or something implying “bigger than a package” instead of “module”.
I don't get it. The number of dependencies is constant, which means that both of them need to make approximately the same number of network requests. Where does the I/O difference come from then?

Also, since you've mentioned Cargo: its slowness actually used to be one of the main pain points that the Rust people at one of the companies I've worked for in the past complained about a lot during water-cooler discussions. Granted, that was a couple of years ago, and Cargo has probably made progress since then.
As I recall, dep had to parse Go code over and over during the dependency resolution process instead of caching it.
And any cargo slowness is likely the fault of rustc or I/O. It's received a lot of profiling work and the core dependency solver has never been a performance issue to my knowledge.
There are two things that really slow dep down. One is unavoidable; for the other, we have a plan.
The unavoidable part is the initial clone. ... Fortunately, this is just an initial clone - pay it once, and you're done.
The other part is the work of retrieving information about dependencies. There are three parts to this:
* Getting an up-to-date list of versions from the upstream source
* Reading the Gopkg.toml for a particular version out of the local cache
* Parsing the tree of packages for import statements at a particular version
The first requires one or more network calls; the second two usually mean something like a git checkout, and the third is a filesystem walk, plus loading and parsing .go files. All of these are expensive operations.
----------------
For context, in comparison, Cargo has the same #1 problem. But for #2, the information on all dependencies is stored in the index itself; this means that Cargo can figure out what dependencies you need without any network calls at all, let alone downloading, git checkout, and filesystem walk.
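(For reference, the crates.io index is a git repository with one JSON line per published version, so the resolver can read every crate's requirements without fetching any code. A trimmed, approximate entry, for a hypothetical crate, looks like:

```json
{"name":"mycrate","vers":"1.2.0","deps":[{"name":"serde","req":"^1.0","kind":"normal"}],"cksum":"<sha256 of the .crate file>","yanked":false}
```

Real entries carry a few more fields, but that's the shape of it.)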
I do not know how go mod compares, off the top of my head.
The trend these days is tiny packages (similar to PaaS turning into Functions-as-a-Service): very fine-grained libraries like left-pad. The ultimate realisation of HN's favorite mantra, "worship the Unix philosophy". Expect the number of dependencies to only increase.
Except you have things like `sed` and `cut` and `cat`, not an `lpd` that solely exists to add leftwise filler characters. Ergonomics of those tools notwithstanding, they are simple but they aren't TOO simple.
Left-pad is IMHO "too simple". The median package should be a 1-2-pizza-team effort with some ongoing maintenance, even if only a few pizzas a year. Left-pad is, like, 2 pizzas and a weekend.
(Were it not for string.prototype methods) a viable package would be something on the order of Python's str.format(), which can left-pad and right-pad!
Unix notoriously has the command `yes`, which provides an infinite loop of `y\n`.
Don't have a strong opinion on whether left-pad should exist or not (or indeed, whether server-side JavaScript should exist or not), but is anything simpler than left-pad? `yes` is. Yes.
Having a basic building block that enables a lot is different from making every tiny problem its own package. `yes` is an exception, certainly, but it doesn't disprove the rule.
Actually, dep tended to produce suboptimal solutions, meaning in some cases it downloaded more packages than needed to solve dependencies. And that meant more I/O operations than necessary. Plugging in a full-blown SAT solver would have solved this issue.
This is due to the fact that not all libraries on PyPI have properly declared their metadata and, as such, they are not available via the PyPI JSON API. At this point, Poetry has no choice but downloading the packages and inspect them to get the necessary information. This is an expensive operation, both in bandwidth and time, which is why it seems this is a long process.
we compute a resolution graph that attempts to satisfy all of the dependencies you specify, before we build your environment. That means we have to start downloading, comparing, and often times even building packages to determine what your environment should ultimately look like, all before we've even begun the actual process of installing it (there are a lot of blog posts on why this is the case in python so I won't go into it more here).
------------------
As I stated in another comment on this thread, Cargo has to do none of this. This stuff is the source of things being slow, not the actual "Okay I have all the versions and constraints, what's the answer" bit.
I think the appropriate way to view this is not just that this domain needs "SAT" (which enforces standardized algorithms as "the solution") so much as that it shows an urgent need to popularize constraint satisfaction and optimization problems as a domain of coding that programmers actually deal with regularly and informally. Being able to sort, search, and rank is the first step to having a backtracking solver, a technique that is very powerful and generalizable (albeit slow). But how many CS programs teach you to write solvers?
Our education methods could do a lot more to draw this technique into its applications. For example, in game programming, satisfaction tends to come up rather abruptly when talking about physics solvers; not because it's only needed then, but because the jargon was appropriated from academic simulation papers, and it comes as part of a sudden bomb of math concepts, making it look more intimidating than it is. Non-realistic physics are just called "collision code", handwaved away and not given a great deal of formal attention, even though they are performing the work of solvers too, just usually with a mix of linear programming and local search. As such, a lot of new game programmers get stuck on collision problems and concurrency bugs that result from poorly constrained solutions, because they don't have the conceptual tools surfaced to reason about it with confidence.
Likewise there are plenty of CRUD-style apps where a solver is a helpful building block for creating views and filters, but it's not approached as one, so you just get a buggy feature.
> A lot of package managers were "organically grown"
Many of the package managers we have now are thought out. Some replaced existing ones while others were crafted by people who had worked on dependency management in the past. Cargo and Composer are two examples of this.
I think you're right that `go get -u` could be improved on by having some system where the community shares known incompatibilities. It seems like this early-warning system could be built on top of Go modules, though? It might end up being more elegant than what we have, where anyone can report an incompatibility without either module author being involved at first.
Another improvement might be having a way to say "give me just the security fixes".
At first this could be a new tool that you run instead of `go get -u`, so the Go team doesn't have to be the ones to start this project.
People who just want their code to work don't need to care about this. They can run `go get`.
There are always going to be people who just want stable code and repeatable builds, and others who are working on migrating the community to the latest code. (And they might be the same person at different times.) It seems like Go's module system might work pretty well at helping them collaborate while staying out of each others' way?
The Linux world solved this by providing package repositories that maintain stable versions of software. This requires a lot of time and effort.
Many software projects simply don't have the labor available to support multiple releases of software. They run tip of master and that's pretty much it.
Uh, I'm not sure why you're linking to that? Maybe I missed something. It doesn't seem to explain anything about library authors marking new versions as security fixes and querying for them, which is what I had in mind.
Why should I care about upgrading versions, though? From my perspective making some program P where I've decided to pull in a dependency on library L, I'm only interested in getting a new version of L if (1) it has some new feature I want or (2) there is a relevant security vulnerability. I don't see what is gained by trying to ask for newer versions of libraries than those I've specifically requested.
Or (3) it has fixed bugs (including ones we haven't hit yet) or (4) it has better compatibility with the surrounding ecosystem (which is not static).
More than once, the fix for a mysterious bug we encountered in production was just "upgrade a particular dependency to a newer minor release" (changing nothing on our code), since another user of that dependency had already encountered that same obscure bug, and together with the developers had already investigated it and found a fix. Had we been more diligent in keeping our dependencies up to date, even when they have no new features we want or security fixes, we would not have encountered that bug.
That’s very optimistic! In my experience, software doesn’t actually get less buggy with time, just differently-buggy, and sometimes even more buggy (which makes sense, if bugs go like complexity, and software gains complexity with time).
But in this case, don't you already have an idea, through profiling or debugging, of which package causes bugs or is slow? And if you do, figuring out whether there's a newer version that improves your situation should be rather simple.
I've stopped blindly updating my dependencies, as people seem to have very different ideas of what semver means. Minor and patch upgrades in Node and Ruby programs have broken my applications several times.
Even when semver is done correctly, a breaking change in a package usually also means that you have to upgrade past the breaking change to get other goodies like improved performance and bug fixes. This is something that takes a lot of time. At work we've basically stopped making breaking changes in our private packages, opting instead for a scheme similar to what you have to do in Go for everything to work.
Versioning is, at the end of the day, a cultural problem. And not having the tools to do breaking changes well can actually turn out well.
At least, that's my current thinking. I may very well be, and often am, wrong.
How do you know about security fixes (or even relevant bug fixes) in dependencies? How do you know about them in transitive rather than direct dependencies?
If you don't know to update how will you get the fixes?
By periodically looking at release notes for my dependencies and, if I'm being diligent or don't trust my dependencies to audit their own dependencies, for my transitive dependencies too.
In general I never want to change any dependencies without explicitly thinking about the change and setting aside time for testing and breakage.

And at the time I'm adding some new direct dependency L, I will pick the version I want (presumably the latest). As for its dependencies, why should I want them to be the latest? I just want the versions that are most likely to work correctly with the version of L I have selected. Right?
I totally respect that. I also know that many (most) aren’t like that, for a variety of reasons, like not having the time. What can they do? The experience for those people is often more time-consuming than in the past. This decreases their productivity and happiness with the tooling.
> because it doesn't know how to detect and avoid incompatibilities
They reshaped the problem: all future revisions of a single entity must be backwards compatible because it's the "right thing to do", and thus, under this assumption, a much simpler solver will do the job.
Yes, this moves the problem slightly, since now there are other places that encode versioning information (the import paths themselves), and thus some of the incompatibility-detection checks performed by package managers are now delegated to the compiler (which will e.g. not find a given import, or complain that foo.Bar is not assignable to foo.Bar because the full import path has a different version tag).
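Concretely, with a hypothetical module example.com/foo that published a breaking v2, the major version becomes part of the import path, so both majors can coexist in one build and the compiler enforces the split:

```go
package main

import (
	foo "example.com/foo"      // v1.x: module path has no version suffix
	foov2 "example.com/foo/v2" // v2.x: breaking release, new import path
)

func main() {
	a := foo.Bar{}
	b := foov2.Bar{}
	// a = b would not compile: foo.Bar and foov2.Bar are distinct types,
	// which is exactly the "foo.Bar is not assignable to foo.Bar" error
	// described above.
	_, _ = a, b
}
```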
Yes, many people do not currently care about the problem of backward compatibility and it will be hard in practice to make everybody raise their standards.
But, careful backward compatibility is one of the main ingredients that made Go successful in the first place.
A healthy ecosystem should follow the same rules.
Incidentally, this whole vgo kerfuffle can help highlight which sections of the ecosystem are written without this rule in mind, offering a choice to both the authors of those components (improve their practices) and their consumers (e.g. switch to another library).