Binary cache closure querying #605
This is an RFC/invitation for discussion; I don't have a strongly preferred solution at this point.
Is your feature request related to a problem? Please describe.
Querying a large closure on a binary cache can take a while, since one request is performed for each path that's not already present locally. The lower limit on the time taken for this with the current binary cache protocol and querying behaviour is `request latency * graph depth`.

Describe the solution you'd like
We can extend the binary cache protocol with an optional way to get the narinfos for a path's whole closure in one go.
Describe alternatives you've considered
Since we're aware of all of a path's potential dependencies in advance, we could query the narinfos for all the output paths in its closure before we know if the output we want actually depends on them. This results in far more requests than my proposed approach, but could provide similar performance improvements.
A new, incompatible binary cache protocol that's generally smarter (e.g. allowing querying multiple paths in a single request, or even providing paths that are already available locally so we don't get any redundant info). This might be worth doing at some point anyway, but for now I think a "quick-fix" to what we already have can provide great value.
Additional context
I've implemented a hacky prototype version of what I suggest. This extends binary caches with optional `<hash>.closure.narinfo` files, which contain arbitrary narinfos separated by blank lines. Lix will fetch these and store all the obtained narinfos in the narinfo cache. This allows short-circuiting requests for any dependencies that were provided eagerly by the server.

https://gerrit.lix.systems/c/lix/+/2324 -- the client side
https://git.lix.systems/lheckemann/closure-cache/src/branch/main/closure-cache.ts -- the server side (everything hardcoded to work for my own deployment). When `<hash>.closure.narinfo` is requested, it will read the narinfos for the whole dependency closure and serve them. While I've implemented it as a dynamic server, these closure files could just as well be precomputed and served as static files.

This improves querying performance on a chunky closure significantly:

It also preserves both backwards and forwards compatibility: the client can still talk to binary caches that don't support this feature, and binary caches that do support it can still be used by clients that don't.
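For concreteness, here is a minimal sketch of what a client could do with such a `<hash>.closure.narinfo` file. This is an illustrative TypeScript version with hypothetical names, not the actual C++ change linked above, and it skips signature verification and error handling:

```typescript
// Illustrative sketch only; the real client-side change is in Lix (C++),
// see https://gerrit.lix.systems/c/lix/+/2324. All names here are hypothetical.

type NarInfo = Record<string, string>;

// Parse a single narinfo in its usual "Key: value" line format.
function parseNarInfo(text: string): NarInfo {
  const info: NarInfo = {};
  for (const line of text.split("\n")) {
    const idx = line.indexOf(": ");
    if (idx > 0) info[line.slice(0, idx)] = line.slice(idx + 2);
  }
  return info;
}

// Fetch <hash>.closure.narinfo and split it into individual narinfos, which
// are separated by blank lines. Returns [] on a miss so callers can fall back
// to the regular one-narinfo-per-request behaviour.
async function fetchClosureNarInfos(cacheUrl: string, hash: string): Promise<NarInfo[]> {
  const res = await fetch(`${cacheUrl}/${hash}.closure.narinfo`);
  if (!res.ok) return [];
  return (await res.text())
    .split(/\n\s*\n/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0)
    .map(parseNarInfo);
}

// Warm a local narinfo cache (a plain Map here) in a single request, so later
// per-path lookups can be short-circuited. Signatures would still be verified
// as usual before any of these paths are actually used.
async function warmNarInfoCache(
  cacheUrl: string,
  hash: string,
  narInfoCache: Map<string, NarInfo>,
): Promise<void> {
  for (const info of await fetchClosureNarInfos(cacheUrl, hash)) {
    const storePath = info["StorePath"];
    if (storePath && !narInfoCache.has(storePath)) narInfoCache.set(storePath, info);
  }
}
```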
Thoughts:
- Should support for this protocol be indicated as part of `nix-cache-info`? We can halve the number of requests made to caches that don't support it if we do so. (A hypothetical example of what this could look like follows after this list.)
- Does accepting narinfos for any requested path have security implications I haven't thought of? All I can think of is that this makes it easier for a malicious cache to serve a different version of a dependency depending on which dependent a client asked for, but signatures are still checked as usual.
- Does it make sense not to specify which paths should actually be included? I called it "closure", but the server could really serve whatever narinfos it wants, e.g. putting a limit on the number of narinfos served at once or doing fancy heuristics based on access patterns to "guess" which paths clients are likely to already have.
- Does this introduce significant new consistency concerns? I could definitely see things getting weird if paths within a fully static-file-based binary cache are ever modified, but I'm not sure this is much worse than the status quo.
- Would we want to take this opportunity to replace the custom narinfo format (for this use case) with a better-specified one where broad agreement on how to parse it already exists, like JSON or one of the more efficient binary encodings of JSON's shape?
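For illustration only: if support were advertised in `nix-cache-info`, it might look something like the following. The `StoreDir`, `WantMassQuery` and `Priority` fields already exist; the `ClosureNarInfo` field name is purely hypothetical and not something the prototype defines.

```
StoreDir: /nix/store
WantMassQuery: 1
Priority: 40
ClosureNarInfo: 1
```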
that's what we're currently working towards. we already have http2 support with all the pipelining goodness it supports, the limiting factor for using that properly is only that the legacy code we're working with is absolute spaghetti. once we've disentangled this we can make much better use of the WantMassQuery bit to blast stores with queries for all path infos in a closure at once, without introducing more hacks that merely work around fundamental design deficiencies instead of actually fixing them.
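To illustrate the direction described here (this is not the Lix implementation, which is C++; the names and the concurrency limit are made up): with `WantMassQuery` set, a client can issue the narinfo requests for every path in a closure at once, bounded only by how many requests it allows in flight.

```typescript
// Rough sketch of mass-querying narinfos with a cap on in-flight requests.
// A miss is recorded as null; a hit stores the raw narinfo text.
async function massQueryNarInfos(
  cacheUrl: string,
  hashes: string[],
  maxInFlight = 128,
): Promise<Map<string, string | null>> {
  const results = new Map<string, string | null>();
  let next = 0;

  // Each worker repeatedly pulls the next hash off the shared queue.
  async function worker(): Promise<void> {
    while (next < hashes.length) {
      const hash = hashes[next++];
      const res = await fetch(`${cacheUrl}/${hash}.narinfo`);
      results.set(hash, res.ok ? await res.text() : null);
    }
  }

  // Start up to maxInFlight workers; all requests for the closure are in
  // flight together instead of being serialised one path at a time.
  await Promise.all(
    Array.from({ length: Math.min(maxInFlight, hashes.length) }, () => worker()),
  );
  return results;
}
```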
That sounds good!
Does that completely obsolete the idea of having the server provide data more eagerly though? Making fewer requests seems desirable even if we're rid of the graph-depth-roundtrips problem (though that's purely vibes-based and I might be completely off there), and the server is in a better position to know what's necessary (because it has the dependency metadata at hand) than the client.
it does not, but the current api (as you've noticed) does not give us any good methods to provide more data. we may want to do this at some point, but it needs a more thorough redesign of the binary cache protocols which right now are an unholy mix of half store, half http wrapper. providing extra data is only really useful in the non-http case, and we can improve those so much more with a proper extensible rpc protocol that doesn't have to pretend it's http under the hood somewhere
How so?
I find the fact that the current binary cache protocol can be implemented by serving static files pretty valuable. Is that something we don't care about preserving / improving on?
extra data like closure info is potentially very expensive to save as static files in the http case, or impossible to compute statically at all. dynamically generated data on http stores is mostly not realistic since our primary http store remotes are s3 buckets.
we absolutely do! but it's hard to add extensions to only one set of binary caches without making things much harder for ourselves in the future. the cache protocol/api is already very strange, attempting to mimic being a store (to the extent where it's possible to build directly into an http cache!). we'd be much better off if we dropped this masquerade and started treating caches as distinct entities, with distinct capabilities. once we have that it'd be much easier to add additional things on top without complicating maintenance of the existing infrastructure.
I'll be playing around with this a bit to get a proper feel for the costs, but I think the closure path being optional leaves a lot of space for optimising around expected access patterns (a simple approach being to provide them for top-level outputs of CI builds, but not for anything deeper in the graph) that wouldn't bring the explosion of space usage that having them for everything would.
How?
That seemed fairly sensible to me...
...wait wtf
Sounds reasonable. Any idea when that might happen? Because I'd kind of like to be able to make use of this "soon" and am wondering if it would be reasonable to get this (but with reasonable parsing code, not the abomination I linked) into Lix with the knowledge that it will be thrown out again later. I don't want to make any demands here, though; I'm fine with looking into other approaches like implementing it as a plugin or as a proxy that exposes the "one-narinfo-at-a-time" API based on the closure-serving one. :)
Another thing that is possibly of interest here is that currently the majority of traffic to binary caches is misses. So some ideas like racing caches against each other rather than querying sequentially would help a lot but they might also bankrupt cachix and cache.nixos.org in s3 api fees. So we should probably be careful about more aggressively querying, especially if CppNix is to take our improvements wholesale as they have a habit of doing.
The cache backend for these huge caches really needs to be something better than just s3. edef has ideas and maybe even drafts but is the burned out creature she is.
Also re your strategy: have you considered http2 push? This actually sounds like it could possibly be http2 push, to be standards compliant.
Edit: okay so to elaborate: we have a narinfo cache. we could accept http2 server pushes of paths into our narinfo response cache for the current host if we haven't seen the narinfo yet. in effect this would allow the server to drive the entire closure of a path into lix all at once, with zero protocol or semantic changes, just a smarter http server.
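To make that concrete: the following is purely an illustrative Node.js/TypeScript sketch of a push-capable cache front-end, not anything Lix or an existing cache implements, and `closureNarInfos` is a hypothetical stub.

```typescript
import http2 from "node:http2";

// Hypothetical: a real cache would look up the requested path's closure in
// its own metadata (or a database). The stub returns nothing.
function closureNarInfos(hash: string): { path: string; body: string }[] {
  return [];
}

const server = http2.createServer();

server.on("stream", (stream, headers) => {
  // Simplified match for <hash>.narinfo requests.
  const match = /^\/([0-9a-z]{32})\.narinfo$/.exec(headers[":path"] ?? "");
  const infos = match ? closureNarInfos(match[1]) : [];
  if (infos.length === 0) {
    stream.respond({ ":status": 404 });
    stream.end();
    return;
  }

  const [requested, ...rest] = infos;

  // Push the narinfos of the rest of the closure before answering, so a
  // push-aware client can fill its narinfo cache without asking for them.
  for (const info of rest) {
    if (!stream.pushAllowed) break;
    stream.pushStream({ ":path": info.path }, (err, push) => {
      if (err) return;
      push.respond({ ":status": 200, "content-type": "text/x-nix-narinfo" });
      push.end(info.body);
    });
  }

  stream.respond({ ":status": 200, "content-type": "text/x-nix-narinfo" });
  stream.end(requested.body);
});

server.listen(8080);
```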
Would it really result in more queries for c.n.o? I thought it would be the same number of queries (edit: since c.n.o is the highest-priority cache for most users), just over a shorter timespan, and I don't think that's relevant for costs? As for cachix, is this something you've heard from Domen or speculation? Because if it's the latter I think it would make a lot of sense to ask him, especially because he would probably also be interested in making caches noticeably faster and (potentially) needing fewer requests.
That does sound elegant, but as you say does require a smarter server, unlike the static closure narinfo approach, making it much more complex to deploy.
I am 100% certain that adding racing of binary caches would cause twice as many requests to be sent in instances where you have both caches configured and where, today, c.n.o. (e.g.) returns a positive answer before the second cache is queried at all. More aggressive optimistic querying of dependencies of derivations before said derivations themselves would definitely lead to a larger number of requests in many instances as well, since there are requests in flight which might get their results discarded after other requests in flight complete.
As for whether sending more requests is a problem, that is a question to be asked of people's architectures and AWS bills. But this is the type of thing for which we do need to ask the people running the caches.
Sending requests faster is a different question and imo is okay especially if there's want-mass-query set.
Of course everyone has a vested interest in making this stuff faster, it sucks. I am relatively of the opinion that any binary cache operating at scale really should be smarter than an s3 bucket. Also, #578 is something that makes me reticent to do more complexity in the binary cache creation code of Lix: it's already full of things that are definitely not our responsibility yet doesn't do enough things as it is. This isn't to say we shouldn't just ship experiments, but s3 binary caches really suck for numerous reasons like the extreme impracticality of gc without a database and imo we shouldn't consider these the preferred type of cache anymore; serious implementations of caches should have some server side intelligence and we could consider fully-static caches to be a less preferred option.
can we consider this done now that closure queries are not rtt-limited, only by how many concurrent requests the cache will accept?
It doesn't look like that's the case to me. Full details follow...
I'm running `nix path-info --store https://cache.sphalerite.tech -r $systemClosure` after removing `~/.cache/nix/binary-cache-v6.sqlite*`, and it's not any quicker than when I wrote this issue.

Version info:

I also tried adding `-vv --log-format internal-json`, and if I'm interpreting the log correctly, it's never even fetching more than one narinfo concurrently, even though the cache has `WantMassQuery: 1`. Maybe I have the wrong expectations regarding how it's logged, or I'm analysing the output incorrectly, but here's what I'm doing.

`activities.jq`:

Invoked with:
The idea of this script is to produce a list of lists of concurrently running downloads; each of the lists only contains a single entry though. A few random manual samples of the log also suggest the same thing: each narinfo download doesn't start until the previous one has finished.
hmm, then more work still needs to be done. `queryMissing` as done during build setup should be much better now, but of course there would be other things that are busted 🫠
Can confirm that `nix build --dry-run` instead of `nix path-info` queries with much more concurrency, consistently fetching up to ~500 narinfos at once in a few tests and taking 1.5s rather than 2min. Great work, and I imagine this will affect real-world use much more than the performance of `nix path-info` :)

If I'm reading the current `queryMissing` and related code correctly, it doesn't yet query build dependencies before knowing whether they're needed, so there's still rtt limitation there, right? (e.g. query `hello`, and only once we have a negative result there will we query for `stdenv` and the other paths needed to build `hello`). I don't think that's a big issue since the round trips are on the depth of the graph rather than the total number of paths, but it is something we discussed earlier on in this issue :)

yeah, there are still some bottlenecks in the process, but they're mostly related to search depth instead of being applied indiscriminately across the entire tree. we probably got most of the query performance out of it that we can without querying things we may not need to query (which comes with its own set of local overhead and cost to the cache being queried)
I also did a more apples-to-apples comparison (`nix build --dry-run` before and after 877b0d7121) and got less spectacular but still very satisfying results: