For a typical desktop system (~2K packages) we can easily get 100K
entries in RealisationsRefs. Without indices query for RealisationsRefs
requires linear scan.
RealisationsRefs(referrer)
--------------------------
Inefficiency is seen as a 100% CPU load of nix-daemon for the following
scenario:
$ nix edit -f . bash # add unused environment variable, like FOO="1"
# populate RealisationsRefs, build fresh system
$ nix build -f nixos system --arg config '{ contentAddressedByDefault = true; }'
$ nix edit -f . bash # add unused environment variable, like FOO="2"
$ time nix build -f nixos system --arg config '{ contentAddressedByDefault = true; }'
In this case `bash `will be rebuilt a few times and then rest of CPU
time is spent on scanning RealisationsRefs table (about 5 CPU-minutes
on my machine).
Before the change:
$ time nix build -f nixos system ... # step 4 above
real 34m3,613s
user 0m5,232s
sys 0m0,758s
Of all this time about 29.5 minutes are taken by nix-daemon's CPU time.
After the change:
$ time nix build -f nixos system ... # step 4 above
real 4m50,061s
user 0m5,038s
sys 0m0,677s
Of all this time about 1 minute is taken by nix-daemon's CPU time.
Most of the time is spent polling for non-existent realisations on
cache-nixos.org.
Realisations(outputPath)
------------------------
After running CA system for two weeks I got ~1M entries in Realisations
table. `nix-collect-garbage` became very slow (seemingly 100 path deletions
per second). It happens due to a slow cascading delete from Realisations
triggered by deletion from ValidPaths.
The fix is to add an index on primary key from ValidPaths(id) that
triggers cascading deletions.
Before the change:
$ time nix-collect-garbage -d --max-freed 100G
<interrupted before finish, took too long>
real 23m32.411s
user 17m49.679s
sys 4m50.609s
Most of time was spent in re-scanning Realisations table on each path deletion.
After the change:
$ time nix-collect-garbage -d --max-freed 100G
real 8m43.226s
user 6m16.317s
sys 1m40.188s
Time is spent scanning sqlite indices and in kernel when unlinking directories.
Doing it as a side-effect of calling LocalStore::makeStoreWritable()
is very ugly.
Also, make sure that stopping the progress bar joins the update
thread, otherwise that thread should be unshared as well.
Since 4806f2f6b0, we can't have paths with
references passed to builtins.{path,filterSource}. This prevents many cases
of those functions called on IFD outputs from working. Resolve this by
passing the references found in the original path to the added path.
Rather than having them plain strings scattered through the whole
codebase, create an enum containing all the known experimental features.
This means that
- Nix can now `warn` when an unkwown experimental feature is passed
(making it much nicer to spot typos and spot deprecated features)
- It’s now easy to remove a feature altogether (once the feature isn’t
experimental anymore or is dropped) by just removing the field for the
enum and letting the compiler point us to all the now invalid usages
of it.
This ensures any started processes can't write to /nix/store (except
during builds). This partially reverts 01d07b1e, which happened because
of #2646.
The problem was only happening after nix downloads anything, causing
me to suspect the download thread. The problem turns out to be:
"A process can't join a new mount namespace if it is sharing
filesystem-related attributes with another process", in this case this
process is the curl thread.
Ideally, we might kill it before spawning the shell process, but it's
inside a static variable in the getFileTransfer() function. So
instead, stop it from sharing FS state using unshare(). A strategy
such as the one from #5057 (single-threaded chroot helper binary) is
also very much on the table.
Fixes#4337.
This fixes a bug in the garbage collector where if a path
/nix/store/abcd-foo is valid, but we do a
isValidPath("/nix/store/abcd-foo.lock") first, then a negative entry
for /nix/store/abcd is added to pathInfoCache, so /nix/store/abcd-foo
is subsequently considered invalid and deleted.
The garbage collector no longer blocks other processes from
adding/building store paths or adding GC roots. To prevent the
collector from deleting store paths just added by another process,
processes need to connect to the garbage collector via a Unix domain
socket to register new temporary roots.
I had started the trend of doing `std::visit` by value (because a type
error once mislead me into thinking that was the only form that
existed). While the optomizer in principle should be able to deal with
extra coppying or extra indirection once the lambdas inlined, sticking
with by reference is the conventional default. I hope this might even
improve performance.
- This can legitimately happen (for example because of a non-determinism
causing a build-time dependency to be kept or not as a runtime
reference)
- Because of older Nix versions, it can happen that we encounter a
realisation with an (erroneously) empty set of dependencies, in which
case we don’t want to fail, but just warn the user and try to fix it.
Useful when we're using a daemon with a chroot store, e.g.
$ NIX_DAEMON_SOCKET_PATH=/tmp/chroot/nix/var/nix/daemon-socket/socket nix-daemon --store /tmp/chroot
Then the client can now connect with
$ nix build --store unix:///tmp/chroot/nix/var/nix/daemon-socket/socket?root=/tmp/chroot nixpkgs#hello
When adding a path to the local store (via `LocalStore::addToStore`),
ensure that the `ca` field of the provided `ValidPathInfo` does indeed
correspond to the content of the path.
Otherwise any untrusted user (or any binary cache) can add arbitrary
content-addressed paths to the store (as content-addressed paths don’t
need a signature).
I guess the rationale behind the old name wath that
`pathInfoIsTrusted(info)` returns `true` iff we would need to `blindly`
trust the path (because it has no valid signature and `requireSigs` is
set), but I find it to be a really confusing footgun because it's quite
natural to give it the opposite meaning.
Once a build is done, get back to the original derivation, and register
all the newly built outputs for this derivation.
This allows Nix to work properly with derivations that don't have all
their build inputs available − thus allowing garbage collection and
(once it's implemented) binary substitution
Changes:
* The divider lines are gone. These were in practice a bit confusing,
in particular with --show-trace or --keep-going, since then there
were multiple lines, suggesting a start/end which wasn't the case.
* Instead, multi-line error messages are now indented to align with
the prefix (e.g. "error: ").
* The 'description' field is gone since we weren't really using it.
* 'hint' is renamed to 'msg' since it really wasn't a hint.
* The error is now printed *before* the location info.
* The 'name' field is no longer printed since most of the time it
wasn't very useful since it was just the name of the exception (like
EvalError). Ideally in the future this would be a unique, easily
googleable error ID (like rustc).
* "trace:" is now just "…". This assumes error contexts start with
something like "while doing X".
Example before:
error: --- AssertionError ---------------------------------------------------------------------------------------- nix
at: (7:7) in file: /home/eelco/Dev/nixpkgs/pkgs/applications/misc/hello/default.nix
6|
7| x = assert false; 1;
| ^
8|
assertion 'false' failed
----------------------------------------------------- show-trace -----------------------------------------------------
trace: while evaluating the attribute 'x' of the derivation 'hello-2.10'
at: (192:11) in file: /home/eelco/Dev/nixpkgs/pkgs/stdenv/generic/make-derivation.nix
191| // (lib.optionalAttrs (!(attrs ? name) && attrs ? pname && attrs ? version)) {
192| name = "${attrs.pname}-${attrs.version}";
| ^
193| } // (lib.optionalAttrs (stdenv.hostPlatform != stdenv.buildPlatform && !dontAddHostSuffix && (attrs ? name || (attrs ? pname && attrs ? version)))) {
Example after:
error: assertion 'false' failed
at: (7:7) in file: /home/eelco/Dev/nixpkgs/pkgs/applications/misc/hello/default.nix
6|
7| x = assert false; 1;
| ^
8|
… while evaluating the attribute 'x' of the derivation 'hello-2.10'
at: (192:11) in file: /home/eelco/Dev/nixpkgs/pkgs/stdenv/generic/make-derivation.nix
191| // (lib.optionalAttrs (!(attrs ? name) && attrs ? pname && attrs ? version)) {
192| name = "${attrs.pname}-${attrs.version}";
| ^
193| } // (lib.optionalAttrs (stdenv.hostPlatform != stdenv.buildPlatform && !dontAddHostSuffix && (attrs ? name || (attrs ? pname && attrs ? version)))) {
With the `ca-derivation` experimental features, non-ca derivations used
to have their output paths returned as unknown as long as they weren't
built (because of a mistake in the code that systematically erased the
previous value)
Thanks @regnat and @edolstra for catching this and comming up with the
solution.
They way I had generalized those is wrong, because local settings for
non-local stores is confusing default. And due to the nature of C++
inheritance, fixing the defaults is more annoying than it should be.
Additionally, I thought we might just drop the check in the substitution
logic since `Store::addToStore` is now streaming, but @regnat rightfully
pointed out that as it downloads dependencies first, that would still be
too late, and also waste effort on possibly unneeded/unwanted
dependencies.
The simple and correct thing to do is just make a store method for the
boolean logic, keeping all the setting and key stuff the way it was
before. That new method is both used by `LocalStore::addToStore` and the
substitution goal check. Perhaps we might eventually make it fancier,
e.g. sending the ValidPathInfo to remote stores for them to validate,
but this is good enough for now.
We embrace virtual the rest of the way, and get rid of the
`assert(false)` 0-param constructors.
We also list config base classes first, so the constructor order is
always:
1. all the configs
2. all the stores
Each in the same order
PRs #4370 and #4348 had a bad interaction in that the second broke the fist
one in a not trivial way.
The issue was that since #4348 the logic for detecting whether a
derivation output is already built requires some logic that was specific
to the `LocalStore`.
It happens though that most of this logic could be upstreamed to any `Store`,
which is what this commit does.