Unnecesary(?) double-canonicalisePathMetaData in LocalDerivationGoal::registerOutputs #786
Labels
No labels
Affects/CppNix
Affects/Nightly
Affects/Only nightly
Affects/Stable
Area/build-packaging
Area/cli
Area/evaluator
Area/fetching
Area/flakes
Area/language
Area/lix ci
Area/nix-eval-jobs
Area/profiles
Area/protocol
Area/releng
Area/remote-builds
Area/repl
Area/repl/debugger
Area/store
bug
Context
contributors
Context
drive-by
Context
maintainers
Context
RFD
crash 💥
Cross Compilation
devx
docs
Downstream Dependents
E/easy
E/hard
E/help wanted
E/reproducible
E/requires rearchitecture
imported
Language/Bash
Language/C++
Language/NixLang
Language/Python
Language/Rust
Needs Langver
OS/Linux
OS/macOS
performance
regression
release-blocker
stability
Status
blocked
Status
invalid
Status
postponed
Status
wontfix
testing
testing/flakey
Topic/Large Scale Installations
ux
No milestone
No project
No assignees
4 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: lix-project/lix#786
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Describe the bug
LocalDerivationGoal::registerOutputs
callscanonicalisePathMetaData
twice after building normal(tm) derivations, wasting time on directory traversals/metadata gathering.Steps To Reproduce
script.bt
/nix/store/xxx-lix2.93-xxx
bpftrace script.bt
Expected behavior
Nix performs the minimum necessary directory traversals (and especially listxattr operations, which I believe tend to take out an FS-wide lock in the kernel, so could not be parallelised, even if other traversal tasks were).
nix --version
outputAdditional context
There are three call sites to
canonicalisePathMetaData
inregisterOutputs
:As a certified rust girlie I am somewhat overwhelmed by C++ control flow inside registerOutputs (and don't fully understand which type of CA derivation the comments are discussing at a given moment), but my guess is that the second or third call site is redundant, and that the third could be made conditional on whether any rewriting/other activities(?) have occurred beforehand.
Motivation
I am interested in the overall performance of adding a new package to my
environment.systemPackages
(currently it is slow enough that I instead reach fornix-shell -p
, and then don't have the package next time I want it). This involves the following builds:The breakdown of this process is, approximately:
nix-instantiate
nix-build
execve
toexit
pkgs/build-support/buildenv/builder.pl
...::buildDone
and::done
, which fixing this issue would shave ~180ms off::inputsRealised
and::runChild
::runChild
andexecve
(sandbox setup)I'm giving this issue an attempt. I'll hopefully post a Gerrit CL once I have something that passes tests.
This issue was mentioned on Gerrit on the following CLs:
I made an attempt, under the approach of canonicalize file metadata once and then make all operations afterwards maintain that state by setting the metadata on the files they touch, at
cl/3045
. It's basic approach was rejected as a bad idea because the CA derivation system is far too fragile, so the final canonicalize call should stay unless CA derivations get unfragiled or ripped out entirely. I agree with the comment before the first callthat it really is necessary. If the first call is necessary and the last is a bad idea to remove, and this issue is about it being called twice, that doesn't really seem to leave other options. I wrote another far more limited change that tries to only skip the final canonicalization call for input addressed derivations where no rewriting needed to occur, so it doesn't need to touch anything else, but it's still fundamentally the same approach that got rejected. I've given up on this and removed myself from this issue.
that one is more plausibly sound, but we'd want more assurance that no rewrites had happened. the only reasonable way to do this is to produce a fresh, rewritten path when rewriting is done rather than rewriting in place, and canonicalizing the new path before deleting the rewrite input and moving the canonicalize path into place. doing this also somewhat requires refactoring
registerOutputs
to not contain most of its code in capturing lambdas because whoever wrote it couldn't be bothered to make it the least bit maintainable :(