nix-daemon breaks mysteriously when it runs out of disk space #429
Reference: lix-project/lix#429
Describe the bug
A rebuild was downloading a bunch of derivations, and filled my desktop's ZFS pool (0B free according to zfs list). The daemon's reaction to this was mysterious: the derivation downloads stalled down to 0B/s and stayed there, without reporting any issue or failing the build.
Meanwhile, in other terminals, all nix commands that require talking to nix-daemon failed with a variety of mysterious daemon connection failures:
I freed up some space so the pool had free bytes again, and commands started working normally.
My first attempt to repro was by setting a quota on /nix only, with zfs set quota=127G data/nix, and fetching a big derivation to push it across the finish line, but that failed with an obvious hint:
After that I repro'd more violently by just filling the zpool with /dev/urandom, and got the mysterious lockup again. During the two "really out of bytes" episodes, nix-daemon logs say:
once for each command I attempted to run. Similar error when I broke /nix via quotas:
Steps To Reproduce
nix build and nix shell commands in other terminals fail with a mysterious error
I'm able to trigger this on demand, so let me know if there's any verbose logspam I can enable or any evidence you'd like, given it's a mildly esoteric setup.
Expected behavior
It'd be nice to get a more explicit error that points me at the full disk, if possible.
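To illustrate the kind of message I mean, here is a minimal sketch in Python. It is not Lix code and says nothing about how the daemon is actually structured; the function name describe_write_failure is hypothetical. It only shows that an out-of-space or over-quota errno can be turned into an error that names the path and the free space left:

```python
import errno
import os
import shutil

# EDQUOT (quota exceeded) is not defined on every platform, so look it up defensively.
_OUT_OF_SPACE = {errno.ENOSPC, getattr(errno, "EDQUOT", errno.ENOSPC)}


def describe_write_failure(exc: OSError, path: str) -> str:
    """Turn an OSError from a failed write into a message that names the likely cause.

    Hypothetical helper for illustration only.
    """
    reason = os.strerror(exc.errno)
    if exc.errno in _OUT_OF_SPACE:
        # Report how much space is left on the filesystem holding `path`.
        free = shutil.disk_usage(path).free
        return f"cannot write to '{path}': {reason} ({free} bytes free on that filesystem)"
    return f"cannot write to '{path}': {reason}"
```

A caller would wrap its write in try/except OSError and print the returned string, so a full disk or an exceeded quota is reported explicitly instead of surfacing as a generic connection failure.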
nix --version output
nix (Lix, like Nix) 2.91.0-dev-pre20240702-45ac449
oh yes, that really should be much, much clearer. thanks very much for reporting it.