Unhelpful error message when a substituter is missing a dependency #716

Open
opened 2025-03-05 17:32:44 +00:00 by lheckemann · 2 comments
Member

Describe the bug

When trying to fetch the narinfo for a dependency from a substituter fails (i.e. the substituter isn't upholding the store invariants), this isn't reported clearly.

This was first observed with cachix (and doesn't occur reliably there -- maybe its sporadic 502s affect cache service as well?), but can also be reproduced with a synthetic test case.

Steps To Reproduce

rm -rf /tmp/binarycache /tmp/store
drv=$(nix-instantiate '<nixpkgs>' -A hello)
out=$(nix-store -q --outputs $drv)
nix build $drv^out
nix copy $out --to file:///tmp/binarycache
# remove a dependency of hello from the binary cache
rm -v $(grep -lE '^StorePath: .*(glibc|iconv)' /tmp/binarycache/*.narinfo)
nix-store --store /tmp/store --substituters file:///tmp/binarycache?trusted=1 -r $out --narinfo-cache-positive-ttl 1 --narinfo-cache-negative-ttl 1

The final command will fail something like this:

this path will be fetched (0.05 MiB download, 0.22 MiB unpacked):       
  /nix/store/crrf4bysvarqp0g9pyhcrj47689g4i5l-hello-2.12.1              
don't know how to build these paths:                                           
  /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36             
error: build of '/nix/store/crrf4bysvarqp0g9pyhcrj47689g4i5l-hello-2.12.1' failed

If the drv is copied to the chroot store and realised instead of its output path, the behaviour combines somewhat oddly, claiming that it doesn't know how to build the dependency and yet starting to do so anyway:

$ nix-store --store /tmp --substituters file:///tmp/binarycache?trusted=1 -r /nix/store/7s3p1q56abzik34w93kw7jx37l37ycg8-hello-2.12.1.drv --narinfo-cache-positive-ttl 1 --narinfo-cache-negative-ttl 1 --no-fallback                                                                                                    
this path will be fetched (0.05 MiB download, 0.22 MiB unpacked):
  /nix/store/crrf4bysvarqp0g9pyhcrj47689g4i5l-hello-2.12.1
don't know how to build these paths:
  /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36
building '/nix/store/1crz16gbhsxjxkyp2zxjf5y7a224m4j7-bash52-033.drv'...
building '/nix/store/0m4y3j4pnivlhhpr5yqdvlly86p93fwc-busybox.drv'...
[...]

Expected behavior

A warning when a path that's expected in a binary cache (based on the cache invariant "all dependencies should be there"), ideally with details on how getting the narinfo failed.

nix --version output

nix (Lix, like Nix) 2.93.0-dev-pre20250224-945e674
System type: x86_64-linux
Additional system types: i686-linux, x86_64-v1-linux, x86_64-v2-linux, x86_64-v3-linux
Features: gc, signed-caches
System configuration file: /etc/nix/nix.conf
User configuration files: /home/linus/.config/nix/nix.conf:/etc/xdg/nix/nix.conf:/home/linus/.local/share/flatpak/exports/etc/xdg/nix/nix.conf:/var/lib/flatpak/exports/etc/xdg/nix/nix.conf:/home/linus/.nix-profile/etc/xdg/nix/nix.conf:/nix/profile/etc/xdg/nix/nix.conf:/home/linus/.local/state/nix/profile/etc/xdg/nix/nix.conf:/etc/profiles/per-user/linus/etc/xdg/nix/nix.conf:/nix/var/nix/profiles/default/etc/xdg/nix/nix.conf:/run/current-system/sw/etc/xdg/nix/nix.conf
Store directory: /nix/store
State directory: /nix/var/nix
Data directory: /nix/store/gnjcl840pc8wra0634mprj1zs176swqq-lix-2.93.0-dev-pre20250224-945e674/share

Additional context

It might be possible to trigger the same kind of failure without breaking the "all dependencies are present" invariant if through a somewhat convoluted combination of GC operations:

  • The narinfo for the dependent (hello) is present in the client's narinfo cache
  • The narinfo for the dependency (glibc or iconv) is not in the client's narinfo cache (glibc/iconv was already present on the client when the dependent was last queried)
  • The narinfo for the dependency is not present in the binary cache due to a GC.
## Describe the bug When trying to fetch the narinfo for a dependency from a substituter fails (i.e. the substituter isn't upholding the store invariants), this isn't reported clearly. This was first observed with cachix (and doesn't occur reliably there -- maybe its sporadic 502s affect cache service as well?), but can also be reproduced with a synthetic test case. ## Steps To Reproduce ``` rm -rf /tmp/binarycache /tmp/store drv=$(nix-instantiate '<nixpkgs>' -A hello) out=$(nix-store -q --outputs $drv) nix build $drv^out nix copy $out --to file:///tmp/binarycache # remove a dependency of hello from the binary cache rm -v $(grep -lE '^StorePath: .*(glibc|iconv)' /tmp/binarycache/*.narinfo) nix-store --store /tmp/store --substituters file:///tmp/binarycache?trusted=1 -r $out --narinfo-cache-positive-ttl 1 --narinfo-cache-negative-ttl 1 ``` The final command will fail something like this: ``` this path will be fetched (0.05 MiB download, 0.22 MiB unpacked): /nix/store/crrf4bysvarqp0g9pyhcrj47689g4i5l-hello-2.12.1 don't know how to build these paths: /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36 error: build of '/nix/store/crrf4bysvarqp0g9pyhcrj47689g4i5l-hello-2.12.1' failed ``` If the drv is copied to the chroot store and realised instead of its output path, the behaviour combines somewhat oddly, claiming that it doesn't know how to build the dependency and yet starting to do so anyway: ``` $ nix-store --store /tmp --substituters file:///tmp/binarycache?trusted=1 -r /nix/store/7s3p1q56abzik34w93kw7jx37l37ycg8-hello-2.12.1.drv --narinfo-cache-positive-ttl 1 --narinfo-cache-negative-ttl 1 --no-fallback this path will be fetched (0.05 MiB download, 0.22 MiB unpacked): /nix/store/crrf4bysvarqp0g9pyhcrj47689g4i5l-hello-2.12.1 don't know how to build these paths: /nix/store/nqb2ns2d1lahnd5ncwmn6k84qfd7vx2k-glibc-2.40-36 building '/nix/store/1crz16gbhsxjxkyp2zxjf5y7a224m4j7-bash52-033.drv'... building '/nix/store/0m4y3j4pnivlhhpr5yqdvlly86p93fwc-busybox.drv'... [...] ``` ## Expected behavior A warning when a path that's expected in a binary cache (based on the cache invariant "all dependencies should be there"), ideally with details on how getting the narinfo failed. ## `nix --version` output ``` nix (Lix, like Nix) 2.93.0-dev-pre20250224-945e674 System type: x86_64-linux Additional system types: i686-linux, x86_64-v1-linux, x86_64-v2-linux, x86_64-v3-linux Features: gc, signed-caches System configuration file: /etc/nix/nix.conf User configuration files: /home/linus/.config/nix/nix.conf:/etc/xdg/nix/nix.conf:/home/linus/.local/share/flatpak/exports/etc/xdg/nix/nix.conf:/var/lib/flatpak/exports/etc/xdg/nix/nix.conf:/home/linus/.nix-profile/etc/xdg/nix/nix.conf:/nix/profile/etc/xdg/nix/nix.conf:/home/linus/.local/state/nix/profile/etc/xdg/nix/nix.conf:/etc/profiles/per-user/linus/etc/xdg/nix/nix.conf:/nix/var/nix/profiles/default/etc/xdg/nix/nix.conf:/run/current-system/sw/etc/xdg/nix/nix.conf Store directory: /nix/store State directory: /nix/var/nix Data directory: /nix/store/gnjcl840pc8wra0634mprj1zs176swqq-lix-2.93.0-dev-pre20250224-945e674/share ``` ## Additional context It might be possible to trigger the same kind of failure without breaking the "all dependencies are present" invariant if through a somewhat convoluted combination of GC operations: - The narinfo for the dependent (hello) is present in the client's narinfo cache - The narinfo for the dependency (glibc or iconv) is _not_ in the client's narinfo cache (glibc/iconv was already present on the client when the dependent was last queried) - The narinfo for the dependency is _not_ present in the binary cache due to a GC.
jade added this to the Error reporting project 2025-03-05 19:11:35 +00:00
Member

I think this hits a case where a decently useful message is generated, but for some reason not surfaced as a warning? It's printed with -vvv due to this debug call, but that doesn't really count. I tried to figure out if the error case of the Goal::WorkResult is handled, but I'm a bit too unfamiliar with the codebase that I couldn't find where that would be..

I think this hits [a case][1] where a decently useful message is generated, but for some reason not surfaced as a warning? It's printed with `-vvv` due to [this `debug` call][2], but that doesn't really count. I tried to figure out if the error case of the `Goal::WorkResult` is handled, but I'm a bit too unfamiliar with the codebase that I couldn't find where that would be.. [1]: https://git.lix.systems/lix-project/lix/src/branch/main/lix/libstore/build/substitution-goal.cc#L91-L94 [2]: https://git.lix.systems/lix-project/lix/src/branch/main/lix/libstore/build/substitution-goal.cc#L44
Author
Member
See also https://github.com/NixOS/nix/issues/12610
Sign in to join this conversation.
No milestone
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: lix-project/lix#716
No description provided.