Enter automatically upon failure into the builder's namespace #828
Reference: lix-project/lix#828
Is your feature request related to a problem? Please describe.
When debugging failed Nix builds, it’s often frustrating to manually enter the build environment with `nix-shell` or `nix develop` just to reproduce or inspect the failure. This is especially annoying with large or complex derivations, where setting up the same environment manually is error-prone. Currently, the workaround is to use tools like `nix-build --keep-failed` and then manually inspect the environment, which doesn't capture the full namespace isolation that was in place during the build.
Describe the solution you'd like
I’d like a feature where, upon a failed build, the Nix CLI can automatically enter the failed derivation’s namespace (e.g., via `nsenter`, `cntr`, or similar) and drop into an interactive shell. This would allow developers to instantly inspect the exact environment and filesystem at the moment of failure, improving debugging speed and fidelity. The feature could be enabled with a flag like `--enter-on-failure`.
Describe alternatives you've considered
`nix-shell` with similar inputs, which is slow and often incomplete because it requires replaying the stdenv dance as documented in https://jade.fyi/blog/building-nix-derivations-manually/.
`nix develop`, but this doesn’t reproduce the full isolation context of the build.
Additional context
There are a few known blockers to implementing this feature (please reach out in this issue if you are interested in helping to tackle some of them):
Builder cooperation: In order to preserve non-filesystem state like environment variables, the builder must not exit immediately upon failure. It needs to pause and offer an opportunity to attach a shell, which requires modifying existing build hooks (read: nixpkgs stdenv support scripts) to support cooperative failure behavior.
Nix daemon limitations: If the feature is to be triggered automatically from the Lix CLI, especially when the build occurs on a remote builder or is managed locally by the Nix daemon, it would require modifying the Nix daemon protocol to allow shell attachment into a build's namespace. However, this protocol is tightly constrained and difficult to extend, which is one of the motivations behind the ongoing work to develop a better IPC protocol with Cap'n Proto.
To support this feature, we would need new IPC wires that allow the Lix CLI (or another client) to attach to any build environment, local or remote, after failure.
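For illustration only, here is a rough sketch of what the attach step might boil down to on a local Linux builder, assuming the builder process has been kept alive after the failure and that something (presumably the daemon) can tell the client its PID. The function name `nsenter_argv`, the `builder_pid` parameter, and the choice of `/bin/sh` as the shell are all assumptions made for this sketch; none of them is an existing Lix interface. Only the `nsenter` options themselves come from util-linux.

```python
def nsenter_argv(builder_pid: int, shell: str = "/bin/sh") -> list[str]:
    """Build an nsenter(1) command line that joins a paused builder's namespaces.

    Hypothetical helper: `builder_pid` would have to be reported by the daemon,
    and `shell` must exist inside the build sandbox (assumed here, not checked).
    """
    return [
        "nsenter",
        "--target", str(builder_pid),        # process whose namespaces we join
        "--user", "--preserve-credentials",  # join its user namespace, keep uid/gid
        "--mount", "--uts", "--ipc", "--net", "--pid",
        "--root", "--wd",                    # adopt the target's root and working dir
        "--", shell,
    ]


if __name__ == "__main__":
    # Print the command instead of running it; 1234 is a placeholder PID.
    print(" ".join(nsenter_argv(1234)))
```

Actually executing this command needs enough privilege over the target's namespaces, which an unprivileged client talking to the daemon generally does not have. That is one reason the attach would likely have to be mediated by the daemon rather than run directly by the CLI.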
I suspect this requires a channel between the main daemon and any child daemons since currently there's not really any way to contact a particular client process's daemon. I think that might be where we'd have to start on this, just bringing that channel up.