Enter automatically upon failure into the builder's namespace #828

Open
opened 2025-05-11 22:35:31 +00:00 by raito · 1 comment
Owner

Is your feature request related to a problem? Please describe.

When debugging failed Nix builds, it’s often frustrating to have to run nix-shell or nix develop by hand to get into the build environment just to reproduce or inspect the failure. This is especially annoying with large or complex derivations, where recreating the same environment manually is error-prone. The current workaround is to use tools like nix-build --keep-failed and then inspect the kept environment manually, which does not capture the full namespace isolation that was in effect during the build.

Describe the solution you'd like

I’d like a feature where, upon a failed build, the Nix CLI can automatically enter the failed derivation’s namespace (e.g., via nsenter, cntr or similar) and drop into an interactive shell. This would allow developers to instantly inspect the exact environment and filesystem at the moment of failure, improving debugging speed and fidelity. The feature could be enabled with a flag like --enter-on-failure.
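
To make the nsenter route concrete, here is a minimal sketch of the command such a feature might end up running. It assumes the builder process is still alive, that its PID is known to the caller, and that /bin/sh exists inside the sandbox. The function name, the choice of namespaces, and the placeholder PID are all hypothetical. The flags themselves are ordinary util-linux nsenter options.

```python
import shlex

# Real util-linux nsenter flags. Which namespaces a given sandboxed build
# actually uses is an assumption of this sketch.
NAMESPACE_FLAGS = ("--mount", "--uts", "--ipc", "--net", "--pid")


def nsenter_command(builder_pid, shell="/bin/sh"):
    """Return the argv that would join the namespaces of `builder_pid`."""
    if builder_pid <= 0:
        raise ValueError("builder_pid must be a positive process ID")
    # "--" ends nsenter's own options; everything after it is the program to run.
    return ["nsenter", "--target", str(builder_pid), *NAMESPACE_FLAGS, "--", shell]


if __name__ == "__main__":
    # 4242 is a placeholder PID, not taken from a real build.
    print(shlex.join(nsenter_command(4242)))
```

Running the resulting command generally needs enough privileges to join another process's namespaces, and it only works while the target process exists, which is why the builder has to stay alive after the failure.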

Describe alternatives you've considered

  * Manually using nix-shell with similar inputs, which is slow and often incomplete because it requires replaying the stdenv dance documented in https://jade.fyi/blog/building-nix-derivations-manually/.
  * Using nix develop, which doesn’t reproduce the full isolation context of the build.
  * Creating a custom builder that invokes an interactive shell on failure, which is non-trivial and inconsistent across systems, although nixpkgs ships one as breakpointHook: https://nixos.org/manual/nixpkgs/stable/#breakpointhook.

Additional context

There are a few known blockers to implementing this feature (please reach out in this issue if you are interested in helping to tackle some of them):

  1. Builder cooperation: In order to preserve non-filesystem state like environment variables, the builder must not exit immediately upon failure. It needs to pause and offer an opportunity to attach a shell, which requires modifying existing build hooks (read: nixpkgs stdenv support scripts) to support cooperative failure behavior.

  2. Nix daemon limitations: If the feature is to be triggered automatically from the Lix CLI, especially when the build occurs on a remote builder or is managed locally by the Nix daemon, it would require modifying the Nix daemon protocol to allow shell attachment into a build's namespace. However, this protocol is tightly constrained and difficult to extend, which is one of the motivations behind the ongoing work to develop a better IPC protocol with Cap'n Proto.

To support this feature, we would need new IPC wires that allow the Lix CLI (or another client) to attach to any build environment, local or remote, after failure.
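
The builder cooperation point can be illustrated with a small sketch. This is not how the nixpkgs stdenv scripts are written (those are shell). It only models the idea that a failing phase hands control to a hook while the builder process, and therefore its environment, still exists. The names run_phases and on_failure are hypothetical.

```python
import os


def run_phases(phases, on_failure):
    """Run (name, callable) build phases in order.

    On the first failure, call `on_failure` before propagating the error,
    so the hook runs while the process and its state are still alive.
    """
    for name, phase in phases:
        try:
            phase()
        except Exception as error:
            # The environment has not been torn down yet, so a hook that
            # blocks here leaves time to attach a shell and inspect it.
            on_failure(name, error, dict(os.environ))
            raise
```

A cooperative hook would block inside on_failure (for example, until a client signals that it has finished inspecting) instead of returning immediately. How that signal is delivered is exactly what the daemon protocol item leaves open.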
Owner

I suspect this requires a channel between the main daemon and any child daemons since currently there's not really any way to contact a particular client process's daemon. I think that might be where we'd have to start on this, just bringing that channel up.

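
As a rough illustration of the channel described in the comment above, the sketch below has a main daemon keep one socket per child daemon so that an attach request can be routed to the child that owns a given build. Everything here is hypothetical: the class, the build IDs, and the one-line text messages are stand-ins, and a real design would fork child processes and use the project's own protocol.

```python
import socket


class MainDaemon:
    """Keeps one channel per child daemon, keyed by a hypothetical build ID."""

    def __init__(self):
        self._channels = {}

    def spawn_child(self, build_id):
        # A real daemon would fork here. The sketch only creates the channel
        # and returns the end that the child would hold.
        parent_end, child_end = socket.socketpair()
        self._channels[build_id] = parent_end
        return child_end

    def send_attach(self, build_id):
        # Route the request to the child daemon that owns this build.
        self._channels[build_id].sendall(b"attach\n")

    def read_reply(self, build_id):
        return self._channels[build_id].recv(64)
```

The point of the sketch is only the bookkeeping: once the main daemon holds a handle per child, a client request naming a build can be forwarded to the right process.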
Reference: lix-project/lix#828