RFD: Remove recursive-nix #767

Closed
opened 2025-03-25 23:54:08 +00:00 by jade · 14 comments
Owner

We want to remove the recursive-nix experimental feature and all associated code. Recursive Nix is a feature where the derivation builder creates a Nix daemon socket that, through a complex series of operations, runs against the host Nix store. The purpose is that you can do nix evaluations and nix builds inside of a derivation and have the results appear in the outer nix store.

It's used for dynamic derivations, where a derivation produces a build plan which references other store paths, which currently have to use recursive-nix to create said store paths. This is not the final design and CppNix is driving the dev process of dyndrvs, not us; it would be replaced with varlink or something else before stabilization.

Here's why:

  • It makes the legacy nix daemon protocol part of the stable derivation ABI. This is unacceptable and means it will never be stabilized.
  • There is broad agreement, including with the CppNix team, that dynamic-derivations will NOT ship without a replacement for recursive-nix, so there is no compat hazard to removing it except for people who are trying to do dyndrvs on lix (haven't heard of anyone).
  • It's not "can you run nix inside a derivation builder". You can do that still. You can even do that with nested user namespaces and a fake nix store if you really need to.
  • It's a structural problem when we want to refactor the sandbox setup code to not know what a nix is, which we want to do to be able to simplify it and maybe even RIIR.

Is anyone using recursive-nix on Lix? What are you using it for? Is there some way that we can accommodate your use case?

CL: https://gerrit.lix.systems/c/lix/+/2872

Member

This issue was mentioned on Gerrit on the following CLs:

  • comment in cl/2872 ("chore: drop experimental feature recursive-nix")
  • commit message in cl/2872 ("chore: drop experimental feature recursive-nix")
Owner

> except for people who are trying to do dyndrvs on lix (haven't heard of anyone).

considering that it's still disabled as it is on CppNix, and even if it wasn't it's still probably bugged, honestly we could remove both at this point. Bit of a hot take though, but I suspect we'll want to do something different for dyndrvs anyway.

Owner

recursive nix has been such a huge pain in our tail for anything async-related, and it isn't (can't be?) implemented on platforms that aren't linux without completely disabling the sandbox (and thus invalidating the entire concept). let's get rid of it.

Member

For me it has always been a case of "this (recursive-nix) feature looks interesting", and somehow teaching tools to create smaller derivations to cache through nix sounds interesting. But I never actually used it, as there is not really any tooling around it, so I never got around to it. So I'm sad to see it go, but I also will not actually miss anything.


Since I'm not very active in community spaces I only noticed this because my pipelines against Lix main started failing all of a sudden, so I don't really know where else to ask; with the removal of this feature is there still a way to run tools like nix-diff (generally everything that has to query the store to produce its output) inside a derivation?
I've been using this for a while now with recursive-nix to diff two .drvs, which I did struggle to emulate otherwise before opting for recursive-nix.

Author
Owner

yes, you can use --store local or NIX_REMOTE=local or similar, theoretically, which should convince the lix to not try the daemon. i haven't tested this though. it might blow up for bad reasons.
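A minimal sketch of what that suggestion could look like in a builder script. The wrapper name `with_local_store` is hypothetical; `NIX_REMOTE=local` is the setting mentioned above. As the comment says, this is untested, and it may still fail if the state directory under /nix/var is not writable inside the sandbox.

```shell
# Hypothetical helper: run a single command with the daemon bypassed.
# NIX_REMOTE=local asks the Nix client to open the store directly
# instead of connecting to the daemon socket.
with_local_store() {
  env NIX_REMOTE=local "$@"
}

# Example use inside a builder (assumes nix-store is on PATH and
# $some_store_path is a placeholder for a path you care about):
#   with_local_store nix-store --query --references "$some_store_path"
```

Scoping the variable to one command keeps the rest of the build script unaffected.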

Owner

nixos itself uses nix in the sandbox with a store to build its option docs: https://github.com/NixOS/nixpkgs/blob/3afc8f47128378578094bdd00e8d2e0c9eb0f307/nixos/modules/misc/documentation.nix#L131

Thanks everyone for the input.

After taking a stab at it again I remember what wall I ran into back then; while nix-diff seems to work just fine without recursive-nix (as long as you throw it against the .drv files directly), nvd calls the Nix CLI to [query about references](https://git.sr.ht/~khumba/nvd/tree/b1f5f07a3bf0c6212a03600b875ea512166e32d8/item/src/nvd#L298). Since /nix/var doesn't exist to begin with, there are parts of the tooling which fall flat on their face with EPERM trying to create it, and bending NIX_STATE_DIR yields a writable but empty store db, giving either "not in store" or "invalid" for the store paths and their .drv files respectively.
Moving the entire NIX_STORE_DIR as per [№ 3 of the OP](https://git.lix.systems/lix-project/lix/issues/767#issue-13149) would of course mean rebuilding, which is not feasible in that scenario.

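So the workaround for all of this is a mix of direct .drv files for nix-diff and, for nvd, loading the store paths from the closure info in the same way [NixOS ISO/squashfs builds rebuild the store](https://github.com/NixOS/nixpkgs/blob/494e8180e3ea322088dbfe5fee714888c902be27/nixos/modules/installer/cd-dvd/iso-image.nix#L1012) from the [closure info paths embedded in the squashfs](https://github.com/NixOS/nixpkgs/blob/2debd4ebfe3a0aa1b2a8cc6b400cbe06675b34b3/nixos/lib/make-btrfs-fs.nix#L25) (though I guess this would also work for nix-diff?).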

(just in case anyone opens this issue trying to achieve something similar; using NIX_STATE_DIR and nix-store --load-db should be able to get you most of the way for a lot of applications)
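A rough sketch of the NIX_STATE_DIR plus `nix-store --load-db` approach as a shell fragment for a builder script. The function name `load_closure_db` and its arguments are hypothetical, and it assumes a registration file such as the one nixpkgs' closureInfo produces is available among the build inputs. Setting `NIX_REMOTE=local` is borrowed from the earlier suggestion in this thread and may not be required.

```shell
# Hypothetical helper: give Nix a writable state dir and seed its database
# with the registration data of a closure.
#   $1: path to a "registration" file (assumed to come from closureInfo)
#   $2: directory to use as the private state dir (assumed writable)
load_closure_db() {
  registration=$1
  state_dir=$2

  mkdir -p "$state_dir"
  # Point Nix at the private state dir instead of the default /nix/var/nix,
  # which cannot be created inside the sandbox.
  export NIX_STATE_DIR="$state_dir"
  # Open the store directly rather than looking for a daemon socket.
  export NIX_REMOTE=local
  # Register the closure's paths as valid in the fresh, empty database.
  nix-store --load-db < "$registration"
}

# Example use inside a builder (all variable names are placeholders):
#   load_closure_db "$closure/registration" "$TMPDIR/nix-state"
#   nvd diff "$old_system" "$new_system"
```

After the database is loaded, store queries for paths in that closure should no longer report them as invalid, which is what tools like nvd rely on.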

With [this change](https://git.shell.bsocat.net/infra/commit/?h=e431269f72be653b4ae5632818e63885e4fd4e1c) I can actually rip recursive-nix out of my cursed setup, making it ever so slightly less cursed.
Thanks everyone.

Owner

This was finally done; feel free to comment after this issue is closed if you want to add your remarks. We are aware of a distribution, announced at https://discourse.nixos.org/t/opinionated-patchset-with-recursive-nix-planned-for-lix/63494, that plans to offer recursive-nix back.
Check them out if you really need that feature on top of Lix.

raito closed this issue 2025-04-28 21:01:29 +00:00
Member

I'm in the process of switching my organization over to lix now, and we have been actively using recursive-nix. We evaluate a large number of configurations in CI to check that they still evaluate for a PR, and I was using recursive-nix to offload that work to our farm of remote builders, in a way that integrates nicely with our CI. It also simplifies managing resources on our builders because it just uses build slots like everything else.

Owner

@bacchanalia wrote in #767 (comment):

> I'm in the process of switching my organization over to lix now, and we have been actively using recursive-nix. We evaluate a large number of configurations in CI to check that they still evaluate for a PR and I was using recursive-nix to offload that work to our farm of remote builders, in a way that integrates nicely with our CI and it simplifies managing resources on our builders because it just uses build slots like everything else.

I'm not deeply familiar with your CI system, but would you know why neither of the following two options is feasible in your case:

  • remove the use of nix build in the entrypoint of your CI so that you only run the inner nix build wrapped in the derivations?
  • add an impure sandbox path via /etc/nix/nix.conf pointing to Nix's ambient daemon socket, so that a nix build run inside the sandbox reuses the parent daemon socket. This is similar to what recursive-nix does, without all the store path restrictions and special casing that you do not seem to be using

Let us know if we are still missing a story in our understanding of your usage.

Member

@raito I was trying to get eval working within a build by exposing the socket to the sandbox as you suggested, but I'm getting
error: creating directory '/nix/var': Permission denied even if I include all the way up to /nix/var as an extra-sandbox-path.
(also, I'm not sure what you mean by the first suggestion?)

Author
Owner

/nix/var should be something that's overridable by environment variable, check the hastily written environment variable list in testing.md

Member

I managed to get it working after removing the group check in [daemon.cc](https://git.lix.systems/lix-project/lix/src/branch/main/lix/nix/daemon.cc#L264) and setting noChroot = true; on the derivation. I was unable to get it to work with extra-sandbox-paths, I think because paths required for eval are not available in the chroot.
Reference: lix-project/lix#767