inconsistent/duplicate lix in hydra closure #18

Closed
opened 2025-01-20 13:36:48 +00:00 by benaryorg · 5 comments
Contributor

I am not sure where exactly this originates, given that it involves three repos (lix and its nixos-module, hydra, and nix-eval-jobs) and two overlays (again, lix and hydra). I haven't gotten around to building a minimal reproducer yet, since I only noticed this yesterday when upgrading from Lix 2.91 to 2.92 (and given my current irl schedule I might not get around to it for 6 weeks >.<).
There is a very much not minimal reproducer over here though (IPv6 only): https://hydra.shell.bsocat.net/build/27039

The issue happens when using the above two overlays at the same time (probably, as said, no minimal reproducer yet).
The hydra closure on my end produces two different lix builds as dependencies:

  • lix-2.92.0-pre20250118-0795280
  • lix-2.92.0pre20250118_0795280

which diff like this (nix-diff /nix/store/8lmfrx837dyx5ai3m0lg9plzmllm0n5h-lix-2.92.0-pre20250118-0795280.drv /nix/store/5d5idq5chw6i2cdki9b99wpwj96sh6rm-lix-2.92.0pre20250118_0795280.drv):

- /nix/store/8lmfrx837dyx5ai3m0lg9plzmllm0n5h-lix-2.92.0-pre20250118-0795280.drv:{out}
+ /nix/store/5d5idq5chw6i2cdki9b99wpwj96sh6rm-lix-2.92.0pre20250118_0795280.drv:{out}
• The environments do not match:
    VERSION_SUFFIX=←-pre20250118-0795280←→pre20250118_0795280→
    name=←lix-2.92.0-pre20250118-0795280←→lix-2.92.0pre20250118_0795280→
    version=←2.92.0-pre20250118-0795280←→2.92.0pre20250118_0795280→

Yes, the dash before pre is the only difference, so the versions are virtually identical, which means it's not urgent by any means; however, it points at some inconsistency when using the overlays that would be nice to have resolved (so that building two versions of Lix per architecture can be avoided).
The one without a dash is pulled in only by nix-eval-jobs (in turn pulled in by hydra), while the one with dash is used for everything else (including hydra and its perl bindings).

It looks to me that this is because I am using the flake version which in its overlay has a dependency on the package output of the nix-eval-jobs flake since there is no overlay for the lix version of nix-eval-jobs in its respective repository.
However, since the lix overlay itself already overrides nix-eval-jobs, I'm thinking that this one reference to the package output may be causing the issue in my case, and there may be a general issue of the overlay and the flake output of something producing slightly different variations of the same package.


For reference, my current workaround adds an overlay like the following which seems to replace the flake lix with the overlay lix for nix-eval-jobs.

nix-eval-jobs = prev.nix-eval-jobs.override { inherit (final) nix; };
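For context, here is a minimal sketch of how an overlay entry like the one above might be wired into a NixOS configuration. It assumes that the lix and hydra overlays are already applied earlier in the overlay list (not shown), so that `final.nix` resolves to the Lix from the overlay; the surrounding module is illustrative and not taken from the linked commit.

```nix
# Hypothetical NixOS module; only the override line itself comes from the report.
{ ... }:
{
  nixpkgs.overlays = [
    # Assumption: the lix and hydra overlays are listed before this one.
    (final: prev: {
      # Rebuild nix-eval-jobs against the `nix` of the final package set
      # instead of the Lix that the nix-eval-jobs flake output brings along.
      nix-eval-jobs = prev.nix-eval-jobs.override { inherit (final) nix; };
    })
  ];
}
```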

I'll try to get a minimal reproducer going but given my scarce availability in the near future I wanted to report this in case it causes issues elsewhere.


As for a possible resolution, I think making the hydra overlay depend on the lix overlay would be a good solution since you'd likely want the nix-daemon to be lix too, and the lix overlay would already override the nix-eval-jobs properly.
This would increase the number of dependency flakes slightly, but in a more clean way IMHO (a dependency tree rather than a splat version of it accumulated in the hydra flake).
Alternatively giving nix-eval-jobs its own overlay may be a solution but that seems wildly overkill given that the lix overlay can already do that.
Either of these would make it so the overlays depend on the same version of lix, consistently.
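The first option could be sketched roughly as follows, as one combined overlay built with `lib.composeManyExtensions` from nixpkgs. The argument names `lixOverlay` and `hydraOverlay` are placeholders for whatever the respective flakes export; this is not how either repository is structured today.

```nix
# Hypothetical helper file; all argument names are placeholders.
{ lib, lixOverlay, hydraOverlay }:

# Apply the lix overlay first so that the hydra overlay (and anything it
# pulls in, such as nix-eval-jobs) sees the same `nix` from the package set.
lib.composeManyExtensions [
  lixOverlay
  hydraOverlay
]
```

A consumer would then add only the composed overlay to `nixpkgs.overlays`, so both pieces cannot drift apart on the consumer side.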

A second part would be to figure out where the dash-less version of the package is coming from exactly, since that one (IMHO) shouldn't exist in the first place (which would've hidden the above issue until someone used overlays to override lix), since the flake output and the overlay output probably should have the exact same version specification when provided with the same source.

benaryorg changed title from inconsistent/duplicate nix in hydra closure to inconsistent/duplicate lix in hydra closure 2025-01-20 13:37:00 +00:00
Member

Hmm yeah this pretty much seems like a "flake packages vs. overlays" thing which I consider a variant of the 1000 instances of nixpkgs problem which is probably my primary frustration I have with flakes.

I think making the hydra overlay depend on the lix overlay would be a good solution

First of all, a question: what is pkgs.nix on the affected system? The reason I'm asking is, I think it implicitly depends on this overlay already: the package.nix of Hydra takes nix as an argument, but I'm pretty sure that at least 2.92 and main would fail to build against any CppNix, so you already need to have an overlay in place that essentially does nix = self.lix. At least using overlayNixpkgsForThisHydra would make sure both are included.

That said, I see at least three places where we instantiate a nixpkgs when using Hydra in a NixOS configuration:

  • nix-eval-jobs (as imported by Hydra) uses the default package from Lix, which is created by doing an import nixpkgs with the Lix overlay[1], plus a second instantiation in nix-eval-jobs itself, since it just uses pkgs from flake-parts, which is essentially inputs.nixpkgs.legacyPackages.
  • The NixOS configuration with Hydra has its own nixpkgs which consumes the overlay from the flake.

So what we'd need is to make sure that each nixpkgs has the lix overlay, correct?
The Lix one is obvious, the nix-eval-jobs one and Hydra one would need the same overlay (and same nixpkgs version) to get the same Lix which seems kinda error-prone to me (especially given that we don't have much of an influence over the nixpkgs revisions).

This is exactly the reason why I prefer overlays despite the downsides these have.

So I think what I'd like to see is:

  • nix-eval-jobs provides an overlay. I'd argue that the overlay is OK to assume that nix / lix is correctly injected by another overlay. If people don't want to do this, they can also use packages.nix-eval-jobs.
  • Downstream users may use nixosModules.hydra & nixosModules.overlayNixpkgsForThisHydra (we should probably add a nixosModules.default that imports both). Then, everything "just works", assuming all packages build on the nixpkgs rev used. We can test this to a certain degree, but if a user (rightfully) does inputs.hydra.inputs.nixpkgs.follows = "nixpkgs";, they're effectively on their own.
  • Said nixosModules.overlayNixpkgsForThisHydra includes the overlays of Lix, nix-eval-jobs & Hydra. The callPackage ./package.nix in the overlay does not override the nix-eval-jobs input.
  • Using the Hydra overlay yourself means you're taking care of the other overlays as well. We should document that explicitly.

Does that make sense?

[1]

inherit (nixpkgsFor.${system}.native) nix;
default = nix;

(the nixpkgsFor does the import)
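The module layout proposed in the list above might look roughly like the following sketch. It is written as a function so that it stands alone; `lixOverlay` and `nixEvalJobsOverlay` are placeholder arguments (the latter overlay does not exist yet according to this thread), and the attribute `overlays.default` on the Hydra flake is an assumption.

```nix
# Hypothetical fragment of the Hydra flake outputs, not the current code.
{ self, lixOverlay, nixEvalJobsOverlay }:
{
  nixosModules = {
    # Pulls in all three overlays so that every package sees the same Lix.
    overlayNixpkgsForThisHydra = {
      nixpkgs.overlays = [
        lixOverlay
        nixEvalJobsOverlay
        self.overlays.default # assumed name of the Hydra overlay
      ];
    };

    # Convenience module importing both the service and the overlays.
    default = {
      imports = [
        self.nixosModules.hydra
        self.nixosModules.overlayNixpkgsForThisHydra
      ];
    };
  };
}
```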

ma27 closed this issue 2025-02-08 16:02:30 +00:00
Author
Contributor

Hi, I guess.
As mentioned I've been gone for several weeks and I'm still working on backlog.

Hmm yeah this pretty much seems like a "flake packages vs. overlays" thing which I consider a variant of the 1000 instances of nixpkgs problem which is probably my primary frustration I have with flakes.

I happened to not run into that issue for years, but over the past few months I have been getting increasingly frustrated in that regard too *sigh*

First of all, a question: what is pkgs.nix on the affected system?

First I reverted my hacky workaround and then I got this:

% nixosConfigurations.hydra.pkgs.hydra.buildInputs
[
  # […]
  «derivation /nix/store/h92yqr00saa04j9ljc47r46wbm8mahw4-lix-2.92.0pre20250221_db55ca9.drv»
  # […]
]

% nixosConfigurations.hydra.pkgs.nix
«derivation /nix/store/h92yqr00saa04j9ljc47r46wbm8mahw4-lix-2.92.0pre20250221_db55ca9.drv»

% nixosConfigurations.hydra.pkgs.lix
«derivation /nix/store/kf349nypjcjqmsrrv203i0p7hikhx0vp-lix-2.92.0-pre20250221-db55ca9.drv»

Note that this is using 7b3d065a13 of the lix-2.92 branch which did not receive a backport of dbb3e2a8c7 (kind of reasonable given the changed inputs).

Turns out, removing the lix overlay fixes this.
I have both the lix nixos-module as well as the lix overlay enabled here because I wanted to provide legacyPackages as an output, extended with all the used overlays, so that I can use the package set used by my machines without instantiating a NixOS configuration.
Well, turns out:

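lix flake.nix (https://git.lix.systems/lix-project/lix/src/commit/ca89e431a31527a014bfd0d529da2a8099027a5f/flake.nix#L68-L74):

versionSuffix =
  if officialRelease then
    ""
  else
    "pre${
      builtins.substring 0 8 (self.lastModifiedDate or self.lastModified or "19700101")
    }_${self.shortRev or "dirty"}";

nixos-module flake.nix (https://git.lix.systems/lix-project/nixos-module/src/commit/621aae0f3cceaffa6d73a4fb0f89c08d338d729e/flake.nix#L11-L15):

let
  lixVersionJson = builtins.fromJSON (builtins.readFile (lix + "/version.json"));
  versionSuffix = nixpkgs.lib.optionalString (!lixVersionJson.official_release)
    "-pre${builtins.substring 0 8 lix.lastModifiedDate}-${lix.shortRev or lix.dirtyShortRev}";
in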

The lix overlay and the lix nixos-module have different version suffixes defined.
This means the problem here isn't hydra, but the inconsistency between the above two repositories.
As mentioned I only noticed this on my hydra instance and was a bit strapped for time so I couldn't look closer at the time.

I would like to open an issue on the respective repository to fix that inconsistency but I don't know which one is the canonically desired suffix (i.e. with or without dash).
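To make the difference concrete, the two suffix styles quoted above can be reduced to a small standalone expression. The `date` and `shortRev` values are stand-ins taken from the store paths in this comment, not read from real flake metadata, and the attribute names are made up for illustration.

```nix
# Sketch only: `date`, `shortRev` and `version` are hard-coded stand-ins.
let
  date = "20250221";
  shortRev = "db55ca9";
  version = "2.92.0";

  # Style of the first definition: no leading dash, underscore before the revision.
  flakeSuffix = "pre${date}_${shortRev}";

  # Style of the second definition: leading dash, dash before the revision.
  moduleSuffix = "-pre${date}-${shortRev}";
in
{
  flakeName = "lix-${version}${flakeSuffix}";
  moduleName = "lix-${version}${moduleSuffix}";
  identical = flakeSuffix == moduleSuffix;
}
```

Saved as, say, `suffix.nix`, this can be evaluated with `nix-instantiate --eval --strict suffix.nix`. The two names come out as `lix-2.92.0pre20250221_db55ca9` and `lix-2.92.0-pre20250221-db55ca9`, and `identical` is `false`, which is why the same source yields two distinct derivations.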

Author
Contributor

Summary

Now that I've done my research I can confirm that nix-eval-jobs was pulled from the flake by hydra, yielding nix-eval-jobs-2.92.0 (whereas the overlay has nix-eval-jobs-2.92.0-lix-df3edf3), and because nix-eval-jobs also pulled in the flake output of lix, this pulled in lix-2.92.0pre20250221_db55ca9 from the flake rather than the overlay version of lix-2.92.0-pre20250221-db55ca9.

The nix-eval-jobs path was already resolved via 7b3d065a13 (although a backport to the lix-2.92 would be appreciated).
Given that the lix module and overlay are diverging (as per code references of the last comment), that should be fixed too, ideally in a way that uses some sort of composition to ensure that there is only one single lix overlay definition – which builds the package using the overlay rather than the flake output – which is reused by the nixos-module without duplicating it as to prevent the versionSuffix (or anything else) from drifting.

As a personal side-note; my application-related flakes (i.e. not my NixOS configurations) all have their package definitions in an overlay, and the flake output merely instantiates nixpkgs with its overlay as well as overlays of dependencies where needed and returns the application directly from that nixpkgs instantiation. Whenever dependencies exist, I create two overlays, one for only the application in question, and one including overlays of dependencies, that way consumers can decide whether to pull in just a single overlay or whether to pull in each overlay individually (which may be required when there are conflicting versions).
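The pattern described in this side-note could look something like the sketch below. It is not copied from the linked flake; `myapp`, `./package.nix`, the single hard-coded system and the empty `dependencyOverlays` list are all placeholders.

```nix
{
  description = "Sketch: package defined in an overlay, flake output built from it";

  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs }:
    let
      system = "x86_64-linux";

      # Overlays of dependency flakes would be listed here (none in this sketch).
      dependencyOverlays = [ ];

      # One nixpkgs instantiation that carries the full overlay.
      pkgs = import nixpkgs {
        inherit system;
        overlays = [ self.overlays.full ];
      };
    in
    {
      # Only the application itself; consumers bring dependency overlays themselves.
      overlays.default = final: prev: {
        myapp = final.callPackage ./package.nix { };
      };

      # The application together with the overlays of its dependencies.
      overlays.full = nixpkgs.lib.composeManyExtensions
        (dependencyOverlays ++ [ self.overlays.default ]);

      # The package output is taken from the overlaid package set,
      # so overlay and flake output cannot disagree about the version.
      packages.${system}.default = pkgs.myapp;
    };
}
```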

Member

As mentioned I've been gone for several weeks and I'm still working on backlog.

All good, thanks for the feedback here!

I would like to open an issue on the respective repository to fix that inconsistency but I don't know which one is the canonically desired suffix (i.e. with or without dash).

I was about to ask, thanks!
I'd just point out the difference, make a suggestion and see if people disagree with the direction tbh.

although a backport to the lix-2.92 would be appreciated

Isn't 7b3d065a13 a commit from the 2.92 branch? Did you perhaps mix two revs up?

Given that the lix module and overlay are diverging (as per code references of the last comment), that should be fixed too, ideally in a way that uses some sort of composition to ensure that there is only one single lix overlay definition

Agreed.

Author
Contributor

I was about to ask, thanks! I'd just point out the difference, make a suggestions and see if people disagree with the direction tbh.

I just didn't know which repo, but now that I've double checked with a few different search terms, there seems to be one already: lix-project/lix#585

although a backport to the lix-2.92 would be appreciated

Isn't 7b3d065a13 a commit from the 2.92 branch? Did you perhaps mix two revs up?

100% correct, that should've been dbb3e2a8c7 of course.

I mean.… I finally got around to doing a reverse flake-compat on my infra recently (cutting eval times down to like 20%-50%, which is very telling of the "1000 instances of nixpkgs problem"), so my infra is free of any real flake usage. That (unsurprisingly) fixes this when using overlay and module composition on the consumer level, which solves it for me at least, and I'm probably one of like 3 people running a hydra instance anyway, so.… backport at your leisure, I don't depend on that change ^^

And in regards to touching the lix nixos-module ⇔ lix overlay situation, the mentioned issue already concerns itself with aligning the version numbers so if anything one could bring it up there as a side-note (which the automatic forgejo cross-reference should take care of).

Reference: lix-project/hydra#18