Flake tarball immutable links are broken #358

Closed
opened 2024-05-28 09:52:56 +00:00 by getchoo · 15 comments

Describe the bug

The Link: <flakeref>; rel="immutable" HTTP header described in the tarball-fetcher docs doesn't seem to be respected, unlike upstream Nix

Steps To Reproduce

  1. Run nix flake show "https://flakehub.com/f/numtide/flake-utils/0.1.*.tar.gz" (or any URL that eventually redirects to the aforementioned Link HTTP header)
  2. Note the URL resolved

Expected behavior

The value of the Link HTTP header will be used in place of the explicit URL

nix --version output

nix (Lix, like Nix) 2.90.0pre20240527_0b91a4b

Additional context

This doesn't occur with upstream Nix. Note the resolved URL of each

$ nix run "git+https://git.lix.systems/lix-project/lix?rev=0b91a4b0ec79c27ee36d8a7e2afd7737cb825b65" -- flake show --refresh "https://flakehub.com/f/numtide/flake-utils/0.1.*.tar.gz"
fetching tarball input 'https://flakehub.com/f/numtide/flake-utils/0.1.%2A.tar.gz'
evaluating ''...
https://flakehub.com/f/numtide/flake-utils/0.1.%2A.tar.gz?narHash=sha256-SZ5L6eA7HJ/nmkzGG7/ISclqe6oZdOZTNoesiInkXPQ%3D
evaluating 'lib'...
├───lib: unknown
evaluating 'templates'...
└───templates
evaluating 'templates.check-utils'...
    ├───check-utils: template: A flake with tests
evaluating 'templates.default'...
    ├───default: template: A flake using flake-utils.lib.eachDefaultSystem
evaluating 'templates.each-system'...
    ├───each-system: template: A flake using flake-utils.lib.eachDefaultSystem
evaluating 'templates.simple-flake'...
    └───simple-flake: template: A flake using flake-utils.lib.simpleFlake

$ nix run "github:NixOS/nix/ef96a58ed73e2652fa27e1bb58e6f5c65e13cfa7" -- flake show --refresh "https://flakehub.com/f/numtide/flake-utils/0.1.*.tar.gz"
warning: unknown experimental feature 'repl-flake'
https://api.flakehub.com/f/pinned/numtide/flake-utils/0.1.92%2Brev-b1d9ab70662946ef0850d488da1c9019f3a9752a/018e2ca5-e5a2-7f80-9261-445a8cecd4d7/source.tar.gz?narHash=sha256-SZ5L6eA7HJ/nmkzGG7/ISclqe6oZdOZTNoesiInkXPQ%3D
├───lib: unknown
└───templates
    ├───check-utils: template: A flake with tests
    ├───default: template: A flake using flake-utils.lib.eachDefaultSystem
    ├───each-system: template: A flake using flake-utils.lib.eachDefaultSystem
    └───simple-flake: template: A flake using flake-utils.lib.simpleFlake
## Describe the bug The `Link: <flakeref>; rel="immutable"` HTTP header described in the [tarball-fetcher](https://git.lix.systems/lix-project/lix/src/commit/0b91a4b0ec79c27ee36d8a7e2afd7737cb825b65/doc/manual/src/protocols/tarball-fetcher.md) docs doesn't seem to be respected, unlike upstream Nix ## Steps To Reproduce 1. Run `nix flake show "https://flakehub.com/f/numtide/flake-utils/0.1.*.tar.gz"` (or any URL that eventually redirects to the aforementioned `Link` HTTP header) 2. Note the URL resolved ## Expected behavior The value of the `Link` HTTP header will be used in place of the explicit URL ## `nix --version` output ``` nix (Lix, like Nix) 2.90.0pre20240527_0b91a4b ``` ## Additional context This doesn't occur with upstream Nix. Note the resolved URL of each ```sh $ nix run "git+https://git.lix.systems/lix-project/lix?rev=0b91a4b0ec79c27ee36d8a7e2afd7737cb825b65" -- flake show --refresh "https://flakehub.com/f/numtide/flake-utils/0.1.*.tar.gz" fetching tarball input 'https://flakehub.com/f/numtide/flake-utils/0.1.%2A.tar.gz' evaluating ''... https://flakehub.com/f/numtide/flake-utils/0.1.%2A.tar.gz?narHash=sha256-SZ5L6eA7HJ/nmkzGG7/ISclqe6oZdOZTNoesiInkXPQ%3D evaluating 'lib'... ├───lib: unknown evaluating 'templates'... └───templates evaluating 'templates.check-utils'... ├───check-utils: template: A flake with tests evaluating 'templates.default'... ├───default: template: A flake using flake-utils.lib.eachDefaultSystem evaluating 'templates.each-system'... ├───each-system: template: A flake using flake-utils.lib.eachDefaultSystem evaluating 'templates.simple-flake'... └───simple-flake: template: A flake using flake-utils.lib.simpleFlake $ nix run "github:NixOS/nix/ef96a58ed73e2652fa27e1bb58e6f5c65e13cfa7" -- flake show --refresh "https://flakehub.com/f/numtide/flake-utils/0.1.*.tar.gz" warning: unknown experimental feature 'repl-flake' https://api.flakehub.com/f/pinned/numtide/flake-utils/0.1.92%2Brev-b1d9ab70662946ef0850d488da1c9019f3a9752a/018e2ca5-e5a2-7f80-9261-445a8cecd4d7/source.tar.gz?narHash=sha256-SZ5L6eA7HJ/nmkzGG7/ISclqe6oZdOZTNoesiInkXPQ%3D ├───lib: unknown └───templates ├───check-utils: template: A flake with tests ├───default: template: A flake using flake-utils.lib.eachDefaultSystem ├───each-system: template: A flake using flake-utils.lib.eachDefaultSystem └───simple-flake: template: A flake using flake-utils.lib.simpleFlake ```
getchoo added the
bug
label 2024-05-28 09:52:56 +00:00
Owner

This is very interesting, because this functionality does work at some level. See:

~ » nix --version
nix (Lix, like Nix) 2.90.0-lixpre20240520-992c63f

~ » nix flake metadata --refresh https://git.lix.systems/lix-project/lix/archive/main.tar.gz
Resolved URL:  https://git.lix.systems/lix-project/lix/archive/main.tar.gz
Locked URL:    https://git.lix.systems/api/v1/repos/lix-project/lix/archive/71b32bb87cd48dbbd672c8ca6b041ed36f3bae11.tar.gz?narHash=sha256-NH7keFO8/FpGZp5OLmmP5wPGRYZ%2BMtWfdZNa1HPmqUA%3D
Description:   The purely functional package manager
Path:          /nix/store/rx615n42hl3ykkypldx574y0cnm30vh9-source
Revision:      71b32bb87cd48dbbd672c8ca6b041ed36f3bae11
Last modified: 2024-05-28 04:01:37
Inputs:
├───flake-compat: github:edolstra/flake-compat/0f9255e01c2351cc7d116c072cb317785dd33b33
├───nixpkgs: github:NixOS/nixpkgs/0c592f9a288bdf764b6f24c757277c0e49757a46
├───nixpkgs-regression: github:NixOS/nixpkgs/215d4d0fd80ca5163643b03a33fde804a29cc1e2
└───pre-commit-hooks: github:cachix/git-hooks.nix/e35aed5fda3cc79f88ed7f1795021e559582093a

I think there was some work on the URL handling that might be related, @ma27 may know?

This is a regression since 2.18. Marking as release blocker.

This is very interesting, because this functionality *does* work at some level. See: ``` ~ » nix --version nix (Lix, like Nix) 2.90.0-lixpre20240520-992c63f ~ » nix flake metadata --refresh https://git.lix.systems/lix-project/lix/archive/main.tar.gz Resolved URL: https://git.lix.systems/lix-project/lix/archive/main.tar.gz Locked URL: https://git.lix.systems/api/v1/repos/lix-project/lix/archive/71b32bb87cd48dbbd672c8ca6b041ed36f3bae11.tar.gz?narHash=sha256-NH7keFO8/FpGZp5OLmmP5wPGRYZ%2BMtWfdZNa1HPmqUA%3D Description: The purely functional package manager Path: /nix/store/rx615n42hl3ykkypldx574y0cnm30vh9-source Revision: 71b32bb87cd48dbbd672c8ca6b041ed36f3bae11 Last modified: 2024-05-28 04:01:37 Inputs: ├───flake-compat: github:edolstra/flake-compat/0f9255e01c2351cc7d116c072cb317785dd33b33 ├───nixpkgs: github:NixOS/nixpkgs/0c592f9a288bdf764b6f24c757277c0e49757a46 ├───nixpkgs-regression: github:NixOS/nixpkgs/215d4d0fd80ca5163643b03a33fde804a29cc1e2 └───pre-commit-hooks: github:cachix/git-hooks.nix/e35aed5fda3cc79f88ed7f1795021e559582093a ``` I think there was some work on the URL handling that might be related, @ma27 may know? This is a regression since 2.18. Marking as release blocker.
jade added this to the v2.90 milestone 2024-05-28 18:12:49 +00:00
jade added the
release-blocker
label 2024-05-28 18:12:53 +00:00
jade self-assigned this 2024-05-28 18:23:55 +00:00
Owner
dev/clipper # nix run nixpkgs#nixVersions.nix_2_18 -- flake metadata --refresh 'https://flakehub.com/f/numtide/flake-utils/0.1.*.tar.gz'

Resolved URL:  https://flakehub.com/f/numtide/flake-utils/0.1.%2A.tar.gz
Locked URL:    https://api.flakehub.com/f/pinned/numtide/flake-utils/0.1.92%2Brev-b1d9ab70662946ef0850d488da1c9019f3a9752a/018e2ca5-e5a2-7f80-9261-445a8cecd4d7/source.tar.gz?narHash=sha256-SZ5L6eA7HJ/nmkzGG7/ISclqe6oZdOZTNoesiInkXPQ%3D
Description:   Pure Nix flake utility functions
Path:          /nix/store/na7sykizsgkzh9i3wc8m8pz5xfqib2rv-source
Revision:      b1d9ab70662946ef0850d488da1c9019f3a9752a
Revisions:     92
Last modified: 2024-03-11 01:33:50
Inputs:
└───systems: github:nix-systems/default/da67096a3b9bf56a91d16901293e51ba5b49a27e
dev/clipper # nix flake metadata --refresh 'https://flakehub.com/f/numtide/flake-utils/0.1.*.tar.gz' 
Resolved URL:  https://flakehub.com/f/numtide/flake-utils/0.1.%2A.tar.gz
Locked URL:    https://flakehub.com/f/numtide/flake-utils/0.1.%2A.tar.gz?narHash=sha256-SZ5L6eA7HJ/nmkzGG7/ISclqe6oZdOZTNoesiInkXPQ%3D
Description:   Pure Nix flake utility functions
Path:          /nix/store/na7sykizsgkzh9i3wc8m8pz5xfqib2rv-source
Last modified: 2024-03-11 01:33:50
Inputs:
└───systems: github:nix-systems/default/da67096a3b9bf56a91d16901293e51ba5b49a27e

Confirmed the bug is a regression since Nix 2.18. We can go bisect it later today.

``` dev/clipper # nix run nixpkgs#nixVersions.nix_2_18 -- flake metadata --refresh 'https://flakehub.com/f/numtide/flake-utils/0.1.*.tar.gz' Resolved URL: https://flakehub.com/f/numtide/flake-utils/0.1.%2A.tar.gz Locked URL: https://api.flakehub.com/f/pinned/numtide/flake-utils/0.1.92%2Brev-b1d9ab70662946ef0850d488da1c9019f3a9752a/018e2ca5-e5a2-7f80-9261-445a8cecd4d7/source.tar.gz?narHash=sha256-SZ5L6eA7HJ/nmkzGG7/ISclqe6oZdOZTNoesiInkXPQ%3D Description: Pure Nix flake utility functions Path: /nix/store/na7sykizsgkzh9i3wc8m8pz5xfqib2rv-source Revision: b1d9ab70662946ef0850d488da1c9019f3a9752a Revisions: 92 Last modified: 2024-03-11 01:33:50 Inputs: └───systems: github:nix-systems/default/da67096a3b9bf56a91d16901293e51ba5b49a27e dev/clipper # nix flake metadata --refresh 'https://flakehub.com/f/numtide/flake-utils/0.1.*.tar.gz' Resolved URL: https://flakehub.com/f/numtide/flake-utils/0.1.%2A.tar.gz Locked URL: https://flakehub.com/f/numtide/flake-utils/0.1.%2A.tar.gz?narHash=sha256-SZ5L6eA7HJ/nmkzGG7/ISclqe6oZdOZTNoesiInkXPQ%3D Description: Pure Nix flake utility functions Path: /nix/store/na7sykizsgkzh9i3wc8m8pz5xfqib2rv-source Last modified: 2024-03-11 01:33:50 Inputs: └───systems: github:nix-systems/default/da67096a3b9bf56a91d16901293e51ba5b49a27e ``` Confirmed the bug is a regression since Nix 2.18. We can go bisect it later today.

Hi, I managed to bisect to this commit d0390b5cf2

Hi, I managed to bisect to this commit d0390b5cf2d232febaa89aa6d8b07c547513a460

Reverting said commit seems to fix the problem. At least for me.

$ nix run . -- flake show --refresh "https://flakehub.com/f/numtide/flake-utils/0.1.*.tar.gz"
https://flakehub.com/f/numtide/flake-utils/0.1.%2A.tar.gz?narHash=sha256-SZ5L6eA7HJ/nmkzGG7/ISclqe6oZdOZTNoesiInkXPQ%3D
├───lib: unknown
└───templates
    ├───check-utils: template: A flake with tests
    ├───default: template: A flake using flake-utils.lib.eachDefaultSystem
    ├───each-system: template: A flake using flake-utils.lib.eachDefaultSystem
    └───simple-flake: template: A flake using flake-utils.lib.simpleFlake
Reverting said commit seems to fix the problem. At least for me. ```bash $ nix run . -- flake show --refresh "https://flakehub.com/f/numtide/flake-utils/0.1.*.tar.gz" https://flakehub.com/f/numtide/flake-utils/0.1.%2A.tar.gz?narHash=sha256-SZ5L6eA7HJ/nmkzGG7/ISclqe6oZdOZTNoesiInkXPQ%3D ├───lib: unknown └───templates ├───check-utils: template: A flake with tests ├───default: template: A flake using flake-utils.lib.eachDefaultSystem ├───each-system: template: A flake using flake-utils.lib.eachDefaultSystem └───simple-flake: template: A flake using flake-utils.lib.simpleFlake ```
Owner

this code base is absolutely haunted. holy shit

this code base is absolutely *haunted*. holy shit
Owner

that revert doesn't seem to fix it for us? what on earth

that revert doesn't seem to fix it for us? what on earth
Member

This issue is very cursed and related to filetransfer.cc refactorings, most likely 86bfede948. In the CppNix version, Link headers are taken into account at any point into the redirect chain. In the Lix version, only Link headers in the last HTTP request are considered, because they're read by getHeader at the end of the request instead of from the streamed headers callback.

cc @pennae

This issue is very cursed and related to filetransfer.cc refactorings, most likely 86bfede948db03181cdf2989ee33b96905366f1b. In the CppNix version, `Link` headers are taken into account at any point into the redirect chain. In the Lix version, only `Link` headers in the last HTTP request are considered, because they're read by `getHeader` at the end of the request instead of from the streamed headers callback. cc @pennae
Member

Note: I'm not convinced that this is how Link should work, and it would also be perfectly acceptable to consider this a bug and ask Flakehub to fix it server side, imo. Or the documentation should be improved along the way.

Note: I'm not convinced that this is *how* `Link` should work, and it would also be perfectly acceptable to consider this a bug and ask Flakehub to fix it server side, imo. Or the documentation should be improved along the way.
Member

And unlike content-encoding and accept-ranges, this can't be moved to writeCallback because most redirects won't have a body and writeCallback is never called for them. In case someone was thinking of trying this fix... (I just did.)

And unlike `content-encoding` and `accept-ranges`, this can't be moved to `writeCallback` because most redirects won't have a body and `writeCallback` is never called for them. In case someone was thinking of trying this fix... (I just did.)
Owner

Note: I'm not convinced that this is how Link should work, and it would also be perfectly acceptable to consider this a bug and ask Flakehub to fix it server side, imo. Or the documentation should be improved along the way.

We have to accept ones in the last redirect because the last redirect might be to s3 or something that can't add headers, as is the case in practice here. That said, do we even have coverage on this lmao.

> Note: I'm not convinced that this is *how* `Link` should work, and it would also be perfectly acceptable to consider this a bug and ask Flakehub to fix it server side, imo. Or the documentation should be improved along the way. We have to accept ones in the last *redirect* because the last redirect might be to s3 or something that can't add headers, as is the case in practice here. That said, do we even have coverage on this lmao.
Member

Someone who's willing to interact with the CppNix team should go ask them to better define their protocol, because right now it's a mess. There's nothing in the documentation which specifies headers should be read from intermediate redirects (it's definitely not the usual way HTTP APIs work...). If intermediate redirects are supposed to be able to set a Link header then what happens when there are multiple ones in the chain?

Right now the CppNix implementation does "take the latest value found in the chain" (incl. in the case where multiple Link headers are returned in the same request). It's also not really conforming to the HTTP standards since Link can have multiple comma-separated values, which CppNix doesn't support... So many problems :/

Someone who's willing to interact with the CppNix team should go ask them to better define their protocol, because right now it's a mess. There's nothing in the documentation which specifies headers should be read from intermediate redirects (it's definitely not the usual way HTTP APIs work...). If intermediate redirects are supposed to be able to set a `Link` header then what happens when there are multiple ones in the chain? Right now the CppNix implementation does "take the latest value found in the chain" (incl. in the case where multiple `Link` headers are returned in the same request). It's also not really conforming to the HTTP standards since `Link` can have multiple comma-separated values, which CppNix doesn't support... So many problems :/
jade changed title from Flake tarballs are broken to Flake tarball immutable links are broken 2024-05-29 06:53:33 +00:00
Owner

I mean, we can define the FlakeHub protocolimmutable tarball link protocol ourselves, but it's genuinely a very bad protocol to begin with. I think the best thing for interop is to collect them from all intermediate redirects since it would be very surprising behaviour if only the last redirect's Link header were considered (since the s3-like would then break it by putting in a redirect to itself, e.g.) :/

It is a thing that exists that was shipped and is useful, but it definitely has regrettable design choices. Also, like, why is it rel="immutable" but it's a flakeref?! No reasonable implementer who isn't nix would be likely to use it if the defined interface is "it's a flakeref" tbh. We should probably also document some subset of fields that happens to look like a flakeref that we do consider in it, and ignore all other ones, to make it not a flake thing anymore at least.

I mean, we can define the ~~FlakeHub protocol~~immutable tarball link protocol ourselves, but it's genuinely a very bad protocol to begin with. I think the best thing for interop is to collect them from all intermediate redirects since it would be very surprising behaviour if only the last redirect's Link header were considered (since the s3-like would then break it by putting in a redirect to itself, e.g.) :/ It *is* a thing that exists that was shipped and is useful, but it definitely has regrettable design choices. Also, like, why is it `rel="immutable"` but it's a *flakeref*?! No reasonable implementer who isn't nix would be likely to use it if the defined interface is "it's a flakeref" tbh. We should probably also document some subset of fields that happens to look like a flakeref that we *do* consider in it, and ignore all other ones, to make it *not a flake thing anymore* at least.
Owner

horrors suggests this could be instead implemented with spec compliant http redirects: 0.6.*.tar.gz --302-> 0.6.2.tar.gz --301-> some-hash.tar.gz --302-> horrible-s3-thing.tar.gz

We would then store the last 301 in the chain. Obviously we should actually fix the present flakehub regression tho.

horrors suggests this could be instead implemented with spec compliant http redirects: `0.6.*.tar.gz --302-> 0.6.2.tar.gz --301-> some-hash.tar.gz --302-> horrible-s3-thing.tar.gz` We would then store the last 301 in the chain. Obviously we should actually fix the present flakehub regression tho.
Owner

Btw: spec for Link header if we want to write a spec compliant parser: https://httpwg.org/specs/rfc8288.html#parse-fv

Btw: spec for Link header if we want to write a spec compliant parser: https://httpwg.org/specs/rfc8288.html#parse-fv
jade removed their assignment 2024-05-29 07:26:58 +00:00
jade added the
Area/flakes
label 2024-05-29 07:47:29 +00:00
pennae self-assigned this 2024-05-29 17:22:29 +00:00

Hi 👋

Ryan kindly brought this thread to our attention in the meeting today; much appreciated.

I was the only one (not from DetSys) who somewhat reviewed this feature before it was merged rather soon. I had let that slip because I figured it'd be ok to return to it when users have feedback, which I guess is now. (Didn't expect Lix, but hey)

this code base is absolutely haunted. holy shit

I know I know.
I hope I can help a bit. Occasionally, something is going to be weird for a reason. I'd give it a 50/50 in general.

In the CppNix version, Link headers are taken into account at any point into the redirect chain.

cc @pennae

This is by design; the idea being that an API can add this information in the redirect response, so that the actual bulk data can be handled by a dumb CDN, bucket, etc.

I agree with @delroth that this is generally not how HTTP clients work, but I would consider this to be a case of being liberal in what we accept.

The 302 -> 301 -> 302 sequence is interesting. That does look like it fits the bill, but it feels less intentional than the Link-based syntax. Would any such sequence of redirects just work? That might not be desirable. It also seems to require more roundtrips, which is a bit unfortunate.

I'm also reading that our Link parser isn't compliant. That definitely seems like a bug to be fixed.

why is it rel="immutable" but it's a flakeref?! No reasonable implementer who isn't nix would be likely to use it if the defined interface is "it's a flakeref" tbh.

I thought I originally proposed in my reviews for it to be a normal HTTP url with extra Link parameters (ie at the level of rel) rather than query parameters, but I can't find that in the thread.
We do enforce that the flakeref is an http(s) one, for what it's worth.

What do you think of having a syntax like this instead? Link <http://example.com/normal/url/foo.tar.gz>; rel="immutable"; rev="abcdef" Or might the original X-Nix-Rev` header be better?

Hi 👋 Ryan kindly brought this thread to our attention in the meeting today; much appreciated. I was the only one (not from DetSys) who somewhat reviewed this feature before it was merged rather soon. I had let that slip because I figured it'd be ok to return to it when users have feedback, which I guess is now. (Didn't expect Lix, but hey) > this code base is absolutely *haunted*. holy shit I know I know. I hope I can help a bit. Occasionally, something is going to be weird for a reason. I'd give it a 50/50 in general. > In the CppNix version, `Link` headers are taken into account at any point into the redirect chain. > > cc @pennae This is by design; the idea being that an API can add this information in the redirect response, so that the actual bulk data can be handled by a dumb CDN, bucket, etc. I agree with @delroth that this is generally not how HTTP clients work, but I would consider this to be a case of being liberal in what we accept. The 302 -> 301 -> 302 sequence is interesting. That does look like it fits the bill, but it feels less intentional than the `Link`-based syntax. Would any such sequence of redirects just work? That might not be desirable. It also seems to require more roundtrips, which is a bit unfortunate. I'm also reading that our `Link` parser isn't compliant. That definitely seems like a bug to be fixed. > why is it `rel="immutable"` but it's a *flakeref*?! No reasonable implementer who isn't nix would be likely to use it if the defined interface is "it's a flakeref" tbh. I thought I originally proposed in my reviews for it to be a normal HTTP url with extra `Link` parameters (ie at the level of `rel`) rather than query parameters, but I can't find that in the thread. We do enforce that the flakeref is an http(s) one, for what it's worth. What do you think of having a syntax like this instead? `Link <http://example.com/normal/url/foo.tar.gz>; rel="immutable"; rev="abcdef" Or might the original `X-Nix-Rev` header be better?
Sign in to join this conversation.
No milestone
No project
No assignees
6 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: lix-project/lix#358
No description provided.