Stabilize and default to auto-allocate-uids

jade commented

2024-06-11 02:22:10 +00:00

Owner

I don't remember what the blockers were on this and if it is particularly broken, but it would be useful to not need `nixbld` users on Linux. We should figure out why we aren't doing it and do it at some point.

@puck was talking about doing this a while ago, and I forgot the context.

👍 1

jade added the stability and OS/Linux labels 2024-06-11 02:22:10 +00:00

jade added this to the post-release milestone 2024-06-11 02:22:19 +00:00

jade removed the OS/Linux label 2024-06-11 02:23:57 +00:00

alois31 commented

2024-06-14 17:57:08 +00:00

Member

While this feature might be slightly useful for the two things it enables (the `auto-allocate-uids` setting, and the `uid-range` system feature), I do not think it is suitable for stabilisation in its current state. The following describes some of the brokenness on Linux (see https://systemd.io/UIDS-GIDS/ for a good reference), in roughly decreasing order of severity:

1. The locking logic is racy. While it does check whether the start UID of the range is allocated (which is in fact a common practice), the lock is only acquired with respect to other Nix builds. Locking the UID properly would probably require the creation of a custom NSS module (yuck).
2. It is not checked that the `start-id` is aligned. Therefore, even if locking were done properly, the race against other implementations that only check the start of the UID range would persist.
3. The GID is not checked at all: it is simply assumed that whenever a UID is free, the same GID is too. (While questionable, this assumption might however actually be not that incorrect in practice.)
4. `uid-range` has weird failure modes in some sandbox modes (silently ignored in the multi-user sandbox without user namespacing, "error: feature 'uid-range' requires the setting 'auto-allocate-uids' to be enabled" in single-user mode).
5. The code handling `uid-range` has quite a bunch of sketchy conditionals in the sandbox setup code that I would not be confident in stabilising the behavior of. (But the non-`uid-range` behavior is pretty much just as bad, and some changes here might actually be feasible even at a later moment.)

Of course, this was not talking about Darwin at all, because I do not use or know about that platform.
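
To make the alignment and GID points concrete, here is a minimal Python sketch of the two missing checks. It is not Lix code: `RANGE_SIZE`, `is_aligned` and `range_start_is_free` are hypothetical names, and the 65536-wide range is an assumption for illustration. It also does nothing about the locking race, since a lookup followed by a later allocation is still a check-then-act sequence.

```python
import grp
import pwd

# Assumed width of one build's ID range (illustrative, not read from Lix).
RANGE_SIZE = 65536


def is_aligned(start_id, range_size=RANGE_SIZE):
    """Return True if start_id sits exactly on a range_size boundary."""
    return start_id % range_size == 0


def _exists(lookup, ident):
    """Run an NSS-style lookup; the pwd/grp functions raise KeyError on a miss."""
    try:
        lookup(ident)
    except KeyError:
        return False
    return True


def range_start_is_free(start_id, uid_lookup=pwd.getpwuid, gid_lookup=grp.getgrgid):
    """Check both the UID and the GID at the start of the range.

    The lookups are injectable so the logic can be exercised without
    depending on the accounts present on the local machine.
    """
    return not _exists(uid_lookup, start_id) and not _exists(gid_lookup, start_id)
```

With the default lookups this goes through NSS, so it sees whatever user databases the system has configured; other allocators that only probe the first ID of an aligned range would then at least agree on which ID to probe.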

pennae commented

2024-06-14 18:16:32 +00:00

Owner

on linux we could integrate with systemd to reuse the systemd uid/gid mapper. lix could dynamically create [json user records](https://systemd.io/USER_RECORD/) and have systemd assign uids/gids automatically.

jade commented

2024-06-14 19:38:19 +00:00

Author

Owner

Yeah I should be clear: I expect that, just like the cgroups feature https://git.lix.systems/lix-project/lix/issues/77, it is completely broken by design by virtue of not having looked at what systemd expects things to do. So it would possibly need a rewrite before stabilization.

alois31 commented

2024-06-16 18:30:05 +00:00

Member

> on linux we could integrate with systemd to reuse the systemd uid/gid mapper. lix could dynamically create [json user records](https://systemd.io/USER_RECORD/) and have systemd assign uids/gids automatically.

Do you have a link to documentation how one can let systemd allocate the UIDs automatically? Because running a (one!) daemon implementing the varlink interface is pretty close to the NSS module in "yuck" level.

jade commented

2024-06-16 19:15:51 +00:00

Author

Owner

Hmmm, so reading through the documentation I am gathering the impression that there's no mechanism to actually allocate uids. How do dynamic users work then, anyway? I mean you could just make up a randomized uid in the upper 32 bits, look it up, then register it, I suppose, which would make it very hard to accidentally collide, but this doesn't feel very good.
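
The "make up a random uid, look it up, then register it" idea can be sketched roughly as follows. This is a hedged illustration only: `pick_free_uid` and its parameters are hypothetical, the default bounds are the dynamic-user range as I read it in the systemd UIDS-GIDS document, and the registration step is left out entirely. The gap between the lookup and the registration is exactly where a collision could still slip in.

```python
import random

# Assumed bounds: the range the systemd UIDS-GIDS document lists for dynamic users.
DYNAMIC_UID_MIN = 61184
DYNAMIC_UID_MAX = 65519


def pick_free_uid(is_taken, lo=DYNAMIC_UID_MIN, hi=DYNAMIC_UID_MAX,
                  attempts=100, rng=None):
    """Pick a random UID in [lo, hi] for which is_taken(uid) is False.

    is_taken is a caller-supplied predicate (for example an NSS lookup).
    Raises RuntimeError if no free candidate turns up within `attempts` tries.
    """
    rng = rng or random.Random()
    for _ in range(attempts):
        candidate = rng.randint(lo, hi)  # both bounds are inclusive
        if not is_taken(candidate):
            return candidate
    raise RuntimeError("no free UID found after %d attempts" % attempts)
```

A caller would still need to register the returned UID somewhere that other allocators consult, which is the part this thread has not found a mechanism for.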

pennae commented

2024-06-16 19:33:57 +00:00

Owner

oh, that's the funky bit: we can just `systemd-run -p DynamicUser=true -p RemainAfterExit=true -p User=nixbld9001 true` and receive a dynamic user id that's available through systemd-userdbd, and thus NSS. when we're done with the uid we stop the unit that generated it and the id goes away. stick all of those into the scope that contains the nix daemon and add a dependency to stop the user creation services when the daemon shuts down ...

more research obviously needed, but this seems like the cleanest way to actually do this (and would give us other nice features like perhaps being able to use systemd to run some things, and likewise using systemd to connect to running builds)

edit to clarify: we'd use this to fork complete per-build-user orchestrators in the daemon scope/slice. the daemon would then socket-activate that orchestrator to both acquire a uid/gid and some amount of systemd-level isolation, but otherwise act a lot like the daemon post-fork now (except reusable). other systems still need some way to block off a list of uids/gids for the builders and another sandboxing mechanism, but the per-build-user orchestration bits in general are pretty generic.
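
A rough sketch of the acquire/release cycle around that `systemd-run` invocation, written as Python helpers that only build the command lines. The helper names and the unit name are made up for illustration, and passing `--unit=` so the unit can be stopped by name later is an addition that is not in the command above.

```python
import pwd


def allocate_command(user, unit):
    """argv for a transient unit that holds a dynamic user until it is stopped."""
    return [
        "systemd-run",
        "--unit=" + unit,             # named so it can be stopped later (assumption)
        "-p", "DynamicUser=true",
        "-p", "RemainAfterExit=true",  # keep the unit, and thus the user, alive
        "-p", "User=" + user,
        "true",                        # the payload exits immediately
    ]


def release_command(unit):
    """argv that stops the transient unit, which lets the dynamic id go away."""
    return ["systemctl", "stop", unit]


def lookup_uid(user):
    """Resolve the allocated user through NSS; raises KeyError if it is absent."""
    return pwd.getpwnam(user).pw_uid
```

Actually running these would look like `subprocess.run(allocate_command("nixbld9001", "lix-build-user-9001"), check=True)` followed by `lookup_uid("nixbld9001")`, and needs enough privileges to create transient system units; whether the user then resolves through NSS depends on the host's systemd setup.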

jade commented

2024-06-17 08:50:49 +00:00

Author

Owner

https://github.com/systemd/systemd/blob/main/src/core/dynamic-user.c#L183

it seems that systemd just randomly picks one, approximately.

Stabilize and default to auto-allocate-uids #387