Allow ipc-sysv* in the Darwin sandbox #623
Labels
No labels
Affects/CppNix
Affects/Nightly
Affects/Only nightly
Affects/Stable
Area/build-packaging
Area/cli
Area/evaluator
Area/fetching
Area/flakes
Area/language
Area/lix ci
Area/nix-eval-jobs
Area/profiles
Area/protocol
Area/releng
Area/remote-builds
Area/repl
Area/repl/debugger
Area/store
bug
Context
contributors
Context
drive-by
Context
maintainers
Context
RFD
crash 💥
Cross Compilation
devx
docs
Downstream Dependents
E/easy
E/hard
E/help wanted
E/reproducible
E/requires rearchitecture
Feature/S3
imported
Language/Bash
Language/C++
Language/NixLang
Language/Python
Language/Rust
Needs Langver
OS/Linux
OS/macOS
performance
regression
release-blocker
stability
Status
blocked
Status
invalid
Status
postponed
Status
wontfix
testing
testing/flakey
Topic/Large Scale Installations
ux
No milestone
No project
No assignees
5 participants
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference: lix-project/lix#623
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Is your feature request related to a problem? Please describe.
The sandbox profile in CppNix was updated in https://github.com/NixOS/nix/pull/10878 to allow the
shmget
syscall, which is used by the postgresql tests.Describe the solution you'd like
It would be nice if
372d5a441e
could be backported to Lix:Describe alternatives you've considered
The postgresql derivation in nixpkgs could be updated to set
sandboxProfile
(though this would be redundant for CppNix):Additional context
I ran into this while debugging a build failure (not specific to Lix) in nixpkgs: https://github.com/NixOS/nixpkgs/issues/371242
I don't know if this should be trivially just bypassed in the sandbox: sysv IPC API is effectively deprecated on macOS and should probably not be used to begin with https://forums.developer.apple.com/forums/thread/719897
I looked into why postgres was using APIs which are jank on every POSIX platform (POSIX shared memory is preferred on basically every other platform as well) and it seems that it is because they need it for some kind of synchronization purposes that cannot be avoided:
102de3be73/src/backend/port/sysv_shmem.c (L43-L49)
However, all of this together makes me think that postgres really should be the special case here, and we shouldn't allow derivations to do things that are 1. uncommon, 2. liable to be incompatible with e.g. app sandbox, 3. which allow arbitrary communication with the rest of the system, without the derivation specifying it needs them. However, I am not a macOS maintainer and I might be missing some details here.
I do know that
sandboxProfile
is only permitted whensandbox
isrelaxed
innix.conf
, which is a non-ideal restriction for something which probably be stuck in there forever, but weh. idk.@lilyball may know more/have other thoughts
That makes sense. AFAICT sysv IPC was added to CppNix primarily to support running the postgres tests. The postgres tests were recently disabled so it's likely unnecessary (no other derivations need it, AFAIK).
I think we can just close this wontfix then, since it doesn't cause any other known problems (I think?).
In fact we were able to enable them in August thanks to the sandbox change in CppNix. We only disabled them recently again, because we hit #691 and didn't understand it fully, yet.
I'd really like to enable PostgreSQL tests again on Darwin in nixpkgs. It's not only the tests for PostgreSQL itself, but also a lot of packages depending on it and running some tests against it in their checkPhase. Currently, almost all of them are special cased for darwin - and then they disable all their tests.
This really creates a big gap of testing on darwin. I'm sure @ma27 agrees with me here: As the nixpkgs' postgres maintainers, we would like to provide first class support for MacOS as well.
Yeah, not good. Additionally, the knowledge would have to be spread over all kinds of places in nixpkgs. Hard to maintain, really.
Of course, now that we know about #691, it makes sense to only do the backport if we can fix that issue as well. The status quo, for lix, is that it gives a hard error (permission denied), which is predictable. The status quo for CppNix is that it fails in very unpredictable ways, because of that "leak".
@wolfgangwalther wrote in #623 (comment):
In fact, we only disabled the tests for x86_64-darwin, not for aarch64-darwin. So currently lix can not build nixpkgs' postgresql derivation on aarch64-darwin.
Cannot with the sandbox enabled, right? Which isn’t the default.
I’d like it to be, and it would be nice to support running Postgres in derivations, but I concur that unrestricted SysV IPC in derivations is a step in the wrong direction, and I think
sandboxProfile
is kind of silly and wantsandbox = true
to be the default. Not sure whether it can be restricted usefully or not, and if not, we’d have to figure out something else to do about Postgres. Being able to disable the use of shared memory, for example?Technically that's correct, but you will still hit #691 AFAICT, even with the sandbox disabled. Can't imagine why these builds would have failed on darwin in hydra otherwise (where I understand the sandbox is off, correct?).
Note that this not only affects some tests, but we also have some extensions that need
postgres
to run at build time. That's everything that depends oncargo-pgrx
. That's a few extensions by now, also some hip vector-related stuff.We have marked all of these as broken, because of this issue, but also #691.
I can't judge that, not enough insights into that.
Agreed.
PostgreSQL will always use a tiny amount of this, see this comment in the source code:
2c53dec7f4/src/backend/port/sysv_shmem.c (L42-L67)
Could this be changed? Potentially, yes. But PostgreSQL would need to be patched / a change upstreamed.
Right. I think there are three possible worlds:
It turns out that the undocumented macOS sandbox is powerful enough to restrict System V IPC to be isolated to a single build user. We can allow System V IPC without permitting communication between derivations, other derivations, and the outside world.
We accept that communication between derivations, other derivations, and the outside world is an acceptable part of the threat model on macOS.
We adjust Postgres to not rely on System V IPC for tests in Nix derivations, whether upstream or downstream. For instance, a fallback code path if System V IPC is not permitted, an option to disable it, or porting it to use something else on macOS or in general.
The ideal world is (1), but although the macOS sandbox has more hidden functionality than you might expect, I would not bet on it in this case. Nothing in
/System/Library/Sandbox/Profiles
even mentionssysv
, and although there is a sandbox filter calledipc-posix-name
, there does not appear to be anything relevant for System V IPC.Even on Linux, we cannot just permit everything programs might want to do inside the sandbox. On macOS, we have no namespacing mechanism, so by necessity we must be stricter to obtain comparable (or even weaker) properties. It is good to reduce that as much as possible, but we can’t eliminate it.
considering the plethora of macos-related problems we have with sandboxing and that the sandbox is both disabled by default (until now) and it is completely impossible to achieve comparable-to-linux sandboxing on macos anyway we personally feel that allowing communication between drvs is a lesser evil. enabling any sandboxing at all by default is still better than the status quo, and if we do figure out a way to restrict sysv ipc in the future we can still add that then.
(recall that the entire concept of using a nix implementation on macos hinges on the assumption that string replacement of ascii store paths in derivation outputs is not only legal, but will always work!)
That sounds like a 👍 to re-open this issue! ;)
we leave that to the darwin maintainers here. we do agree that a leaky sandbox is quite useless, but so is no sandbox, or a sandbox you need to be able to just opt out of and thus cannot ever trust to actually do anything useful. the point is more that we don't have to make the leap to feature parity with linux at once (if it's possible at all, which we strongly doubt). personally we quite honestly could not care less about darwin, we just think that focusing on getting something perfect that fundamentally cannot be is hurting more than it helps.
can we just expect all sandbox users to disable any tests that use sysv ipc? perfect, let's not allow sysv ipc then. but if there a good reason we can't expect this then asking them to disable the sandbox is perhaps a bit much?
@pennae wrote in #623 (comment):
I agree that the current state of the sandbox is bad, which is why I am trying to work on improving it. But I think that the fact that the sandbox isn’t the default is precisely what makes it less pressing to weaken it? Things only break if you opt in (putting aside the lack of System V IPC clean‐up for now, which is orthogonal).
I don’t think this is true. The macOS sandboxing API offers strong isolation properties and I believe it can be a hard security boundary with enough work. The only question is what compatibility sacrifices need to be made to do so. The current macOS sandbox profile is far away from the Pareto frontier: it is both unnecessarily incompatible and unnecessarily leaky.
I would like to work on improving it on both of those fronts, such that it offers good security properties and is compatible enough to enable by default, but the leakier you make it, the more it makes sense to just have a very basic sandbox to close off some common reproducibility bugs and give up on any kind of isolation and security guarantees. A basic sandbox along those lines is a good idea and could be enabled by default on a pretty short timeline, but I don’t think that we should make it
sandbox = true
. The currentsandbox = true
has far larger compatibility issues preventing it from being enabled by default than a System V IPC issue that only affects Postgres. Introducing a lax sandbox that offers no meaningful security properties to enable by default is basically entirely disjoint from improvingsandbox = true
; the two profiles would look very different.This is just about
--rebuild
and--check
, right? I don’t think those are core to the entire concept of Nix. The very fact that they’ve been broken onaarch64-darwin
for years and barely anyone has noticed shows that they’re fairly marginal. I don’t really see what it has to do with the sandbox, though.then by all means! we don't want to stop any improvements happening, we just want to make sure we aren't stuck with a bad situation forever because the "good" solution is unattainable with the resources we have. a basic sandbox we improve on over time is a perfectly valid direction to take for that
not only, but also. this affects any kind of build of derivations for which some outpaths exist already, including e.g. realizing a second output of a derivation for which the first output was pulled from a remote builder. it is impossible to avoid all situations in which this assumption is made without requiring that all derivations always have either all outputs realized, or none of them.
Fair point… I suppose in the vast majority of cases the reason you have only some outputs is because you downloaded a subset of them from a substituter, and can always substitute the rest. So this is probably why it doesn’t cause many issues in practice.
The solutions I can see to making it work are special‐casing Mach‐O files for code signing (ugly and imperfect, since there are cases where this kind of substitution will break things on Linux too), some kind of userspace‐defined rewriting hooks (clean and would be important for a working CA derivations design too, but not totally clear how to make it work in terms of threat model), or forbidding all self‐reference (nice in some ways, but of course wildly backwards‐incompatible).
(Off‐topic, but feels worth noting down since this has come up a few times now.)
Note that it's not limited to tests anymore. Some packages (certain PostgreSQL extensions) need to run PostgreSQL at build time, so these are entirely unavailable on darwin.
I opened an issue upstream: https://www.postgresql.org/message-id/flat/a90b5411-705f-4286-bd81-a26c520a6cfb%40technowledgy.de
I think it should be a run-time check to not use System V IPC in the sandbox only, but keep working normally when run on the host.