This commit uses `warn()` to notify the user if sandbox setup fails
with errno==EPERM and /proc/sys/user/max_user_namespaces is missing or
zero, since that is at least part of the reason why sandbox setup
failed.
Note that `echo -n 0 > /proc/sys/user/max_user_namespaces` or
equivalent at boot time has been the recommended mitigation for
several Linux LPE vulnerabilities over the past few years. Many users
have applied this mitigation and then forgotten that they have done
so.
The failure modes for nix's sandboxing setup are pretty complicated.
When nix is unable to set up the sandbox, let's provide more detail
about what went wrong. Specifically:
* Make sure the error message includes the word "sandbox" so the user
knows that the failure was related to sandboxing.
* If `--option sandbox-fallback false` was provided, and removing it
would have allowed further attempts to make progress, let the user
know.
Specifically, if we're not root and the daemon socket does not exist,
then we use ~/.local/share/nix/root as a chroot store. This enables
non-root users to download nix-static and have it work out of the box,
e.g.
ubuntu@ip-10-13-1-146:~$ ~/nix run nixpkgs#hello
warning: '/nix' does not exists, so Nix will use '/home/ubuntu/.local/share/nix/root' as a chroot store
Hello, world!
With this, Nix will write a copy of the sandbox shell to /bin/sh in
the sandbox rather than bind-mounting it from the host filesystem.
This makes /bin/sh work out of the box with nix-static, i.e. you no
longer get
/nix/store/qa36xhc5gpf42l3z1a8m1lysi40l9p7s-bootstrap-stage4-stdenv-linux/setup: ./configure: /bin/sh: bad interpreter: No such file or directory
This allows changes to nix-cache-info to be picked up by existing
clients. Previously, the only way for this to happen would be for
clients to delete binary-cache-v6.sqlite, which is quite awkward for
users.
On the other hand, updates to nix-cache-info should be pretty rare,
hence the choice of a fairly long TTL. Configurability is probably not
useful enough to warrant implementing it.
The manpage for `getgrouplist` says:
> If the number of groups of which user is a member is less than or
> equal to *ngroups, then the value *ngroups is returned.
>
> If the user is a member of more than *ngroups groups, then
> getgrouplist() returns -1. In this case, the value returned in
> *ngroups can be used to resize the buffer passed to a further
> call getgrouplist().
In our original code, however, we allocated a list of size `10` and, if
`getgrouplist` returned `-1` threw an exception. In practice, this
caused the code to fail for any user belonging to more than 10 groups.
While unusual for single-user systems, large companies commonly have a
huge number of POSIX groups users belong to, causing this issue to crop
up and make multi-user Nix unusable in such settings.
The fix is relatively simple, when `getgrouplist` fails, it stores the
real number of GIDs in `ngroups`, so we must resize our list and retry.
Only then, if it errors once more, we can raise an exception.
This should be backported to, at least, 2.9.x.
Without the change any CA deletion triggers linear scan on large
RealisationsRefs table:
sqlite>.eqp full
sqlite> delete from RealisationsRefs where realisationReference IN ( select id from Realisations where outputPath = 1234567890 );
QUERY PLAN
|--SCAN RealisationsRefs
`--LIST SUBQUERY 1
`--SEARCH Realisations USING COVERING INDEX IndexRealisationsRefsOnOutputPath (outputPath=?)
With the change it gets turned into a lookup:
sqlite> CREATE INDEX IndexRealisationsRefsRealisationReference on RealisationsRefs(realisationReference);
sqlite> delete from RealisationsRefs where realisationReference IN ( select id from Realisations where outputPath = 1234567890 );
QUERY PLAN
|--SEARCH RealisationsRefs USING INDEX IndexRealisationsRefsRealisationReference (realisationReference=?)
`--LIST SUBQUERY 1
`--SEARCH Realisations USING COVERING INDEX IndexRealisationsRefsOnOutputPath (outputPath=?)
If the derivation `foo` depends on `bar`, and they both have the same
output path (because they are CA derivations), then this output path
will depend both on the realisation of `foo` and of `bar`, which
themselves depend on each other.
This confuses SQLite which isn’t able to automatically solve this
diamond dependency scheme.
Help it by adding a trigger to delete all the references between the
relevant realisations.
Fix#5320
Otherwise the clang builds fail because the constructor of `SQLiteBusy`
inherits it, `SQLiteError::_throw` tries to call it, which fails.
Strangely, gcc works fine with it. Not sure what the correct behavior is
and who is buggy here, but either way, making it public is at the worst
a reasonable workaround
This ensures that use-sites properly trigger new monomorphisations on
one hand, and on the other hand keeps the main `sqlite.hh` clean and
interface-only. I think that is good practice in general, but in this
situation in particular we do indeed have `sqlite.hh` users that don't
need the `throw_` function.