Previously, system call filtering (to prevent builders from storing files with
setuid/setgid permission bits or extended attributes) was performed using a
blocklist. While this looks simple at first, it actually carries significant
security and maintainability risks: after all, the kernel may add new syscalls
to achieve the same functionality one is trying to block, and it can even be
hard to actually add the syscall to the blocklist when building against a C
library that doesn't know about it yet. For a recent demonstration of this
happening in practice to Nix, see the introduction of fchmodat2 [0] [1].
The allowlist approach does not share the same drawback. While it does require
a rather large list of harmless syscalls to be maintained in the codebase,
failing to update this list (and roll out the update to all users) in time has
rather benign effects; at worst, very recent programs that already rely on new
syscalls will fail with an error the same way they would on a slightly older
kernel that doesn't support them yet. Most importantly, no unintended new ways
of performing dangerous operations will be silently allowed.
Another possible drawback is reduced system call performance due to the larger
filter created by the allowlist requiring more computation [2]. However, this
issue has not convincingly been demonstrated yet in practice, for example in
systemd or various browsers.
This commit tries to keep the behavior as close to unchanged as possible. Only
newer syscalls that are not supported by glibc 2.38 (as found in NixOS 23.11)
are blocked. Since this includes fchmodat2, the compatibility code added for
handling this syscall can be removed too.
[0] https://github.com/NixOS/nixpkgs/issues/300635
[1] https://github.com/NixOS/nix/issues/10424
[2] https://github.com/flatpak/flatpak/pull/4462#issuecomment-1061690607
Change-Id: I541be3ea9b249bcceddfed6a5a13ac10b11e16ad
this gives about 20% performance improvements on pure parsing. obviously
it will be less on full eval, but depending on how much parsing is to be
done (e.g. including hackage-packages.nix or not) it's more like 4%-10%.
this has been tested (with thousands of core hours of fuzzing) to ensure
that the ASTs produced by the new parser are exactly the same as the old
one would have produced. error messages will change (sometimes by a lot)
and are not yet perfect, but we would rather leave this as is for later.
test results for running only the parser (excluding the variable binding
code) in a tight loop with inputs and parameters as given are promising:
- 40% faster on lix's package.nix at 10000 iterations
- 1.3% faster on nixpkgs all-packages.nix at 1000 iterations
- equivalent on all of nixpkgs concatenated at 100 iterations
(excluding invalid files, each file surrounded with parens)
more realistic benchmarks are somewhere in between the extremes, parsing
once again getting the largest uplift. other realistic workloads improve
by a few percentage points as well, notably system builds are 4% faster.
Benchmarks summary (from ./bench/summarize.jq bench/bench-*.json)
old/bin/nix --extra-experimental-features 'nix-command flakes' eval -f bench/nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix
mean: 0.408s ± 0.025s
user: 0.355s | system: 0.033s
median: 0.389s
range: 0.388s ... 0.442s
relative: 1
new/bin/nix --extra-experimental-features 'nix-command flakes' eval -f bench/nixpkgs/pkgs/development/haskell-modules/hackage-packages.nix
mean: 0.332s ± 0.024s
user: 0.279s | system: 0.033s
median: 0.314s
range: 0.313s ... 0.361s
relative: 0.814
---
old/bin/nix --extra-experimental-features 'nix-command flakes' eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
mean: 6.133s ± 0.022s
user: 5.395s | system: 0.437s
median: 6.128s
range: 6.099s ... 6.183s
relative: 1
new/bin/nix --extra-experimental-features 'nix-command flakes' eval --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
mean: 5.925s ± 0.025s
user: 5.176s | system: 0.456s
median: 5.934s
range: 5.861s ... 5.943s
relative: 0.966
---
GC_INITIAL_HEAP_SIZE=10g old/bin/nix eval --extra-experimental-features 'nix-command flakes' --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
mean: 4.503s ± 0.027s
user: 3.731s | system: 0.547s
median: 4.499s
range: 4.478s ... 4.541s
relative: 1
GC_INITIAL_HEAP_SIZE=10g new/bin/nix eval --extra-experimental-features 'nix-command flakes' --raw --impure --expr 'with import <nixpkgs/nixos> {}; system'
mean: 4.285s ± 0.031s
user: 3.504s | system: 0.571s
median: 4.281s
range: 4.221s ... 4.328s
relative: 0.951
---
old/bin/nix --extra-experimental-features 'nix-command flakes' search --no-eval-cache github:nixos/nixpkgs/e1fa12d4f6c6fe19ccb59cac54b5b3f25e160870 hello
mean: 16.475s ± 0.07s
user: 14.088s | system: 1.572s
median: 16.495s
range: 16.351s ... 16.536s
relative: 1
new/bin/nix --extra-experimental-features 'nix-command flakes' search --no-eval-cache github:nixos/nixpkgs/e1fa12d4f6c6fe19ccb59cac54b5b3f25e160870 hello
mean: 15.973s ± 0.013s
user: 13.558s | system: 1.615s
median: 15.973s
range: 15.946s ... 15.99s
relative: 0.97
---
Change-Id: Ie66ec2d045dec964632c6541e25f8f0797319ee2
These are totally available and you can just turn them on, but they have
very bad dependency tracking and thus bloat incremental change times,
which is not really ok.
Change-Id: Iaa63ed18a789e74fcb757248cd24c3b194afcc80
On operating systems where /bin/sh is not Bash, some scripts are invalid
because of bashisms, and building Lix fails with errors like this:
`render-manpage.sh: 3: set: Illegal option -o pipefail`
This modifies all scripts that use a `/bin/sh` shebang to `/usr/bin/env
bash`, including currently POSIX-compliant ones, to prevent any future
confusion.
Change-Id: Ia074cc6db42d40fc59a63726f6194ea0149ea5e0
I was working on nix-eval-jobs with a dev shell with some shenanigans to
run against a locally built Lix and it was getting really annoying when
`nix develop ../lix#` was messing up my other git repo's hooks.
This is a fix via blunt force, but it is at least obvious how it works.
Change-Id: Ia29eeb5be57ab6a2c88451c00ea18a51e4dfe65e
This is motivated by flakes being bad and all the stuff that calls
things by "system" being utterly unable to cope with cross compilation.
So if we go shove it in package.nix it is suddenly usable from cross
contexts.
Usage:
```
nix build -L .#nix-riscv64-linux.binaryTarball
```
Change-Id: I702ebf2ac5bd9d1c57662f968b000073134df336
-- message from cl/1418 --
The boehmgc changes are bundled into this commit because doing otherwise
would require an annoying dance of "adding compatibility for < 8.2.6 and
>= 8.2.6" then updating the pin then removing the (now unneeded)
compatibility. It doesn't seem worth the trouble to me given the low
complexity of said changes.
Rebased coroutine-sp-fallback.diff patch taken from https://github.com/NixOS/nixpkgs/pull/317227
-- jade resubmit changes --
This is a resubmission of https://gerrit.lix.systems/c/lix/+/1418, which
was reverted in https://gerrit.lix.systems/c/lix/+/1432 for breaking CI
evaluation without being detected.
I have run `nix flake check -Lv` on this one before submission and it
passes on my machine and crucially without eval errors, so the CI result
should be accurate.
It seems like someone renamed forbiddenDependenciesRegex to
forbiddenDependenciesRegexes in nixpkgs and also changed the type
incompatibly. That's pretty silly, but at least it's just an eval error.
Also, `xonsh` regressed the availability of `xonsh-unwrapped`, but it
was fixed by us in https://github.com/NixOS/nixpkgs/pull/317636, which
is now in our channel, so we update nixpkgs compared to the original
iteration of this to simply get that.
We originally had a regression related to some reorganization of the
nixpkgs lib test suite in which there was broken parameter passing.
This, too, we got quickfixed in nixpkgs, so we don't need any changes
for it: https://github.com/NixOS/nixpkgs/pull/317772
Related: https://gerrit.lix.systems/c/lix/+/1428
Fixes: lix-project/lix#385
Change-Id: I26d41ea826fec900ebcad0f82a727feb6bcd28f3
* changes:
releng: add prod environment, ready for release
releng: automatically figure out if we should tag latest for docker
releng: support multiarch docker images
manual: rewrite the docker guide now that we have images
Rewrite docker to be sensible and smaller
Implement docker upload in the releng tools
If we don't want to have separate registry tags by architecture (EWWWW),
we need to be able to build multiarch docker images. This is pretty
simple, and just requires making a manifest pointing to each of the
component images.
I was *going* to just do this API prodding with manifest-tool, but it
doesn't support putting metadata on the outer manifest, which is
actually kind of a problem because it then doesn't render the metadata
on github. So I guess we get a simple little containers API
implementation that is 90% auth code.
Change-Id: I8bdd118d4cbc13b23224f2fb174b232432686bea
The boehmgc changes are bundled into this commit because doing otherwise
would require an annoying dance of "adding compatibility for < 8.2.6 and
>= 8.2.6" then updating the pin then removing the (now unneeded)
compatibility. It doesn't seem worth the trouble to me given the low
complexity of said changes.
Rebased coroutine-sp-fallback.diff patch taken from https://github.com/NixOS/nixpkgs/pull/317227
Change-Id: I8c590e9fe25c0f566d0cfeacb96d8cf50abf12e8
clangd seems to break if GCC is using precompiled headers for C++'s
standard library, so this sets -Denable-pch-std=${stdenv.cc.isClang}
Fixes#374.
Change-Id: Ic4be41ebe7576ebcb9c208275596f953c2003109
It seems like someone implemented precompiled headers a long time ago
and then it never got ported to meson or maybe didn't work at all.
This is, however, blessedly easy to simply implement. I went looking for
`#define` that could affect the result of precompiling the headers, and
as far as I can tell we aren't doing any of that, so this should truly
just be free build time savings.
Previous state:
Compilation (551 times):
Parsing (frontend): 1302.1 s
Codegen & opts (backend): 956.3 s
New state:
**** Time summary:
Compilation (567 times):
Parsing (frontend): 1123.0 s
Codegen & opts (backend): 1078.1 s
I wonder if the "regression" in codegen time is just doing the PCH
operation a few times, because meson does it per-target.
Change-Id: I664366b8069bab4851308b3a7571bea97ac64022
If our shellHook is being run from a nested nix-shell (see 7a12bc200¹),
then (I think) it is run from a bash function due to the nesting, then
`return` is correct. If its `eval`'d though, then there isn't really a
correct way to early exit. So we can just unconditionally be executed in
a function.
Basically, we have IIFE at home.
[1]: 7a12bc2007
Change-Id: Iacad25cbbf66cde2911604e6061e56ad6212af7e
If a nested nix-shell is run inside a nix-shell, then the outer shell's
shellHook will be passed through and run again, unless the nested shell
defines its own.
With lix's hook, this can be annoying: forgetting to exit its nix-shell,
cd'ing to another repository & entering a nested nix-shell will happily
install lix's pre-commit hook in it.
This change makes lix's hook return early in such cases.
Change-Id: I91cb6eb6668f3a8eace36ecbdb01eb367861d77b
Embarrassingly, I submitted a CL overriding submit requirements since
I thought it was spurious failures. However, the CI failure was in fact
real, and I have hopefully learned my lesson. The CI failure is that:
```
vm-test-run-nix-upgrade-nix> machine # installing 'nix-2.18.1'
vm-test-run-nix-upgrade-nix> machine # building '/nix/store/2b6fdf7wvahd00bg2ff0393bhd597a0h-user-environment.drv'...
vm-test-run-nix-upgrade-nix> machine # error: Unable to build profile. There is a conflict for the following files:
vm-test-run-nix-upgrade-nix> machine #
vm-test-run-nix-upgrade-nix> machine # /nix/store/dn6mhhr92bh3ad0n4pd1538ww88khjii-nix-2.18.1/lib/libboost_context.so
vm-test-run-nix-upgrade-nix> machine # /nix/store/w4vffn9iq0znk8bcg5i2giij90xy6db6-lix-2.90.0pre20240523_c97e171/lib/libboost_context.so
vm-test-run-nix-upgrade-nix> machine # error: builder for '/nix/store/2b6fdf7wvahd00bg2ff0393bhd597a0h-user-environment.drv' failed with exit code 1
vm-test-run-nix-upgrade-nix> machine # error: program '/nix/store/w4vffn9iq0znk8bcg5i2giij90xy6db6-lix-2.90.0pre20240523_c97e171/bin/nix-env' failed with exit code 100
```
This is definitely caused by the pname not being the same, so we had
better revert that part of the change until we know we won't regress
anything by doing this.
Fixes: https://gerrit.lix.systems/c/lix/+/1152/5
Change-Id: I0e9d573987f2819c106fb7cea87410fa75152274
This breaks downstreams linking to us on purpose to make sure that if
someone is linking to Lix they're doing it on purpose and crucially not
mixing up Nix and Lix versions in compatibility code.
We still need to fix the internal includes to follow the same schema so
we can drop the single-level include system entirely. However, this
requires a little more effort.
This adds pkg-config for libfetchers and config.h.
Migration path:
expr.hh -> lix/libexpr/expr.hh
nix/config.h -> lix/config.h
To apply this migration automatically, remove all `<nix/>` from
includes, so: `#include <nix/expr.hh>` -> `#include <expr.hh>`. Then,
the correct paths will be resolved from the tangled mess, and the
clang-tidy automated fix will work.
Then run the following for out of tree projects:
```
lix_root=$HOME/lix
(cd $lix_root/clang-tidy && nix develop -c 'meson setup build && ninja -C build')
run-clang-tidy -checks='-*,lix-fixincludes' -load=$lix_root/clang-tidy/build/liblix-clang-tidy.so -p build/ -fix src
```
Related: lix-project/nix-eval-jobs#5
Fixes: lix-project/lix#279
Change-Id: I7498e903afa6850a731ef8ce77a70da6b2b46966
Fixes#183, #110, #116.
The default flake-registry option becomes 'vendored', and refers
to a vendored flake-registry.json file in the install path.
Vendored copy of the flake-registry is from github:NixOS/flake-registry
at commit 9c69f7bd2363e71fe5cd7f608113290c7614dcdd.
Change-Id: I752b81c85ebeaab4e582ac01c239d69d65580f37
This should have been in there originally, which is our mistake,
considering that debugging CI failures is basically impossible without
it.
Change-Id: I4ab8799e6e0abca1984ed9801fe10c58200861a3
We don't need bear anymore, since we don't have any more bad build
systems that lack compile commands generation inside Lix.
Change-Id: I7809ddfd993180468f846e8cd862bdd54d5b31ec
Surely if you have unreleased changes you want them on a page right?
`officialRelease` means "this is a *release version*", which is a
reasonable case to not want it, but we are not that here.
I understand wanting to be able to turn it off for deps reasons or
something, but other than that, uhh, seems better to just turn it on
always; it is basically free compute-wise to the point we run it on
pre-commit.
Part two of fixing lix#297.
Fixes: lix-project/lix#297
Change-Id: I0f8dd1ae42458df371aef529c456e47a7ac04ae0
Now instead of a derivation overridden from Lix, we use a mkShell
derivation parameterized on an already called package.nix. This also
lets callPackage take care of the buildPackages distinction for the
devShell.
Change-Id: I5ddfec40d83fa6136032da7606fe6d3d5014ef42
Because of an objc quirk[1], calling curl_global_init for the first time
after fork() will always result in a crash.
Up until now the solution has been to set
OBJC_DISABLE_INITIALIZE_FORK_SAFETY for every nix process to ignore
that error.
This is less than ideal because we were setting it in package.nix,
which meant that running nix tests locally would fail because
that variable was not set.
Instead of working around that error we address it at the core -
by calling curl_global_init inside initLibStore, which should mean
curl will already have been initialized by the time we try to do so in
a forked process.
[1] 01edf1705f/runtime/objc-initialize.mm (L614-L636)
Change-Id: Icf26010a8be655127cc130efb9c77b603a6660d0
This has the following downsides:
* you cannot build Lix against nixos-unstable.
* this will immediately break as soon as libseccomp will hit
nixos-23.11 (given that people will probably use the package.nix via
our overlay or override nixpkgs via `follows`).
Hence, removing the assert again and add a better FIXME comment.
Change-Id: I284e10cf08e1873fef70ed869a1638aa89792422
With Linux kernel >=6.6 & glibc 2.39 a `fchmodat2(2)` is available that
isn't filtered away by the libseccomp sandbox.
Being able to use this to bypass that restriction has surprising results
for some builds such as lxc[1]:
> With kernel ≥6.6 and glibc 2.39, lxc's install phase uses fchmodat2,
> which slips through 9b88e52846/src/libstore/build/local-derivation-goal.cc (L1650-L1663).
> The fixupPhase then uses fchmodat, which fails.
> With older kernel or glibc, setting the suid bit fails in the
> install phase, which is not treated as fatal, and then the
> fixup phase does not try to set it again.
Please note that there are still ways to bypass this sandbox[2] and this is
mostly a fix for the breaking builds.
This change works by creating a syscall filter for the `fchmodat2`
syscall (number 452 on most systems). The problem is that glibc 2.39
is needed to have the correct syscall number available via
`__NR_fchmodat2` / `__SNR_fchmodat2`, but this flake is still on
nixpkgs 23.11. To have this change everywhere and not dependent on the
glibc this package is built against, I added a header
"fchmodat2-compat.hh" that sets the syscall number based on the
architecture. On most platforms its 452 according to glibc with a few
exceptions:
$ rg --pcre2 'define __NR_fchmodat2 (?!452)'
sysdeps/unix/sysv/linux/x86_64/x32/arch-syscall.h
58:#define __NR_fchmodat2 1073742276
sysdeps/unix/sysv/linux/mips/mips64/n32/arch-syscall.h
67:#define __NR_fchmodat2 6452
sysdeps/unix/sysv/linux/mips/mips64/n64/arch-syscall.h
62:#define __NR_fchmodat2 5452
sysdeps/unix/sysv/linux/mips/mips32/arch-syscall.h
70:#define __NR_fchmodat2 4452
sysdeps/unix/sysv/linux/alpha/arch-syscall.h
59:#define __NR_fchmodat2 562
I added a small regression-test to the setuid integration-test that
attempts to set the suid bit on a file using the fchmodat2 syscall.
I confirmed that the test fails without the change in
local-derivation-goal.
Additionally, we require libseccomp 2.5.5 or greater now: as it turns
out, libseccomp maintains an internal syscall table and
validates each rule against it. This means that when using libseccomp
2.5.4 or older, one may pass `452` as syscall number against it, but
since it doesn't exist in the internal structure, `libseccomp` will refuse
to create a filter for that. This happens with nixpkgs-23.11, i.e. on
stable NixOS and when building Lix against the project's flake.
To work around that
* a backport of libseccomp 2.5.5 on upstream nixpkgs has been
scheduled[3].
* the package now uses libseccomp 2.5.5 on its own already. This is to
provide a quick fix since the correct fix for 23.11 is still a staging cycle
away.
We still need the compat header though since `SCMP_SYS(fchmodat2)`
internally transforms this into `__SNR_fchmodat2` which points to
`__NR_fchmodat2` from glibc 2.39, so it wouldn't build on glibc 2.38.
The updated syscall table from libseccomp 2.5.5 is NOT used for that
step, but used later, so we need both, our compat header and their
syscall table 🤷
Relevant PRs in CppNix:
* https://github.com/NixOS/nix/pull/10591
* https://github.com/NixOS/nix/pull/10501
[1] https://github.com/NixOS/nixpkgs/issues/300635#issuecomment-2031073804
[2] https://github.com/NixOS/nixpkgs/issues/300635#issuecomment-2030844251
[3] https://github.com/NixOS/nixpkgs/pull/306070
(cherry picked from commit ba6804518772e6afb403dd55478365d4b863c854)
Change-Id: I6921ab5a363188c6bff617750d00bb517276b7fe