Commit graph

39 commits

Author SHA1 Message Date
9909a175bf Fix /etc/group having desynced IDs from the actual UID in the sandbox
This was found when `logrotate.conf` failed to build in a NixOS system
with:

    /nix/store/26zdl4pyw5qazppj8if5lm8bjzxlc07l-coreutils-9.3/bin/id: cannot find name for group ID 30000

This was surprising because it seemed to mean that /etc/group was busted
in the sandbox. Indeed it was:

    root0:
    nixbld:!💯
    nogroup65534:

We diagnosed this to sandboxUid() being called before
usingUserNamespace() was called, in setting up /etc/group inside the
sandbox. This code desperately needs refactoring.

We also moved the /etc/group code to be with the /etc/passwd code, but
honestly this code is all spaghetti'd all over the place and needs some
more serious tidying than we did here.

We also moved some checks to be earlier to improve locality with where
the things they are checking come from.

Change-Id: Ie29798771f3593c46ec313a32960fa955054aceb
2024-05-04 17:36:50 -07:00
79d0ae6670 Merge "libstore/local-derivation-goal: prohibit creating setuid/setgid binaries" into main 2024-05-04 07:26:15 +00:00
045ee37438 libstore/local-derivation-goal: prohibit creating setuid/setgid binaries
With Linux kernel >=6.6 & glibc 2.39 a `fchmodat2(2)` is available that
isn't filtered away by the libseccomp sandbox.

Being able to use this to bypass that restriction has surprising results
for some builds such as lxc[1]:

> With kernel ≥6.6 and glibc 2.39, lxc's install phase uses fchmodat2,
> which slips through 9b88e52846/src/libstore/build/local-derivation-goal.cc (L1650-L1663).
> The fixupPhase then uses fchmodat, which fails.
> With older kernel or glibc, setting the suid bit fails in the
> install phase, which is not treated as fatal, and then the
> fixup phase does not try to set it again.

Please note that there are still ways to bypass this sandbox[2] and this is
mostly a fix for the breaking builds.

This change works by creating a syscall filter for the `fchmodat2`
syscall (number 452 on most systems). The problem is that glibc 2.39
is needed to have the correct syscall number available via
`__NR_fchmodat2` / `__SNR_fchmodat2`, but this flake is still on
nixpkgs 23.11. To have this change everywhere and not dependent on the
glibc this package is built against, I added a header
"fchmodat2-compat.hh" that sets the syscall number based on the
architecture. On most platforms its 452 according to glibc with a few
exceptions:

    $ rg --pcre2 'define __NR_fchmodat2 (?!452)'
    sysdeps/unix/sysv/linux/x86_64/x32/arch-syscall.h
    58:#define __NR_fchmodat2 1073742276

    sysdeps/unix/sysv/linux/mips/mips64/n32/arch-syscall.h
    67:#define __NR_fchmodat2 6452

    sysdeps/unix/sysv/linux/mips/mips64/n64/arch-syscall.h
    62:#define __NR_fchmodat2 5452

    sysdeps/unix/sysv/linux/mips/mips32/arch-syscall.h
    70:#define __NR_fchmodat2 4452

    sysdeps/unix/sysv/linux/alpha/arch-syscall.h
    59:#define __NR_fchmodat2 562

I added a small regression-test to the setuid integration-test that
attempts to set the suid bit on a file using the fchmodat2 syscall.
I confirmed that the test fails without the change in
local-derivation-goal.

Additionally, we require libseccomp 2.5.5 or greater now: as it turns
out, libseccomp maintains an internal syscall table and
validates each rule against it. This means that when using libseccomp
2.5.4 or older, one may pass `452` as syscall number against it, but
since it doesn't exist in the internal structure, `libseccomp` will refuse
to create a filter for that. This happens with nixpkgs-23.11, i.e. on
stable NixOS and when building Lix against the project's flake.

To work around that

* a backport of libseccomp 2.5.5 on upstream nixpkgs has been
  scheduled[3].

* the package now uses libseccomp 2.5.5 on its own already. This is to
  provide a quick fix since the correct fix for 23.11 is still a staging cycle
  away.

We still need the compat header though since `SCMP_SYS(fchmodat2)`
internally transforms this into `__SNR_fchmodat2` which points to
`__NR_fchmodat2` from glibc 2.39, so it wouldn't build on glibc 2.38.
The updated syscall table from libseccomp 2.5.5 is NOT used for that
step, but used later, so we need both, our compat header and their
syscall table 🤷

Relevant PRs in CppNix:

* https://github.com/NixOS/nix/pull/10591
* https://github.com/NixOS/nix/pull/10501

[1] https://github.com/NixOS/nixpkgs/issues/300635#issuecomment-2031073804
[2] https://github.com/NixOS/nixpkgs/issues/300635#issuecomment-2030844251
[3] https://github.com/NixOS/nixpkgs/pull/306070

(cherry picked from commit ba6804518772e6afb403dd55478365d4b863c854)
Change-Id: I6921ab5a363188c6bff617750d00bb517276b7fe
2024-05-03 16:29:06 +02:00
e2ab89a74b add VM test for nix upgrade-nix
This commit adds a new NixOS VM test, which tests that `nix upgrade-nix`
works on both kinds of profiles (manifest.nix and manifest.json).

Done as a separate commit from 831d18a13, since it relies on the
--store-path argument from 026c90e5f as well.

Change-Id: I5fc94b751d252862cb6cffb541a4c072faad9f3b
2024-04-29 01:19:21 +00:00
8773439a85 Merge "ssh-ng: Set log-fd for ssh to 4 by default" into main 2024-04-26 18:30:33 +00:00
104448e75d ssh-ng: Set log-fd for ssh to 4 by default
That's expected by `build-remote` and makes sure that errors are
correctly forwarded to the user. For instance, let's say that the
host-key of `example.org` is unknown and

    nix-build ../nixpkgs -A hello -j0 --builders 'ssh-ng://example.org'

is issued, then you get the following output:

    cannot build on 'ssh-ng://example.org?&': error: failed to start SSH connection to 'example.org'
    Failed to find a machine for remote build!
    derivation: yh46gakxq3kchrbihwxvpn5bmadcw90b-hello-2.12.1.drv
    required (system, features): (x86_64-linux, [])
    2 available machines:
    [...]

The relevant information (`Host key verification failed`) ends up in the
daemon's log, but that's not very obvious considering that the daemon
isn't very chatty normally.

This can be fixed - the same way as its done for legacy-ssh - by passing
fd 4 to the SSH wrapper. Now you'd get the following error:

    cannot build on 'ssh-ng://example.org': error: failed to start SSH connection to 'example.org': Host key verification failed.
    Failed to find a machine for remote build!
    [...]

...and now it's clear what's wrong.

Please note that this is won't end up in the derivation's log.

For previous discussion about this change see
https://github.com/NixOS/nix/pull/7659.

Change-Id: I5790856dbf58e53ea3e63238b015ea06c347cf92
2024-04-26 19:04:06 +02:00
7063170d5f tests: add error messages to the asserts in tarball flakes test
In hopes of avoiding opaque error messages like the one in
https://buildbot.lix.systems/#/builders/49/builds/1054/steps/1/logs/stdio

Traceback (most recent call last):
  File "/nix/store/wj6wh89jhd2492r781qsr09r9wydfs6m-nixos-test-driver-1.1/bin/.nixos-test-driver-wrapped", line 9, in <module>
    sys.exit(main())
             ^^^^^^
  File "/nix/store/wj6wh89jhd2492r781qsr09r9wydfs6m-nixos-test-driver-1.1/lib/python3.11/site-packages/test_driver/__init__.py", line 126, in main
    driver.run_tests()
  File "/nix/store/wj6wh89jhd2492r781qsr09r9wydfs6m-nixos-test-driver-1.1/lib/python3.11/site-packages/test_driver/driver.py", line 159, in run_tests
    self.test_script()
  File "/nix/store/wj6wh89jhd2492r781qsr09r9wydfs6m-nixos-test-driver-1.1/lib/python3.11/site-packages/test_driver/driver.py", line 151, in test_script
    exec(self.tests, symbols, None)
  File "<string>", line 13, in <module>
AssertionError

Change-Id: Idd2212a1c3714ce58c7c3a9f34c2ca4313eb6d55
2024-04-22 16:13:36 -06:00
272c2ff15f remove extraneous cache entry from github fetcher
This isn't necessary, as it's already covered by the tarball fetcher's
cache.

Change-Id: I85e35f5a61594f27b8f30d82145f92c5d6559e1f
2024-04-21 10:46:05 +00:00
a326344253 tests: unhaunt the flakes nixos tests
these should really wait for networks to come up, otherwise they can fail.

fixes #235

Change-Id: I08989e8bdb0de280df74660ac43983de5c34fa9d
2024-04-18 20:09:19 +00:00
effc28f6f5 libstore/build: set NO_NEW_PRIVS for the sandbox
Change-Id: I711f64e2b68495ed9c85c1a4bd5025405805e43a
2024-04-15 10:25:29 +03:00
b469c6509b libstore/build: just copy the magic /etc files into the sandbox
Saves us a bunch of thinking about how to handle symlinks, and prevents
the DNS config from changing on the fly under the build, which may or may
not be a good thing?

Change-Id: I071e6ae7e220884690b788d94f480866f428db71
2024-04-13 12:43:19 +03:00
2a98ba8b97 Add pre-commit checks
The big ones here are `trim-trailing-whitespace` and `end-of-file-fixer`
(which makes sure that every file ends with exactly one newline
character).

Change-Id: Idca73b640883188f068f9903e013cf0d82aa1123
2024-03-29 22:57:40 -07:00
34fb7a7e9d make the multi-node vm tests a bit more reliable
without these changes the tests will very repeatably (although not very
reliably) wedge in our runs. the ssh command starts, opens a sessions,
does something, the session closes again, but the test does not move on.
adding *just* the redirect and not the unit waits is not sufficient
either, it needs both. this feels like a bug in the nixos testing
framework somewhere, but digging that far is not in the cards right now.

Change-Id: Idab577b83a36cc4899bb5ffbb3d9adc04e83e51c
2024-03-10 10:10:52 +01:00
80b79d0137 flake.nix: upgrade to nixos-23.11
This also bypasses the Objective-C fork safety during tests.

Change-Id: I92bf9f911e8a1fbd32eae13255f9a9dabde40b21
2024-03-08 23:59:01 +00:00
ca03f7cc28 Merge pull request #9676 from DavHau/git-testsuite
initialize test suite for git fetchers

(cherry picked from commit 0bd9e10aea747df51c8a5af124864c722cbeafde)
Change-Id: Idf94a47794190c3e1de07fc4e7848741c4e9ffed
2024-03-07 10:25:03 +01:00
b87f059ed4 Merge pull request #9631 from cole-h/fixup-check-warnings
Fix warnings when running checks

(cherry picked from commit 75e10e42f3c63fd9b9c8cf222b992ab77e497854)
Change-Id: Id955008fe045f23f72fae2a2cdf8f7ccddd1e6b9
2024-03-07 09:58:15 +01:00
f10e673272 tests/nixos/remote-builds*: Inline module + format
(cherry picked from commit 5167351efbee5c5a7390510eb720c31c6976f4d9)
Change-Id: I0caba23b589ed428d08895d7b8f0c22532bd259e
2024-03-06 21:32:47 -07:00
097acae35b tests/nixos: Test remote build against older versions
(cherry picked from commit e502d1cf945fb3cdd0ca1e1c16ec330ccab51c7b)
Change-Id: If6a1758b6457c5dae9305829c4d71d1905cfca22
2024-03-06 21:32:47 -07:00
706f0df55b Merge pull request #9280 from R-VdP/rvdp/fix_remote_logging_phase_reporting
Include phase reporting in log file for ssh-ng builds

(cherry picked from commit b1e7d7cad625095656fff05ac4aedeb12135110a)
Change-Id: I4076669b0ba160412f7c628ca9113f9abbc8c303
2024-03-06 19:11:12 -07:00
6f36a8834c Copy the output of fixed-output derivations before registering them
It is possible to exfiltrate a file descriptor out of the build sandbox
of FODs, and use it to modify the store path after it has been
registered. To avoid that issue, don't register the output of the build,
but a copy of it (that will be free of any leaked file descriptor).

Test that we can't leverage abstract unix domain sockets to leak file
descriptors out of the sandbox and modify the path after it has been
registered.

(cherry picked from commit 2dadfeb690e7f4b8f97298e29791d202fdba5ca6)
(tests cherry picked from commit c854ae5b3078ac5d99fa75fe148005044809e18c)

Co-authored-by: Valentin Gagarin <valentin.gagarin@tweag.io>
Co-authored-by: Theophane Hufschmitt <theophane.hufschmitt@tweag.io>
Co-authored-by: Tom Bereknyei <tomberek@gmail.com>

Change-Id: I87cd58f1c0a4f7b7a610d354206b33301e47b1a4
2024-03-07 01:44:58 +00:00
2e1f5e2666 Merge pull request #9105 from Ericson2314/split-out-nixos-tests
Define NixOS tests in `tests/nixos/default.nix` rather than `flake.nix`

(cherry picked from commit c29b8ba142a0650d1182ca838ddc1b2d273dcd2a)
Change-Id: Ieae1b6476d95024485df7067e008013bc5542039
2024-03-05 21:11:59 +01:00
Cole Helbling
f3005632c4 Re-enable systemd-nspawn test
It was disabled in c6953d1ff6 because
a recent Nixpkgs bump brought in a new systemd which changed how
systemd-nspawn worked.

As far as I can tell, the issue was caused by this upstream systemd
commit:
b71a0192c0

Bind-mounting the host's `/sys` and `/proc` into the container's
`/run/host/{sys,proc}` fixes the issue and allows the test to succeed.

(cherry picked from commit 883092e3f78d4efb1066a2e24e343b307035a04c)
2023-09-20 17:03:47 +00:00
Eelco Dolstra
b6b2a0aea9 Use "touch -h"
https://hydra.nixos.org/build/235888160

This is needed because Nixpkgs now contains dangling symlinks
(pkgs/test/nixpkgs-check-by-name/tests/symlink-invalid/pkgs/by-name/fo/foo/foo.nix).
2023-09-19 17:21:07 +02:00
Eelco Dolstra
c6953d1ff6 Disable systemd-nspawn test
This is broken because of a change in systemd in NixOS 23.05. It fails
with

  Failed to mount proc (type proc) on /proc (MS_NOSUID|MS_NODEV|MS_NOEXEC ""): Operation not permitted
2023-09-19 17:03:21 +02:00
be3362e747 Fix nix-copy test 2023-08-30 19:35:02 -04:00
Robert Hensing
0e3a7e34a0
Merge pull request #8506 from corngood/ssh-master
Pass NIX_SSHOPTS when checking for an ssh master connection.
2023-07-18 15:47:57 +02:00
Jean-François Roche
80c9259756 Allow to sign path as unprivileged user
User can now sign path as unprivileged/allowed user

refs #1708
2023-06-27 18:31:31 +02:00
Eelco Dolstra
1ad3328c5e Allow tarball URLs to redirect to a lockable immutable URL
Previously, for tarball flakes, we recorded the original URL of the
tarball flake, rather than the URL to which it ultimately
redirects. Thus, a flake URL like
http://example.org/patchelf-latest.tar that redirects to
http://example.org/patchelf-<revision>.tar was not really usable. We
couldn't record the redirected URL, because sites like GitHub redirect
to CDN URLs that we can't rely on to be stable.

So now we use the redirected URL only if the server returns the
`x-nix-is-immutable` or `x-amz-meta-nix-is-immutable` headers in its
response.
2023-06-13 14:17:45 +02:00
David McFarland
5454fdcceb Add test of explicit ssh control path in nix-copy test
This highlights a problem caused by SSHMaster::isMasterRunning returning
false when NIX_SSHOPTS contains -oControlPath.
2023-06-13 00:54:52 -03:00
Alexander Bantyev
992e2ed0cf
Add a test for ControlMaster 2023-05-17 11:34:45 +04:00
Alexander Bantyev
85a2d1d94f
Add a test for nix copy over ssh
Check that nix copy can copy stuff, refuses to copy unsigned paths by
default, and doesn't hide the ssh password prompt.
2023-03-22 09:45:08 +04:00
49fd72a903
Make /etc writability conditional on uid-range feature 2023-02-14 13:55:41 +01:00
ad1f61c39b
container test: make /etc writable 2023-02-14 12:26:40 +01:00
Eelco Dolstra
67451d8ed7
Merge pull request #7802 from edolstra/fix-7783
Fix PID namespace support check
2023-02-10 20:41:13 +01:00
Eelco Dolstra
a21405a4e8 Add regression test 2023-02-10 17:51:44 +01:00
Robert Hensing
7908a41631
tests/authorization: Simplify assertion
Co-authored-by: Théophane Hufschmitt <7226587+thufschmitt@users.noreply.github.com>
2023-02-10 13:03:24 +01:00
72b18f05a2 Add a basic daemon authorization test 2023-02-07 16:43:09 +01:00
261c25601d Use the official, documented NixOS runTest interface 2023-01-20 16:23:52 +01:00
74026bb101 tests: Move NixOS tests to tests/nixos
This will allow contributors to find them more easily.
2023-01-20 15:33:13 +01:00