check goals for timeouts first, and their activity fds only if no
timeout has occurred. checking for timeouts *after* activity sets
us up for assertion failures by running multiple build completion
notifiers, the first of which will kill/reap the the goal process
and consuming the Pid instance. when the second notifier attempts
to do the same it will core dump with an assertion failure in Pid
and take down not only the single goal, but the entire daemon and
all goals it was building. luckily this is rare in practice since
it requires a build to both finish and time out at the same time.
writing a test for this is not feasible due to how much it relies
on scheduling to actually trigger the underlying bug, but on idle
machines it can usually be triggered by running multiple sleeping
builds with timeout set to the sleep duration and `--keep-going`:
nix-build --timeout 10 --builders '' --keep-going -E '
with import <nixpkgs> {};
builtins.genList
(i: runCommand "foo-${toString i}" {} "sleep 10")
100
'
Change-Id: I394d36b2e5ffb909cf8a19977d569bbdb71cb67b
The `builder` local variable and duplicate `args.push_back` are no
longer required since the Darwin sandbox stopped using `sandbox-exec`.
The `drv->isBuiltin` check is not required either, as args are not
accessed when the builder is builtin.
Change-Id: I80b939bbd6f727b01793809921810ff09b579d54
Seccomp filtering and the no-new-privileges functionality improve the security
of the sandbox, and have been enabled by default for a long time. In
#265 it was decided that they
should be enabled unconditionally. Accordingly, remove the allow-new-privileges
(which had weird behavior anyway) and filter-syscall settings, and force the
security features on. Syscall filtering can still be enabled at build time to
support building on architectures libseccomp doesn't support.
Change-Id: Iedbfa18d720ae557dee07a24f69b2520f30119cb
This breaks downstreams linking to us on purpose to make sure that if
someone is linking to Lix they're doing it on purpose and crucially not
mixing up Nix and Lix versions in compatibility code.
We still need to fix the internal includes to follow the same schema so
we can drop the single-level include system entirely. However, this
requires a little more effort.
This adds pkg-config for libfetchers and config.h.
Migration path:
expr.hh -> lix/libexpr/expr.hh
nix/config.h -> lix/config.h
To apply this migration automatically, remove all `<nix/>` from
includes, so: `#include <nix/expr.hh>` -> `#include <expr.hh>`. Then,
the correct paths will be resolved from the tangled mess, and the
clang-tidy automated fix will work.
Then run the following for out of tree projects:
```
lix_root=$HOME/lix
(cd $lix_root/clang-tidy && nix develop -c 'meson setup build && ninja -C build')
run-clang-tidy -checks='-*,lix-fixincludes' -load=$lix_root/clang-tidy/build/liblix-clang-tidy.so -p build/ -fix src
```
Related: lix-project/nix-eval-jobs#5
Fixes: #279
Change-Id: I7498e903afa6850a731ef8ce77a70da6b2b46966
Move the identical static `chmod_` functions in libstore to
libutil. the function is called `chmodPath` instead of `chmod`
as otherwise it will shadow the standard library chmod in the nix
namespace, which is somewhat confusing.
Change-Id: I7b5ce379c6c602e3d3a1bbc49dbb70b1ae8f7bad
having the serializer write into `*conn` is not legal because we are
in a sinkToSource that will be drained by the remote we're connected
to. writing into `*conn` directly can break the framing protocol. it
is unlikely this code was ever run: to protocol it caters to is from
2016(!) and thoroughly untested in-tree, and since it's been present
since nix 2.17 and the 1.18 protocol broken here is nix 2.0 we might
safely assume that daemons older than nix 2.1 are no longer used now
see also #325 (though that wants <2.3 gone, this is sadly only <2.1)
Change-Id: I9d674c18f6d802f61c5d85dfd9608587b73e70a5
On several occasions I've found myself confused when trying to delete
a store path, because I am told it's still alive, but
nix-store --query --roots doesn't show anything. Let's save future
users this confusion by mentioning that a path might be alive due to
having referrers, not just roots.
(cherry picked from commit 979a019014569eee7d0071605f6ff500b544f6ac)
Upstream-PR: https://github.com/NixOS/nix/pull/10733
Change-Id: I54ae839a85f3de3393493fba27fd40d7d3af0516
Example: /nix/store/dr53sp25hyfsnzjpm8mh3r3y36vrw3ng-neovim-0.9.5^out
This is nonsensical since selecting outputs can only be done for a
buildable derivation, not for a realised store path. The build worker
side of things ends up crashing with an assertion when trying to handle
such malformed paths.
Change-Id: Ia3587c71fe3da5bea45d4e506e1be4dd62291ddf
it's no longer used. it really shouldn't have existed this long since it
was just a mashup of both std::promise and std::packaged_task in a shape
that makes composition unnecessarily difficult. all but a single case of
Callback pattern calls were fully synchronous anyway, and even this sole
outlier was by far not important enough to justify the extra complexity.
Change-Id: I208aec4572bf2501cdbd0f331f27d505fca3a62f
also add a few more tests for exception propagation behavior. using
packaged_tasks and futures (which only allow a single call to a few
of their methods) introduces error paths that weren't there before.
Change-Id: I42ca5236f156fefec17df972f6e9be45989cf805
this is the *only* real user of file transfer download completion
callbacks, and a pretty spurious user at that (seeing how nothing
here is even turned on by default and indeed a dependency of path
substitution which *isn't* async, and concurrency-limited). it'll
be a real pain to keep this around, and realistically it would be
a lot better to overhaul substitution in general to be *actually*
async. that requires a proper async framework footing though, and
we don't have anything of the sort, but it's also blocking *that*
Change-Id: I1bf671f217c654a67377087607bf608728cbfc83
The fix for the Darwin vulnerability in ecdbc3b207
also broke setting `__sandboxProfile` when `sandbox=relaxed` or
`sandbox=false`. This cppnix change fixes `sandbox=relaxed` and
adds a suitable test.
Co-Authored-By: Artemis Tosini <lix@artem.ist>
Co-Authored-By: Eelco Dolstra <edolstra@gmail.com>
Change-Id: I40190f44f3e1d61846df1c7b89677c20a1488522
Because of an objc quirk[1], calling curl_global_init for the first time
after fork() will always result in a crash.
Up until now the solution has been to set
OBJC_DISABLE_INITIALIZE_FORK_SAFETY for every nix process to ignore
that error.
This is less than ideal because we were setting it in package.nix,
which meant that running nix tests locally would fail because
that variable was not set.
Instead of working around that error we address it at the core -
by calling curl_global_init inside initLibStore, which should mean
curl will already have been initialized by the time we try to do so in
a forked process.
[1] 01edf1705f/runtime/objc-initialize.mm (L614-L636)
Change-Id: Icf26010a8be655127cc130efb9c77b603a6660d0
only two users of this function exist. only one used it in a way that
even bears resemblance to asynchronicity, and even that one didn't do
it right. fully async and parallel computation would have only worked
if any getEdgesAsync never calls the continuation it receives itself,
only from more derived callbacks running on other threads. calling it
directly would cause the decoupling promise to be awaited immediately
*on the original thread*, completely negating all nice async effects.
Change-Id: I0aa640950cf327533a32dee410105efdabb448df
this seems to be an oversight, considering that regular substitutions
are concurrency-limited. while not particularly necessary at present,
once we've removed the `Callback` based interfaces it will be needed.
Change-Id: Ide2d08169fcc24752cbd07a1d33fb8482f7034f5
When /nix/var (or, more precisely, NIX_STATE_DIR) does not exist at all,
Lix falls back to creating an adhoc chroot store in XDG_DATA_HOME.
b247ef72d[1] changed the way Store classes are initialized, and in the
migration, a `params2` was accidentally changed to `params`. This commit
restores the correct behavior, and in lieu of a single *character* fix,
this commit also changes the variable name to something more reasonable.
Fixes#274.
[1]: b247ef72dc
n.b., this code might deserve some more looking at anyway. this fallback
store creation throws away *all* Store params passed to
openFromNonUri() in favor of an entirely new set which only contains
the `root` param, which may or may not be the correct behavior
Change-Id: Ibea559b88a50e6d6e75a1f87d9d7816cabb2a8f3
returning 0 from the callback for errors signals successful transfer if
the source returned no data even though the exception we've just caught
clearly disagrees. while this is not all that important (since the only
viable cause of such errors will be dataCallback, and the sole instance
of it being used already takes care of exceptions) we can just do this.
Change-Id: I2bb150eff447121d82e8e3aa4e00057c40523ac6
this will be necessary if we want download() to return a source instead
of consuming a sink, which will in turn be needed to remove coroutines.
Change-Id: I34ec241e9bbc5d32fbcd243b244e29c3757533aa
not doing this will cause transfers that had their readers disappear to
linger. with lingering transfers the curl thread can't shut down, which
will cause nix itself to not shut down until the transfer finishes some
other way (most likely network timeouts). also add a new test for this.
Change-Id: Id2401b3ac85731c824db05918d4079125be25b57
This was found when `logrotate.conf` failed to build in a NixOS system
with:
/nix/store/26zdl4pyw5qazppj8if5lm8bjzxlc07l-coreutils-9.3/bin/id: cannot find name for group ID 30000
This was surprising because it seemed to mean that /etc/group was busted
in the sandbox. Indeed it was:
root❌0:
nixbld:!💯
nogroup❌65534:
We diagnosed this to sandboxUid() being called before
usingUserNamespace() was called, in setting up /etc/group inside the
sandbox. This code desperately needs refactoring.
We also moved the /etc/group code to be with the /etc/passwd code, but
honestly this code is all spaghetti'd all over the place and needs some
more serious tidying than we did here.
We also moved some checks to be earlier to improve locality with where
the things they are checking come from.
Change-Id: Ie29798771f3593c46ec313a32960fa955054aceb
With Linux kernel >=6.6 & glibc 2.39 a `fchmodat2(2)` is available that
isn't filtered away by the libseccomp sandbox.
Being able to use this to bypass that restriction has surprising results
for some builds such as lxc[1]:
> With kernel ≥6.6 and glibc 2.39, lxc's install phase uses fchmodat2,
> which slips through 9b88e52846/src/libstore/build/local-derivation-goal.cc (L1650-L1663).
> The fixupPhase then uses fchmodat, which fails.
> With older kernel or glibc, setting the suid bit fails in the
> install phase, which is not treated as fatal, and then the
> fixup phase does not try to set it again.
Please note that there are still ways to bypass this sandbox[2] and this is
mostly a fix for the breaking builds.
This change works by creating a syscall filter for the `fchmodat2`
syscall (number 452 on most systems). The problem is that glibc 2.39
is needed to have the correct syscall number available via
`__NR_fchmodat2` / `__SNR_fchmodat2`, but this flake is still on
nixpkgs 23.11. To have this change everywhere and not dependent on the
glibc this package is built against, I added a header
"fchmodat2-compat.hh" that sets the syscall number based on the
architecture. On most platforms its 452 according to glibc with a few
exceptions:
$ rg --pcre2 'define __NR_fchmodat2 (?!452)'
sysdeps/unix/sysv/linux/x86_64/x32/arch-syscall.h
58:#define __NR_fchmodat2 1073742276
sysdeps/unix/sysv/linux/mips/mips64/n32/arch-syscall.h
67:#define __NR_fchmodat2 6452
sysdeps/unix/sysv/linux/mips/mips64/n64/arch-syscall.h
62:#define __NR_fchmodat2 5452
sysdeps/unix/sysv/linux/mips/mips32/arch-syscall.h
70:#define __NR_fchmodat2 4452
sysdeps/unix/sysv/linux/alpha/arch-syscall.h
59:#define __NR_fchmodat2 562
I added a small regression-test to the setuid integration-test that
attempts to set the suid bit on a file using the fchmodat2 syscall.
I confirmed that the test fails without the change in
local-derivation-goal.
Additionally, we require libseccomp 2.5.5 or greater now: as it turns
out, libseccomp maintains an internal syscall table and
validates each rule against it. This means that when using libseccomp
2.5.4 or older, one may pass `452` as syscall number against it, but
since it doesn't exist in the internal structure, `libseccomp` will refuse
to create a filter for that. This happens with nixpkgs-23.11, i.e. on
stable NixOS and when building Lix against the project's flake.
To work around that
* a backport of libseccomp 2.5.5 on upstream nixpkgs has been
scheduled[3].
* the package now uses libseccomp 2.5.5 on its own already. This is to
provide a quick fix since the correct fix for 23.11 is still a staging cycle
away.
We still need the compat header though since `SCMP_SYS(fchmodat2)`
internally transforms this into `__SNR_fchmodat2` which points to
`__NR_fchmodat2` from glibc 2.39, so it wouldn't build on glibc 2.38.
The updated syscall table from libseccomp 2.5.5 is NOT used for that
step, but used later, so we need both, our compat header and their
syscall table 🤷
Relevant PRs in CppNix:
* https://github.com/NixOS/nix/pull/10591
* https://github.com/NixOS/nix/pull/10501
[1] https://github.com/NixOS/nixpkgs/issues/300635#issuecomment-2031073804
[2] https://github.com/NixOS/nixpkgs/issues/300635#issuecomment-2030844251
[3] https://github.com/NixOS/nixpkgs/pull/306070
(cherry picked from commit ba6804518772e6afb403dd55478365d4b863c854)
Change-Id: I6921ab5a363188c6bff617750d00bb517276b7fe
Currently LocalDerivationGoal allows setting `__sandboxProfile`
to add sandbox parameters on Darwin when `sandbox=true`.
This was only supposed to have an effect when `sandbox=relaxed`
Change-Id: Ide44ee82d7e4d6b545285eab26547e7014817d3f
As discussed in the maintainer meeting on 2024-01-29.
Mainly this is to avoid a situation where the name is parsed and
treated as a file name, mostly to protect users.
.-* and ..-* are also considered invalid because they might strip
on that separator to remove versions. Doesn't really work, but that's
what we decided, and I won't argue with it, because .-* probably
doesn't seem to have a real world application anyway.
We do still permit a 1-character name that's just "-", which still
poses a similar risk in such a situation. We can't start disallowing
trailing -, because a non-zero number of users will need it and we've
seen how annoying and painful such a change is.
What matters most is preventing a situation where . or .. can be
injected, and to just get this done.
(cherry picked from commit f1b4663805a9dbcb1ace64ec110092d17c9155e0)
Change-Id: I900a8509933cee662f888c3c76fa8986b0058839
This replaces the external sandbox-exec call with direct calls into
libsandbox. This API is technically deprecated and is missing some
prototypes, but all major browsers depend on it, so it is unlikely to
materially change without warning.
This commit also ensures the netrc file is only written if the
derivation is in fact meant to be able to access the internet.
This change commits a sin of not actually actively declaring its
dependency on macOS's libsandbox.dylib; this is due to the dylib
cache in macOS making that explicit dependency unnecessary. In the
future this might become a problem, so this commit marks our sins.
Co-authored-by: Artemis Tosini <lix@artem.ist>
Co-authored-by: Lunaphied <lunaphied@lunaphied.me>
Change-Id: Ia302141a53ce7b0327c1aad86a117b6645fe1189
That's expected by `build-remote` and makes sure that errors are
correctly forwarded to the user. For instance, let's say that the
host-key of `example.org` is unknown and
nix-build ../nixpkgs -A hello -j0 --builders 'ssh-ng://example.org'
is issued, then you get the following output:
cannot build on 'ssh-ng://example.org?&': error: failed to start SSH connection to 'example.org'
Failed to find a machine for remote build!
derivation: yh46gakxq3kchrbihwxvpn5bmadcw90b-hello-2.12.1.drv
required (system, features): (x86_64-linux, [])
2 available machines:
[...]
The relevant information (`Host key verification failed`) ends up in the
daemon's log, but that's not very obvious considering that the daemon
isn't very chatty normally.
This can be fixed - the same way as its done for legacy-ssh - by passing
fd 4 to the SSH wrapper. Now you'd get the following error:
cannot build on 'ssh-ng://example.org': error: failed to start SSH connection to 'example.org': Host key verification failed.
Failed to find a machine for remote build!
[...]
...and now it's clear what's wrong.
Please note that this is won't end up in the derivation's log.
For previous discussion about this change see
https://github.com/NixOS/nix/pull/7659.
Change-Id: I5790856dbf58e53ea3e63238b015ea06c347cf92
only decompress the response once all data has been received (in the
fully buffered case), or at least outside of the curl wrapper itself
(in the receive-to-sink case). unfortunately this means we will have
to duplicate decompression logic for these two cases for time being,
but once the curl wrapper has been rewritten to return a real future
or Source we can deduplicate this logic again. the curl wrapper will
have to turn into a proper Source first and use decompression source
logic which also does not currently exist—only decompression *sinks*
Change-Id: I66bc692f07d9b9e69fe10689ee73a2de8d65e35c
this is highly questionable. single-arg download calls will misbehave
with it set, and two-arg download calls will just overwrite it. being
an implementation detail this should not have been in the API at all.
Change-Id: I613772951ee03d8302366085f06a53601d13f132
this lets each implementation of FileTransfer (of which currently only
the one exists at all) implement appropriate handling for its internal
behaviours that are not otherwise exposed. in curl this lets us switch
the buffer-full handling method from "block the entire curl thread" to
"pause just the one transfer", move the non-libcurl body decompression
out of the actual curl wrapper (which will let us eventually morph the
curl wrapper intto an actual source of Sources), and some other things
Change-Id: Id6d3593cde6b4915aab3e90a43b175c103cc3f18
Previously, the garbage collector found runtime roots on Darwin by
shelling out to `lsof -n -w -F n` then parsing the result.
However, this requires an lsof binary and can be extremely slow.
The official Apple lsof returns in a reasonable amount of time,
about 250ms in my tests, but the lsof packaged in nixpkgs is quite slow,
taking about 40 seconds to run the command.
Using libproc directly is about the same speed as Apple lsof,
and allows us to reënable several tests that were disabled on Darwin.
Change-Id: Ifa0adda7984e13c15535693baba835aae79a3577
just accumulate error data into result.data as we would for successful
transfers without a dataCallback. errorSink and data would contain the
same data in error cases anyway, so splitting them is not very useful.
Change-Id: I00e449866454389ac6a564ab411c903fd357dabf
This creates new subclasses of LocalStore for each OS to include
platform-specific functionality. Currently this just includes garbage
collector roots but it could be extended to sandboxing as well.
In order to make sure that the generic LocalStore is not accidentally
constructed, its constructor is protected. A Fallback is provided which
implements no functionality except constructors.
Change-Id: I836a28e90b68309873f75afb83e0f1b2e2c89fb3
This should fix cross compilation in the base case, but this is
difficult to test as cross compilation is broken in many different
places right now. This should bring Meson back up to cross parity with
the Make buildsystem though.
Change-Id: If09be8142d1fc975a82b994143ff35be1297dad8
don't reimplement header parsing. this was only really needed due to the
ancient github bug we no longer care about, everything else we have done
in custom code can also be done using curl itself. doing this also fixes
possible sources of header smuggling (because the header function didn't
unfold headers and we'd trim them before parsing, which would've made us
read contents of one header as a fully formed header in itself). this is
a slight behavior change because we now honor only the first instance of
a given header where previous behavior was to honor either the last or a
combination of all of them (accept-ranges was logical-or'd by accident).
Change-Id: I93cb93ddb91ab98c8991f846014926f6ef039fdb
this was a workaround for a *github* bug that happend *in 2015*.
not only is github no longer buggy, it shouldn't have been nix's
responsibility to work around these bugs like this to begin with
while we're at it we'll also remove another workaround—again for
github specifically and again for etag handling—from 2021 that's
also not needed any more. future workarounds for serverside bugs
should probably come with an expiration date that mutates into a
build warning after a while, otherwise this *will* happen again.
Change-Id: I74f739ae3e36d40350f78bebcb5869aa8cc9adcd
the previous solution to the wakeup problem (adding a pipe and passing
it as an additional fd to curl_multi_wait) worked, but there have been
builtin alternatives for this since 2020. not only do these save code,
they're also a lot more likely to work natively on windows when needed
Change-Id: Iab751b900997110a8d15de45ea3ab0c42f7e5973
the oldest version checked for here is 7.47, which was released in
2016. it's probably safe to say that we do not need these any more
Change-Id: I003411f6b2ce6d56f7ca337390df3ea86bd59a99
With Nix 2.3, it was possible to pass a subpath of a store path to
exportReferencesGraph:
with import <nixpkgs> {};
let
hello = writeShellScriptBin "hello" ''
echo ${toString builtins.currentTime}
'';
in
writeClosure [ "${hello}/bin/hello" ]
This regressed with Nix 2.4, with a very confusing error message, that
presumably indicates it was unintentional:
error: path '/nix/store/3gl7kgjr4pwf03f0x70dgx9ln3bhl7zc-hello/bin/hello' is not in the Nix store
(cherry picked from commit 0774e8ba33c060f56bad3ff696796028249e915a)
Change-Id: I00920fb33077b831a1bb4a1b68d515ba8c3c2a69
The statically embedded busybox is not required for Lix to work, but
package.nix explicitly sets this, which was accidentally being ignored.
Change-Id: Ieeff830ac7d1f5fabe84d1a6cfd82f13d79035bf
Either the contents of `line` could cause format errors, or this usage
is Technically safe. However, I trust nothing, especially with
boost::format.
Change-Id: I07933b20bde3b305a6e5d61c2a7bab6ecb042ad9
Previously if isStorePath() was called on anything other than a
top-level /nix/store/some-path, it would throw a BadStorePath exception.
This commit duplicates the absolutely trivial check, into
maybeParseStorePath(), and leaves exception throwing to
parseStorePath(), the function that assumes you're already giving a
valid path instead of the one whose purpose is to check if its valid or
not...
Change-Id: I8dda548f0f88d14ca8c3ee927d64e0ec0681fc7b
Saves us a bunch of thinking about how to handle symlinks, and prevents
the DNS config from changing on the fly under the build, which may or may
not be a good thing?
Change-Id: I071e6ae7e220884690b788d94f480866f428db71
this should be a link, not an anchor. it should also point to the
`gloss-store` element, not the `#gloss-store` element.
Change-Id: I1f2803093179549637e10f917ad73399a419131b
Instead of $sysconfdir.
Fixes#231, but there's more to do in following commits to make
Meson-built Lix actually look in /etc/nix.
Change-Id: Ia8d627070f405843add46e05cff5134b76b8eb48
This reverts commit 491caad6f62c21ffbcdebe662e63ec0f72e6f3a2.
this is not actually legal for nix! throwing exceptions in destructors
is fine, but the way nix is set up we'll end up throwing the exception
we received from the remote *twice* in some cases, and such cases will
cause an immediate terminate without active exception.
Change-Id: I74c46b9f26fd791086e4193ec60eb1deb9a5bb2a
setting this only on exceptions caused by actual fd access is not
sufficient to diagnose all errors (such as SerialisationError) in
some cases. this usually does not have any negative effects since
those errors will end up killing the process in another way. this
is not a reliable assumption though and we should be using proper
error handling (and closing connections more often, preferring to
close over keeping something open that might be in a weird state)
Change-Id: I1b792cd7ad8ba9ff0f6bd174945ab2575ff2208e
the duplication of exception handling was added without justification,
so we can only assume that it was done like this because Finally could
not throw exceptions safely. since this has now been rectified we will
deduplicate this handler code again.
Change-Id: I40721f3378c0fd9f34e2914a16d383f6e2713b40
usage of this flag previously kept connections open much longer than
necessary, and at the same time obscured that a connection was being
dropped when it *was* set. new variable names clarify this somewhat.
Change-Id: I11f6f08f37a5e4dc04ea6c6036ea589154b121c6
it was used incorrectly (not swapped on handle move), only used in one
place (that is now handled with exception handling detection in Handle
itself), and if ever reintroduced should be replaced with a different,
more understandable mechanism (like an explicit dropAsInvalid method).
Change-Id: Ie3e5d5cfa81d335429cb2ee5c3ad85c74a9df17b