not doing this can freeze slots until the goal that occupied them is
freed (rather than simply complete), and then can freeze the system.
fixes#549
Change-Id: I042df04222f8ffbaa18ef4a4eae6cbd6f89b679e
Daemon client handler processes are forked off of the main nix process
and thus will not have a signal handler thread anymore. This leads to a
high likelihood of bustage, since the Worker infrastructure expects the
interrupt infrastructure to actually, you know, work, to be able to get
interrupted.
The expected behaviour after fork is either:
- Start a signal handler thread if you expect to do complicated things
that need ReceiveInterrupts.
- Call restoreProcessContext and don't handle signals specially
otherwise.
Change-Id: I73d36b5bbf96dddd21d5e1c3bd0484d715c00e8b
This is done because the one in nixpkgs has several problems and we
don't want to conflict with them fixing those problems:
- not building shared libs
- not building debuginfo (more critical for us due to
#549)
- not setting the correct cmake build type
- not setting the correct cxxflags to build the coroutine library
- not building with clang (required for the coroutine library since gcc
is known to miscompile coroutines *in kj*).
CC: #551
Fixes: #550
Change-Id: Ia5b78dc1809963fdd1a8203b127a216cb575d751
This workaround was for the stdenv not being set when callPackage'ing
package.nix for some of the stranger CI outputs.
Change-Id: I2acdd6efa721b90dd3cb04358544d25d591ff084
The old behavior results in lots of concatenations happening for no good
reason and is an artifact of the technical limitations of the old parser
(combined with some lack of care for such details).
Change-Id: I0d78d6220ca6aeaa10bc437e48e08bf7922e0bb3
This is only a minor semantical distinction, but we should be able to
properly test it, and the parser tests rely on show for that.
Change-Id: I25e868cf9544e30cdff17deb5fd50a434e0f367e
This commit should faithfully reproduce the old behavior down to the
bugs. The new code is a lot more readable, all quirks are well
documented, and it is overall much more maintainable.
Change-Id: I629585918e4f2b7d296b6b8330235cdc90b7bade
Some settings only make sense on particular platforms, or only when a certain
experimental feature is enabled. Several of those were already conditionally
available. Do the same for a bunch more instead of silently ignoring them.
Exceptionally, the use-case-hack setting is not made conditional because it is
included in the test suite.
Change-Id: I29e66ad8ee6178a7c0eff9efb55c3410fae32514
The build hook is still running locally, so it will run with the same default
settings. Hence, just as with the daemon, it is enough to send it only the
overridden settings. This will prevent warnings like
warning: Ignoring setting 'auto-allocate-uids' because experimental feature 'auto-allocate-uids' is not enabled
when the user didn't actually set those settings.
This is inspired by and an alternative to [0].
[0] https://github.com/NixOS/nix/pull/10049
Change-Id: I77ea62cd017614b16b55979dd30e75f09f860d21
Only overridden settings are sent to the daemon, and we're going to do the same
for the build hook to. It needs to be ensured that overridden settings are in
fact consistently marked as such, so that they actually get sent.
Change-Id: I7cd58d925702f86cf2c35ad121eb191ceb62a355
Before this change, expressions like:
with import <nixpkgs> {};
runCommand "foo" {} ''
echo '@nix {}' >&$NIX_LOG_FD
''
would result in Lix crashing, because accessing nonexistent fields of
a JSON object throws an exception.
Rather than handling each field individually, we just catch JSON
exceptions wholesale. Since these log messages are an unusual
circumstance, log a warning when this happens.
Fixes#544.
Change-Id: Idc2d8acf6e37046b3ec212f42e29269163dca893
By implementing the `PathSet` specialization for `PathsSetting`, we'll
be able to use `PathsSetting` for the `sandboxPaths` setting in
`src/libstore/globals.hh`.
Fixes: #498
Change-Id: I8bf7dfff98609d1774fdb36d63e57d787bcc829f
This failed due to https://github.com/NixOS/nixpkgs/pull/346945, which
makes a second lowdown-unsandboxed that works in nix builds, and the
regular lowdown has executables that fail closed when the sandbox setup
fails.
The actual failure here is only visible on nixos-unstable at the moment,
not 24.05, but this commit should fix it up for all versions.
Fixes: #547
Change-Id: I50c0ecb59518ef01a7c0181114c1b4c5a7c6b78b
This was always a terrible idea independently of whether it crashes.
Stop doing it!
This commit was verified by running nix-shell on a trivial derivation
with --debug --verbose to get the vomit-level output of the shell rc
file and then diffing it before/after this change. I have reasonable
confidence it did not regress anything, though this code is genuinely
really hard to follow (which is a second reason that I split it into two
fmt calls).
Fixes: #533
Change-Id: I8e11ddbece2b12749fda13efe0b587a71b00bfe5
This change feels kind of gross and reveals a fair bit about the
disorganization of our tests, but I think it makes parts of it a bit
better.
Change-Id: Idb8d9a00cbd75d5c156678c6b408b42b59d5e4d7
A better fix than in 104448e75d, hence a
revert + the fix.
It turns out that this commit has the side-effect that when having e.g.
`StrictHostKeyChecking=accept-new` for a remote builder, the warnings à la
Warning: Permanently added 'builder' (ED25519) to the list of known hosts.
actually end up in the derivation's log whereas hostkey verification
errors don't, but only in the stderr of the `nix-build` invocation
(which was the motivation for the patch).
This change writes the stderr from the build-hook to
* the daemon's stderr, so that the SSH errors appear in the journal
(which was the case before 104448e75d)
* the client's stderr, as a log message
* NOT to the drv log (this is handled via `handleJSONLogMessage`)
I tried to fix the issue for legacy-ssh as well, but failed and
ultimately decided to not bother.
I know that we'll sooner or later replace the entire component, however
this is the part of the patch I have working for a while, so I figured I
might still submit it for the time being.
Change-Id: I21ca1aa0d8ae281d2eacddf26e0aa825272707e5
When the git default branch is not set to master the installcheck
test suite fails. This patch adjusts the test setup scripts to
ignore the system and user git config files.
GIT_CONFIG_SYSTEM is set to /dev/null to ignore /etc/gitconfig
GIT_CONFIG_GLOBAL is not set because the global config files
are loaded from $HOME or $XDG_CONFIG_HOME which we already
reset.
git documentation: https://git-scm.com/docs/git#Documentation/git.txt-codeGITCONFIGGLOBALcode
Change-Id: Ie73bbed1db9419c9885b9d57e4edb7a4047d5cce
While debugging something else I observed that latest `main` ignores
`Control-C` on `sudo nix-build`.
After reading through the capnproto docs, it seems as if the promise
must be fulfilled to actually terminate the `promise.wait()` below.
This also applies to scenarios such as stopping the client
(`nix-build`), but the builders on the daemon-side are still running,
i.e. closes#540
Co-authored-by: eldritch horrors <pennae@lix.systems>
Change-Id: I9634d14df4909fc1b65d05654aad0309bcca8a0a
So we received a report that the thread pool crashed due to an
Interrupted exception.
Relevant log tail:
copying path '/nix/store/0kal2k73inviikxv9f1ciaj39lkl9a87-etc-os-release' to 'ssh://192.168.0.27'...
Lix crashed. This is a bug. We would appreciate if you report it along with what caused it at https://git.lix.systems/lix-project/lix/issues with the following information included:
error (ignored): error: interrupted by the user
Exception: nix::Interrupted: error: interrupted by the user
Relevant stack trace:
4# __cxa_rethrow in /nix/store/22nxhmsfcv2q2rpkmfvzwg2w5z1l231z-gcc-13.3.0-lib/lib/libstdc++.so.6
5# nix::ignoreExceptionExceptInterrupt(nix::Verbosity) in /nix/store/ghxr2ykqc3rrfcy8rzdys0rzx9ah5fqj-lix-2.92.0-dev-pre20241005-ed9b7f4/lib/liblixutil.so
6# nix::ThreadPool::doWork(bool) in /nix/store/ghxr2ykqc3rrfcy8rzdys0rzx9ah5fqj-lix-2.92.0-dev-pre20241005-ed9b7f4/lib/liblixutil.so
7# 0x00007FA7A00E86D3 in /nix/store/22nxhmsfcv2q2rpkmfvzwg2w5z1l231z-gcc-13.3.0-lib/lib/libstdc++.so.6
8# 0x00007FA79FE99A42 in /nix/store/3dyw8dzj9ab4m8hv5dpyx7zii8d0w6fi-glibc-2.39-52/lib/libc.so.6
9# 0x00007FA79FF1905C in /nix/store/3dyw8dzj9ab4m8hv5dpyx7zii8d0w6fi-glibc-2.39-52/lib/libc.so.6
Notably, this is *not* in the main thread, so this implies that the
thread didn't get joined properly before their destructors got called.
That, in turn, should have only possibly happened because join() threw
on a previous iteration of the loop joining threads, I think. Or if it
threw while in the ThreadPool destructor. Either way we had better stop
letting Interrupted fall out of our child threads!
If:
- Interrupted was thrown inside the action in the main thread: it would
have fallen out of doWork if state->exception was already set and got
caught by ThreadPool::process, calling shutdown() and the join loop
which would crash the process entirely.
- Interrupted was thrown inside the action on a secondary thread: it
would have been caught and put into the exception field and then
possibly rethrown to fall out of the thread (since it was previously
ignoreExceptionExceptInterrupt).
The one possible hole in this hypothesis is that there is an "error
(ignored)" line in there implying that at least one Interrupted got
eaten by an ignoreExceptionInDestructor. It's also unclear whether this
got reordered because of stderr buffering.
Fixes: #542
Change-Id: I322cf050da660af78f5cb0e08ec6e6d27d09ac76
There is absolutely no good reason these should show up in NARs besides
misconfigured systems and as long as the case hack exists, unpacking
such a NAR will cause its repacking to be wrong on systems with case
hack enabled.
This should not have any security impact on Lix to fix, but it was one
of the vectors for CVE-2024-45593:
https://github.com/NixOS/nix/security/advisories/GHSA-h4vv-h3jq-v493
Change-Id: I85b6075aacc069ee7039240b0f525804a2d8edcb
This is capped at 12 because 3.7 seconds of startup is painful enough
and 5.5 seconds with 24 was more annoying.
Change-Id: I327db40fd98deaa5330cd9cf6de99fb07b2c1cb0
This also rewrites a lot of the command handling in the fixtures
library, since we want to more precisely control which way that the nix
store is set up in the tests, rather than the previous method of
renaming /nix/store to some temp dir (which allows builds but does not
allow any /nix/store paths or stability across runs, which is a
significant issue for snapshot testing).
It uses a builder to reduce the amount of state carelessly thrown
around.
The evil NARs are inspired by CVE-2024-45593
(https://github.com/NixOS/nix/security/advisories/GHSA-h4vv-h3jq-v493).
No bugs were found in this endeavor.
Change-Id: Iee41b055fa96529c5a3c761f680ed1d0667ba5da
I am tired of bad shell scripts, let me write bad python quickly
instead. It's definitely, $100%, better.
This is not planned as an immediate replacement of the old test suite,
but we::jade would not oppose tests getting ported.
What is here is a mere starting point and there is a lot more
functionality that we need.
Fixes: #488
Change-Id: If762efce69030bb667491b263b874c36024bf7b6
I followed @pennae's advice and moved the constructor definition of
`AttrName` from the header file `nixexpr.hh` to `nixexpr.cc`.
Change-Id: I733f56c25635b366b11ba332ccec38dd7444e793
The approach that was taken here was to add default values to the type
definitions rather than specify them whenever they are missing.
Now the only remaining warning is '-Wunused-parameter' which @jade said
is usually counterproductive and that we can just disable it:
#456 (comment)
So this change adds the flags '-Wall', '-Wextra' and
'-Wno-unused-parameter', so that all warnings are enabled except for
'-Wunused-parameter'.
Change-Id: Ic223a964d67ab429e8da804c0721ba5e25d53012
There was a bug report about a potential call to `memcpy` with a null
pointer which is not reproducible:
#492
This occurred in `src/libstore/filetransfer.cc` in `InnerSource::read`.
To ensure that this doesn't happen, an early return is added before
calling `memcpy` if the length of the data to be copied is 0.
This change also adds a test that ensures that when `InnerSource::read`
is called with an empty file, it throws an `EndOfFile` exception.
Change-Id: Ia18149bee9a3488576c864f28475a3a0c9eadfbb
these two functions are now nearly trivial and much better inline into
makeGoalCommon. keeping them separate also separates information about
goal completion flows and how failure information ends up in `Worker`.
Change-Id: I6af86996e4a2346583371186595e3013c88fb082
we can use our newfound powers of Goal::work Is A Real Promise to remove
completed goals from continuation promises. apart from being much easier
to follow it's also a lot more efficient because we have the iterator to
the item we are trying to remove, skipping a linear search of the cache.
Change-Id: Ie0190d051c5f4b81304d98db478348b20c209df5
Goal::work() is a fully usable promise that does not rely on the worker
to report completion conditions. as such we no longer need the `notify`
field that enabled this interplay. we do have to clear goal caches when
destroying the worker though, otherwise goal promises may (incorrectly)
keep goals alive due to strong shared pointers created by childStarted.
Change-Id: Ie607209aafec064dbdf3464fe207d70ba9ee158a
derivation goals still hold a BuildResult member variable since parts of
these results of accumulated in different places, but the Goal class now
no longer has such a field. substitution goals don't need it at all, and
derivation goals should also be refactored to not drop their buildResult
Change-Id: Ic6d3d471cdbe790a6e09a43445e25bedec6ed446