this makes WorkResult copyable, and just all around easier to deal with.
in the future we'll need this to let Goal::work() return a promise for a
WorkResult (or even just a Finished) that can be awaited by other goals.
Change-Id: Ic5a1ce04c5a0f8e683bd00a2ed2b77a2e28989c1
this should be done where we're actually trying to build something, not
in the main worker loop that shouldn't have to be aware of such details
Change-Id: I07276740c0e2e5591a8ce4828a4bfc705396527e
This caused an absolute saga which I would not like anyone else to have
to experience. Let's put in a laser targeted error message that
diagnoses this exact problem.
Fixes: #484
Change-Id: I2a79f04aeb4a1b67c10115e5e39501d958836298
I don't know why the AWS sdk disabled it by default. It would be nice
to have test coverage of the s3 store or proxies, but neither currently
exist.
Fixes: #433
Change-Id: If1e76169a3d66dbec2e926af0d0d0eccf983b97b
There have been multiple setting types for paths that are supposed to be
canonicalised, depending on whether zero or one, one, or any number of paths is
to be specified. Naturally, they behaved in slightly different ways in the
code. Simplify things by unifying them and removing special behaviour (mainly
the "multiple paths type can coerce to boolean" thing).
Change-Id: I7c1ce95e9c8e1829a866fb37d679e167811e9705
this can be a proper WorkResult now. childTerminated is unfortunately a
lot more stubborn and won't be made private for quite a while yet. once
we can get rid of the Worker poll loop that *should* be possible though
Change-Id: I2218df202da5cb84e852f6a37e4c20367495b617
we'll need this once we want to pass extra information out of accepting
replies, such as fd sets or possibly even async output reader promises.
Change-Id: I5e2f18cdb80b0d2faf3067703cc18bd263329b3f
don't keep fds open we're not using. currently this does not cause any
problems, but it does increase the size of our fd table needlessly and
in the future, when we have proper async processing, having builderOut
open in the daemon once the hook has been fully started is problematic
Change-Id: I6e7fb773b280b042873103638d3e04272ca1e4fc
this is useless to do on the face of it, but it'll make it easier to
convert the entire output handling to use async io and promises soon
Change-Id: I2d1eb62c4bbf8f57bd558b9599c08710a389b1a8
only DerivationGoal can set the hook to anything at all. it always sets
buildOutFD to something that is not related to fromHook in any way, and
mixing the two would have rather dire consequences for log consistency.
Change-Id: Ida86727fd1cd5e1ecd78f07f3bde330a346658a8
all derivation goals need a log fd of some description. let's save this
single fd in a dedicated pointer field for all subclasses so that later
we have just the one spot to change if we turn this into async promises
Change-Id: If223adf90909247363fb823d751cae34d25d0c0b
we don't need to expose information about how busy a Worker is if the
worker can instead tell its work items whether they are in a slot. in
the future we might use this to not start items waiting for a slot if
no slots are currently available, but that requires more preparation.
Change-Id: Ibe01ac536da7e6d6f80520164117c43e772f9bd9
this is only used to close non-stdio files in derivation sandboxes. we
may as well encode that in its name, drop the unnecessary integer set,
and use close_range to deal with the actual closing of files. not only
is this clearer, it also makes sandbox setup on linux fast by 1ms each
Change-Id: Id90e259a49c7bc896189e76bfbbf6ef2c0bcd3b2
implementing a build hook is pretty much impossible without either being
a nix, or blindly forwarding the important bits of all build requests to
some kind of nix. we've found no uses of build-hook in the wild, and the
build-hook protocol (apart from being entirely undocumented) is not able
to convey any kind of versioning information between hook and daemon. if
we want to upgrade this infrastructure (which we do), this must not stay
Change-Id: I1ec4976a35adf8105b8ca9240b7984f8b91e147e
* changes:
sqlite: add a Use::fromStrNullable
util: implement charptr_cast
tree-wide: fix a pile of lints
refactor: make HashType and Base enum classes for type safety
build: integrate clang-tidy into CI
There were several usages of the raw sqlite primitives along with C
style casts, seemingly because nobody thought to use an optional for
getting a string or NULL.
Let's fix this API given we already *have* a wrapper.
Change-Id: I526cceedc2e356209d8fb62e11b3572282c314e8
This:
- Converts a bunch of C style casts into C++ casts.
- Removes some very silly pointer subtraction code (which is no more or
less busted on i686 than it began)
- Fixes some "technically UB" that never had to be UB in the first
place.
- Makes finally follow the noexcept status of the inner function. Maybe
in the future we should ban the function from not being noexcept, but
that is not today.
- Makes various locally-used exceptions inherit from std::exception.
Change-Id: I22e66972602604989b5e494fd940b93e0e6e9297
This has been causing various seemingly spurious CI failures as well as
some failures on people running tests on beta builds.
lix> ++(nix-collect-garbage-dry-run.sh:20) nix-store --gc --print-dead
lix> ++(nix-collect-garbage-dry-run.sh:20) wc -l
lix> finding garbage collector roots...
lix> error: Listing pid 87261 file descriptors: Undefined error: 0
There is no real way to write a proper test for this, other than to
start a process like the following:
int main(void) {
for (int i = 0; i < 1000; ++i) {
close(i);
}
sleep(10000);
}
and then let Lix's gc look at it.
I have a relatively high confidence this *will* fix the problem since I
have manually confirmed the behaviour of the libproc call is
as-unexpected, and it would perfectly explain the observed symptom.
Fixes: #446
Change-Id: I67669b98377af17895644b3bafdf42fc33abd076
* changes:
tree-wide: fix various lint warnings
flake & doxygen: update tagline
nix flake metadata: print modified dates for input flakes
cli: eat terminal codes from stdout also
Implement forcing CLI colour on, and document it better
manual: fix a syntax error in redirects.js that made it not do anything
misc docs/meson tidying
build: implement clang-tidy using our plugin
The growth of the seccomp filter in 127ee1a101
made its compilation time significant (roughly 10 milliseconds have been
measured on one machine). For this reason, it is now precompiled and cached in
the parent process so that this overhead is not hit for every single build. It
is still not optimal when going through the daemon, because compilation still
happens once per client, but it's better than before and doing it only once for
the entire daemon requires excessive crimes with the current architecture.
Fixes: #461
Change-Id: I2277eaaf6bab9bd74bbbfd9861e52392a54b61a3
This is a preparation for precompiling the filter, which is done separately.
The behaviour should be unchanged for now.
Change-Id: I899aa7242962615949208597aca88913feba1cb8
The seccomp setup code was a huge chunk of conditionally compiled
platform-specific code. For this reason, it is appropriate to move it to the
platform-specific implementation file. Ideally its setup could be moved a bit
to make it happen at the same place as the Darwin restrictions, but that change
is going to be less mechanical.
Change-Id: I496aa3c4fabf34656aba1e32b0089044ab5b99f8
this begins a long and arduous journey to remove all result state from
Goal, to eventually drop the std::enable_shared_from_this base, and to
completely eliminate all unsynchronized modification of states of both
Goal and Worker. by the end of this we will hopefully be able to start
and reap multiple derivation builds in parallel, which should speed up
the process quite a bit (at least for short local builds, others might
not notice a large difference. the build hooks will remain a problem.)
Change-Id: I57dcd9b2cab4636ed4aa24cdec67124fef883345
In the SSH code, the logger was conditionally paused, but unconditionally
resumed. This was fine as long as resuming the logger was idempotent. Starting
with 0dd1d8ca1c, it isn't any more, and the
behaviour of the code in question was missed. Consequently, an assertion
failure is triggered for example when performing builds against an "SSH" store
on localhost. Fix the issue by only resuming the logger when it has actually
been paused.
Fixes: #458
Change-Id: Ib1e4d047744a129f15730b7216f9c9368c2f4211
we still mutate goal state to store the results of any given goal run,
but now we also have that information in Worker and could in theory do
something else with it. we could return a map of goal to goal results,
which would also let us better diagnose failures of subgoals (at all).
Change-Id: I1df956bbd9fa8cc9485fb6df32918d68dda3ff48
this is the first step towards removing all result-related mutation of
Goal state from goal implementations themselves, and into Worker state
instead. once that is done we can treat all non-const Goal fields like
private state of the goal itself, and make threading of goals possible
Change-Id: I69ff7d02a6fd91a65887c6640bfc4f5fb785b45c
once goals run on multiple threads these fields must by synchronized as
one, or we try to run build hooks to often (or worse, not often enough)
Change-Id: I47860e46fe5c6db41755b2a3a1d9dbb5701c4ca4
there are no other uses for this yet, but asking for just a subset of
outputs does seem at least somewhat useful to have as a generic thing
Change-Id: I30ff5055a666c351b1b086b8d05b9d7c9fb1c77a
limiting CA substitutions was a rather recent addition, and it used a
dedicated counter to not interfere with regular substitutions. though
this works fine it somewhat contradicts the documentation; job limits
should apply to all kinds of substitutions, or be one limit for each.
Change-Id: I1505105b14260ecc1784039b2cc4b7afcf9115c8
all goals do this. it makes no sense to not notify a goal of EOF
conditions because this is the universal signal for "child done"
Change-Id: Ic3980de312547e616739c57c6248a8e81308b5ee
just update progress every time a goal has returned from work(). there
seem to be no performance penalties, and the code is much simpler now.
Change-Id: I288ee568b764ee61f40a498d986afda49987cb50
bindPath/doBind is a useful function in build that is used in several
parts of LocalDerivationGoal. Moving this function makes it easier to
split LocalDerivationGoal implementation between several files.
Change-Id: Ic5a0768479c153c1aa3ed425f12604b20bbf0f42
Unfortunately, io_uring is totally opaque to seccomp, and while currently there
are no dangerous operations implemented, there is no guarantee that it remains
this way. This means that io_uring should be blocked entirely to ensure that
the sandbox is future-proof. This has not been observed to cause issues in
practice.
Change-Id: I45d3895f95abe1bc103a63969f444c334dbbf50d
Previously, system call filtering (to prevent builders from storing files with
setuid/setgid permission bits or extended attributes) was performed using a
blocklist. While this looks simple at first, it actually carries significant
security and maintainability risks: after all, the kernel may add new syscalls
to achieve the same functionality one is trying to block, and it can even be
hard to actually add the syscall to the blocklist when building against a C
library that doesn't know about it yet. For a recent demonstration of this
happening in practice to Nix, see the introduction of fchmodat2 [0] [1].
The allowlist approach does not share the same drawback. While it does require
a rather large list of harmless syscalls to be maintained in the codebase,
failing to update this list (and roll out the update to all users) in time has
rather benign effects; at worst, very recent programs that already rely on new
syscalls will fail with an error the same way they would on a slightly older
kernel that doesn't support them yet. Most importantly, no unintended new ways
of performing dangerous operations will be silently allowed.
Another possible drawback is reduced system call performance due to the larger
filter created by the allowlist requiring more computation [2]. However, this
issue has not convincingly been demonstrated yet in practice, for example in
systemd or various browsers. To the contrary, it has been measured that the the
actual filter constructed here has approximately the same overhead as a very
simple filter blocking only one system call.
This commit tries to keep the behavior as close to unchanged as possible. The
system call list is in line with libseccomp 2.5.5 and glibc 2.39, which are the
latest versions at the point of writing. Since libseccomp 2.5.5 is already a
requirement and the distributions shipping this together with older versions of
glibc are mostly not a thing any more, this should not lead to more build
failures any more.
[0] https://github.com/NixOS/nixpkgs/issues/300635
[1] https://github.com/NixOS/nix/issues/10424
[2] https://github.com/flatpak/flatpak/pull/4462#issuecomment-1061690607
Change-Id: I541be3ea9b249bcceddfed6a5a13ac10b11e16ad
In f047e4357b, I missed the behavior that if
building without a dedicated build user (i.e. in single-user setups), seccomp
setup failures are silently ignored. This was introduced without explanation 7
years ago (ff6becafa8). Hopefully the only
use-case nowadays is causing spurious test suite successes when messing up the
seccomp filter during development. Let's try removing it.
Change-Id: Ibe51416d9c7a6dd635c2282990224861adf1ceab
Use libprocstat to find garbage collector roots on FreeBSD.
Tested working on a FreeBSD machine, although there is no CI yet
Change-Id: Id36bac8c3de6cc4de94e2d76e9663dd4b76068a9
Error is pretty large, and most goals do not fail. this alone more than
halves the size of Goal on x86_64-linux, from 720 bytes down to 344. in
derived classes the difference is not as dramatic, but even the largest
derived class (`LocalDerivationGoal`) loses almost 20% of its footprint
Change-Id: Ifda8f94c81b6566eeb3e52d55d9796ec40c7bce8
the goals are either already using std::async and merely forgot to
remove std::thread vestiges or they emulate async with threads and
promises. we can simply use async directly everywhere for clarity.
Change-Id: I3f05098310a25984f10fff1e68c573329002b500
under owner_less it's equivalent to insert(), only sometimes a little
bit faster because it does not construct a weak_ptr if the goal is in
the set already. this small difference in performance does not matter
here and c++23 will make insert transparent anyway, so we can drop it
Change-Id: I7cbd7d6e0daa95d67145ec58183162f6c4743b15
*accidentally* overriding a function is almost guaranteed to be an
error. overriding a function without labeling it as such is merely
bad style, but bad style that makes the code harder to understand.
Change-Id: Ic0594f3d1604ab6b3c1a75cb5facc246effe45f0
Due to a leftover from a previous version where the buffer was allocated on the
stack, the change introduced in commit 4ec87742a1
accidentally passes the size of a pointer as the size of the buffer to the
decompressor. Since the former is much smaller (usually 8 bytes instead of 64
kilobytes), this is safe, but leads to considerable overhead; most notably, due
to excessive progress reports, which happen for each chunk. Pass the proper
buffer size instead.
Change-Id: If4bf472d33e21587acb5235a2d99e3cb10914633
SimpleLogger is not fully thread-safe, and all loggers that wrap it are
also not safe accordingly. this does not affect much, but in rare cases
it can cause interleaving of messages on stderr when used with the json
or raw log formats. the fix applied here is a bit of a hack, but fixing
this properly requires rearchitecting the logger infrastructure. nested
loggers are not the most natural abstraction here, and it is biting us.
Change-Id: Ifbf34fe1e85c60e73b59faee50e7411c7b5e7c12
If useChroot = false, and user namespaces aren't available for some
reason (e.g. within a Docker container), this fixes a pointless warning
being emitted, as we would never attempt to use them even if they were
available.
Change-Id: Ibcee91c088edd2cd19e70218d5a5802bff8f537b
This removes a *whole load* of variables from scope and enforces thread
boundaries with the type system.
There is not much change of significance in here, so the things to watch
out for while reviewing it are primarily that the destructor ordering
may have changed inadvertently, I think.
Change-Id: I3cd87e6d5a08dfcf368637407251db22a8906316
this is cursed. deeply and profoundly cursed. under NO CIRCUMSTANCES
must protocol serializer helpers be applied to temporaries! doing so
will inevitably cause dangling references and cause the entire thing
to crash. we need to do this even so to get rid of boost coroutines,
and likewise to encapsulate the serializers we suffer today at least
a little bit to allow a gradual migration to an actual IPC protocol.
(this isn't a problem that's unique to generators. c++ coroutines in
general cannot safely take references to arbitrary temporaries since
c++ does not have a lifetime system that can make this safe. -sigh-)
Change-Id: I2921ba451e04d86798752d140885d3c5cc08e146
this doesn't have a test because this code path is only reached by
clients that predate 2.4, and we really should not be caring about
those any more right now. even the test suite doesn't, and the few
tests that might care are disabled because they will not even work
Change-Id: Id9eb190065138fedb2c7d90c328ff9eb9d97385b
This also bans various sneaking of negative numbers from the language
into unsuspecting builtins as was exposed while auditing the
consequences of changing the Nix language integer type to a newtype.
It's unlikely that this change comprehensively ensures correctness when
passing integers out of the Nix language and we should probably add a
checked-narrowing function or something similar, but that's out of scope
for the immediate change.
During the development of this I found a few fun facts about the
language:
- You could overflow integers by converting from unsigned JSON values.
- You could overflow unsigned integers by converting negative numbers
into them when going into Nix config, into fetchTree, and into flake
inputs.
The flake inputs and Nix config cannot actually be tested properly
since they both ban thunks, however, we put in checks anyway because
it's possible these could somehow be used to do such shenanigans some
other way.
Note that Lix has banned Nix language integer overflows since the very
first public beta, but threw a SIGILL about them because we run with
-fsanitize=signed-overflow -fsanitize-undefined-trap-on-error in
production builds. Since the Nix language uses signed integers, overflow
was simply undefined behaviour, and since we defined that to trap, it
did.
Trapping on it was a bad UX, but we didn't even entirely notice
that we had done this at all until it was reported as a bug a couple of
months later (which is, to be fair, that flag working as intended), and
it's got enough production time that, aside from code that is IMHO buggy
(and which is, in any case, not in nixpkgs) such as
#445, we don't think
anyone doing anything reasonable actually depends on wrapping overflow.
Even for weird use cases such as doing funny bit crimes, it doesn't make
sense IMO to have wrapping behaviour, since two's complement arithmetic
overflow behaviour is so *aggressively* not what you want for *any* kind
of mathematics/algorithms. The Nix language exists for package
management, a domain where bit crimes are already only dubiously in
scope to begin with, and it makes a lot more sense for that domain for
the integers to never lose precision, either by throwing errors if they
would, or by being arbitrary-precision.
This change will be ported to CppNix as well, to maintain language
consistency.
Fixes: #423
Change-Id: I51f253840c4af2ea5422b8a420aa5fafbf8fae75
The actual motive here is the avoidance of integer overflow if we were
to make these use checked NixInts and retain the subtraction.
However, the actual *intent* of this code is a three-way comparison,
which can be done with operator<=>, so we should just do *that* instead.
Change-Id: I7f9a7da1f3176424b528af6d1b4f1591e4ab26bf
upcast_goal was only ever needed to break circular includes, but the
same solution that gave us upcast_goal also lets us fully remove it:
just upcast goals without a wrapper function, but only in .cc files.
Change-Id: I9c71654b2535121459ba7dcfd6c5da5606904032
the sole remaining user of this function can use makeDecompressionSource
instead, while making the sinkToSource in the caller unnecessary as well
Change-Id: I4258227b5dbbb735a75b477d8a57007bfca305e9
the rewriting sink was just broken. when given a rewrite set that
contained a key that is also a proper infix of another key it was
possible to produce an incorrectly rewritten result if the writer
used the wrong block size. fixing this duplicates rewriteStrings,
to avoid this we'll rewrite rewriteStrings to use RewritingSource
in a new mode that'll allow rewrites we had previously forbidden.
Change-Id: I57fa0a9a994e654e11d07172b8e31d15f0b7e8c0
This rather simple function existed just to check some flags,
but the response varies by platform. This is a perfect case for
our subclasses.
Change-Id: Ieb1732a8d024019236e0d0028ad843a24ec3dc59
this much more closely mimics what is actually happening: we're reading
data from somewhere else, actively, rather than passively waiting. with
the data flow matching the underlying system interactions better we can
remove a few sinkToSource calls that merely exists to undo the mismatch
caused by not treating subprocess output as a data source to begin with
Change-Id: If4abfc2f8398fb5e88c9b91a8bdefd5504bb2d11
this will let us also return a source for the program output later,
which will in turn make sinkToSource unnecessary for program output
processing. this may also reopen a path for provigin program input,
but that still needs a proper async io framework to avoid problems.
Change-Id: Iaf93f47db99c38cfaf134bd60ed6a804d7ddf688