it just makes sense to have it too, rather than just the pass/fail
information we keep so far. once we turn goals into something more
promise-shaped it'll also help detangle the current data flow mess
Change-Id: I915cf04d177cad849ea7a5833215d795326f1946
it doesn't have a purpose except cache priming, which is largely
irrelevant by default (since another code path already runs this
exact query). our store implementations do not benefit that much
from this either, and the more bursty load may indeed harm them.
Change-Id: I1cc12f8c21cede42524317736d5987f1e43fc9c9
updating statistics *immediately* when any counter changes declutters
things somewhat and makes useful status reports less dependent on the
current worker main loop. using callbacks will make it easier to move
the worker loop into kj entirely, using only promises for scheduling.
Change-Id: I695dfa83111b1ec09b1a54cff268f3c1d7743ed6
there's no reason to go through the event loop in these cases. returning
ContinueImmediately here is just a very convoluted way of jumping to the
state we've just set after unwinding one frame of the stack, which never
matters in the cases changed here because there are no live RAII guards.
Change-Id: I7c00948c22e3caf35e934c1a14ffd2d40efc5547
this is not ideal, but it's better than having this stuck in the worker
loop itself. setting ex on all failing goals is not problematic because
only toplevel goals can ever be observable, all the others are ignored.
notably only derivation goals ever set `ex`, substitution goals do not.
Change-Id: I02e2164487b2955df053fef3c8e774d557aa638a
this doesn't serve a great purpose yet except to confine construction of
goals to the stack frame of Worker::run() and its child frames. we don't
need this yet (and the goal constructors remain fully visible), but in a
future change that fully removes the current worker loop we'll need some
way of knowing which goals are top-level goals without passing the goals
themselves around. once that's possible we can remove visible goals as a
concept and rely on build result futures and a scheduler built upon them
Change-Id: Ia73cdeffcfb9ba1ce9d69b702dc0bc637a4c4ce6
whether goal errors are reported via the `ex` member or just printed to
the log depends on whether the goal is a toplevel goal or a dependency.
if goals are aware of this themselves we can move error printing out of
the worker loop, and since a running worker can only be used by running
goals it's totally sufficient to keep a `Worker::running` flag for this
Change-Id: I6b5cbe6eccee1afa5fde80653c4b968554ddd16f
Fixes:
- Identifiers starting with _ are prohibited
- Some driveby header dependency cleaning which wound up with doing some
extra fixups.
- Fucking C style casts, man. C++ made these 1000% worse by letting you
also do memory corruption with them with references.
- Remove casts to Expr * where ExprBlackHole is an incomplete type by
introducing an explicitly-cast eBlackHoleAddr as Expr *.
- An incredibly illegal cast of the text bytes of the StorePath hash
into a size_t directly. You can't DO THAT.
Replaced with actually parsing the hash so we get 100% of the bits
being entropy, then memcpying the start of the hash. If this shows
up in a profile we should just make the hash parser faster with a
lookup table or something sensible like that.
- This horrendous bit of UB which I thankfully slapped a deprecation
warning on, built, and it didn't trigger anywhere so it was dead
code and I just deleted it. But holy crap you *cannot* do that.
inline void mkString(const Symbol & s)
{
mkString(((const std::string &) s).c_str());
}
- Some wrong lints. Lots of wrong macro lints, one wrong
suspicious-sizeof lint triggered by the template being instantiated
with only pointers, but the calculation being correct for both
pointers and not-pointers.
- Exceptions in destructors strike again. I tried to catch the
exceptions that might actually happen rather than all the exceptions
imaginable. We can let the runtime hard-kill it on other exceptions
imo.
Change-Id: I71761620846cba64d66ee7ca231b20c061e69710
this makes WorkResult copyable, and just all around easier to deal with.
in the future we'll need this to let Goal::work() return a promise for a
WorkResult (or even just a Finished) that can be awaited by other goals.
Change-Id: Ic5a1ce04c5a0f8e683bd00a2ed2b77a2e28989c1
this should be done where we're actually trying to build something, not
in the main worker loop that shouldn't have to be aware of such details
Change-Id: I07276740c0e2e5591a8ce4828a4bfc705396527e
This caused an absolute saga which I would not like anyone else to have
to experience. Let's put in a laser targeted error message that
diagnoses this exact problem.
Fixes: lix-project/lix#484
Change-Id: I2a79f04aeb4a1b67c10115e5e39501d958836298
There have been multiple setting types for paths that are supposed to be
canonicalised, depending on whether zero or one, one, or any number of paths is
to be specified. Naturally, they behaved in slightly different ways in the
code. Simplify things by unifying them and removing special behaviour (mainly
the "multiple paths type can coerce to boolean" thing).
Change-Id: I7c1ce95e9c8e1829a866fb37d679e167811e9705
this can be a proper WorkResult now. childTerminated is unfortunately a
lot more stubborn and won't be made private for quite a while yet. once
we can get rid of the Worker poll loop that *should* be possible though
Change-Id: I2218df202da5cb84e852f6a37e4c20367495b617
we'll need this once we want to pass extra information out of accepting
replies, such as fd sets or possibly even async output reader promises.
Change-Id: I5e2f18cdb80b0d2faf3067703cc18bd263329b3f
don't keep fds open we're not using. currently this does not cause any
problems, but it does increase the size of our fd table needlessly and
in the future, when we have proper async processing, having builderOut
open in the daemon once the hook has been fully started is problematic
Change-Id: I6e7fb773b280b042873103638d3e04272ca1e4fc
this is useless to do on the face of it, but it'll make it easier to
convert the entire output handling to use async io and promises soon
Change-Id: I2d1eb62c4bbf8f57bd558b9599c08710a389b1a8
only DerivationGoal can set the hook to anything at all. it always sets
buildOutFD to something that is not related to fromHook in any way, and
mixing the two would have rather dire consequences for log consistency.
Change-Id: Ida86727fd1cd5e1ecd78f07f3bde330a346658a8
all derivation goals need a log fd of some description. let's save this
single fd in a dedicated pointer field for all subclasses so that later
we have just the one spot to change if we turn this into async promises
Change-Id: If223adf90909247363fb823d751cae34d25d0c0b
we don't need to expose information about how busy a Worker is if the
worker can instead tell its work items whether they are in a slot. in
the future we might use this to not start items waiting for a slot if
no slots are currently available, but that requires more preparation.
Change-Id: Ibe01ac536da7e6d6f80520164117c43e772f9bd9
this is only used to close non-stdio files in derivation sandboxes. we
may as well encode that in its name, drop the unnecessary integer set,
and use close_range to deal with the actual closing of files. not only
is this clearer, it also makes sandbox setup on linux fast by 1ms each
Change-Id: Id90e259a49c7bc896189e76bfbbf6ef2c0bcd3b2
implementing a build hook is pretty much impossible without either being
a nix, or blindly forwarding the important bits of all build requests to
some kind of nix. we've found no uses of build-hook in the wild, and the
build-hook protocol (apart from being entirely undocumented) is not able
to convey any kind of versioning information between hook and daemon. if
we want to upgrade this infrastructure (which we do), this must not stay
Change-Id: I1ec4976a35adf8105b8ca9240b7984f8b91e147e
* changes:
sqlite: add a Use::fromStrNullable
util: implement charptr_cast
tree-wide: fix a pile of lints
refactor: make HashType and Base enum classes for type safety
build: integrate clang-tidy into CI
There were several usages of the raw sqlite primitives along with C
style casts, seemingly because nobody thought to use an optional for
getting a string or NULL.
Let's fix this API given we already *have* a wrapper.
Change-Id: I526cceedc2e356209d8fb62e11b3572282c314e8
This:
- Converts a bunch of C style casts into C++ casts.
- Removes some very silly pointer subtraction code (which is no more or
less busted on i686 than it began)
- Fixes some "technically UB" that never had to be UB in the first
place.
- Makes finally follow the noexcept status of the inner function. Maybe
in the future we should ban the function from not being noexcept, but
that is not today.
- Makes various locally-used exceptions inherit from std::exception.
Change-Id: I22e66972602604989b5e494fd940b93e0e6e9297
This has been causing various seemingly spurious CI failures as well as
some failures on people running tests on beta builds.
lix> ++(nix-collect-garbage-dry-run.sh:20) nix-store --gc --print-dead
lix> ++(nix-collect-garbage-dry-run.sh:20) wc -l
lix> finding garbage collector roots...
lix> error: Listing pid 87261 file descriptors: Undefined error: 0
There is no real way to write a proper test for this, other than to
start a process like the following:
int main(void) {
for (int i = 0; i < 1000; ++i) {
close(i);
}
sleep(10000);
}
and then let Lix's gc look at it.
I have a relatively high confidence this *will* fix the problem since I
have manually confirmed the behaviour of the libproc call is
as-unexpected, and it would perfectly explain the observed symptom.
Fixes: lix-project/lix#446
Change-Id: I67669b98377af17895644b3bafdf42fc33abd076
* changes:
tree-wide: fix various lint warnings
flake & doxygen: update tagline
nix flake metadata: print modified dates for input flakes
cli: eat terminal codes from stdout also
Implement forcing CLI colour on, and document it better
manual: fix a syntax error in redirects.js that made it not do anything
misc docs/meson tidying
build: implement clang-tidy using our plugin
The growth of the seccomp filter in 127ee1a101
made its compilation time significant (roughly 10 milliseconds have been
measured on one machine). For this reason, it is now precompiled and cached in
the parent process so that this overhead is not hit for every single build. It
is still not optimal when going through the daemon, because compilation still
happens once per client, but it's better than before and doing it only once for
the entire daemon requires excessive crimes with the current architecture.
Fixes: lix-project/lix#461
Change-Id: I2277eaaf6bab9bd74bbbfd9861e52392a54b61a3
This is a preparation for precompiling the filter, which is done separately.
The behaviour should be unchanged for now.
Change-Id: I899aa7242962615949208597aca88913feba1cb8
The seccomp setup code was a huge chunk of conditionally compiled
platform-specific code. For this reason, it is appropriate to move it to the
platform-specific implementation file. Ideally its setup could be moved a bit
to make it happen at the same place as the Darwin restrictions, but that change
is going to be less mechanical.
Change-Id: I496aa3c4fabf34656aba1e32b0089044ab5b99f8
this begins a long and arduous journey to remove all result state from
Goal, to eventually drop the std::enable_shared_from_this base, and to
completely eliminate all unsynchronized modification of states of both
Goal and Worker. by the end of this we will hopefully be able to start
and reap multiple derivation builds in parallel, which should speed up
the process quite a bit (at least for short local builds, others might
not notice a large difference. the build hooks will remain a problem.)
Change-Id: I57dcd9b2cab4636ed4aa24cdec67124fef883345
In the SSH code, the logger was conditionally paused, but unconditionally
resumed. This was fine as long as resuming the logger was idempotent. Starting
with 0dd1d8ca1c, it isn't any more, and the
behaviour of the code in question was missed. Consequently, an assertion
failure is triggered for example when performing builds against an "SSH" store
on localhost. Fix the issue by only resuming the logger when it has actually
been paused.
Fixes: lix-project/lix#458
Change-Id: Ib1e4d047744a129f15730b7216f9c9368c2f4211
we still mutate goal state to store the results of any given goal run,
but now we also have that information in Worker and could in theory do
something else with it. we could return a map of goal to goal results,
which would also let us better diagnose failures of subgoals (at all).
Change-Id: I1df956bbd9fa8cc9485fb6df32918d68dda3ff48
this is the first step towards removing all result-related mutation of
Goal state from goal implementations themselves, and into Worker state
instead. once that is done we can treat all non-const Goal fields like
private state of the goal itself, and make threading of goals possible
Change-Id: I69ff7d02a6fd91a65887c6640bfc4f5fb785b45c
once goals run on multiple threads these fields must by synchronized as
one, or we try to run build hooks to often (or worse, not often enough)
Change-Id: I47860e46fe5c6db41755b2a3a1d9dbb5701c4ca4
there are no other uses for this yet, but asking for just a subset of
outputs does seem at least somewhat useful to have as a generic thing
Change-Id: I30ff5055a666c351b1b086b8d05b9d7c9fb1c77a
limiting CA substitutions was a rather recent addition, and it used a
dedicated counter to not interfere with regular substitutions. though
this works fine it somewhat contradicts the documentation; job limits
should apply to all kinds of substitutions, or be one limit for each.
Change-Id: I1505105b14260ecc1784039b2cc4b7afcf9115c8
all goals do this. it makes no sense to not notify a goal of EOF
conditions because this is the universal signal for "child done"
Change-Id: Ic3980de312547e616739c57c6248a8e81308b5ee
just update progress every time a goal has returned from work(). there
seem to be no performance penalties, and the code is much simpler now.
Change-Id: I288ee568b764ee61f40a498d986afda49987cb50
bindPath/doBind is a useful function in build that is used in several
parts of LocalDerivationGoal. Moving this function makes it easier to
split LocalDerivationGoal implementation between several files.
Change-Id: Ic5a0768479c153c1aa3ed425f12604b20bbf0f42
Unfortunately, io_uring is totally opaque to seccomp, and while currently there
are no dangerous operations implemented, there is no guarantee that it remains
this way. This means that io_uring should be blocked entirely to ensure that
the sandbox is future-proof. This has not been observed to cause issues in
practice.
Change-Id: I45d3895f95abe1bc103a63969f444c334dbbf50d
Previously, system call filtering (to prevent builders from storing files with
setuid/setgid permission bits or extended attributes) was performed using a
blocklist. While this looks simple at first, it actually carries significant
security and maintainability risks: after all, the kernel may add new syscalls
to achieve the same functionality one is trying to block, and it can even be
hard to actually add the syscall to the blocklist when building against a C
library that doesn't know about it yet. For a recent demonstration of this
happening in practice to Nix, see the introduction of fchmodat2 [0] [1].
The allowlist approach does not share the same drawback. While it does require
a rather large list of harmless syscalls to be maintained in the codebase,
failing to update this list (and roll out the update to all users) in time has
rather benign effects; at worst, very recent programs that already rely on new
syscalls will fail with an error the same way they would on a slightly older
kernel that doesn't support them yet. Most importantly, no unintended new ways
of performing dangerous operations will be silently allowed.
Another possible drawback is reduced system call performance due to the larger
filter created by the allowlist requiring more computation [2]. However, this
issue has not convincingly been demonstrated yet in practice, for example in
systemd or various browsers. To the contrary, it has been measured that the the
actual filter constructed here has approximately the same overhead as a very
simple filter blocking only one system call.
This commit tries to keep the behavior as close to unchanged as possible. The
system call list is in line with libseccomp 2.5.5 and glibc 2.39, which are the
latest versions at the point of writing. Since libseccomp 2.5.5 is already a
requirement and the distributions shipping this together with older versions of
glibc are mostly not a thing any more, this should not lead to more build
failures any more.
[0] https://github.com/NixOS/nixpkgs/issues/300635
[1] https://github.com/NixOS/nix/issues/10424
[2] https://github.com/flatpak/flatpak/pull/4462#issuecomment-1061690607
Change-Id: I541be3ea9b249bcceddfed6a5a13ac10b11e16ad
In f047e4357b, I missed the behavior that if
building without a dedicated build user (i.e. in single-user setups), seccomp
setup failures are silently ignored. This was introduced without explanation 7
years ago (ff6becafa8). Hopefully the only
use-case nowadays is causing spurious test suite successes when messing up the
seccomp filter during development. Let's try removing it.
Change-Id: Ibe51416d9c7a6dd635c2282990224861adf1ceab
Use libprocstat to find garbage collector roots on FreeBSD.
Tested working on a FreeBSD machine, although there is no CI yet
Change-Id: Id36bac8c3de6cc4de94e2d76e9663dd4b76068a9
Error is pretty large, and most goals do not fail. this alone more than
halves the size of Goal on x86_64-linux, from 720 bytes down to 344. in
derived classes the difference is not as dramatic, but even the largest
derived class (`LocalDerivationGoal`) loses almost 20% of its footprint
Change-Id: Ifda8f94c81b6566eeb3e52d55d9796ec40c7bce8
the goals are either already using std::async and merely forgot to
remove std::thread vestiges or they emulate async with threads and
promises. we can simply use async directly everywhere for clarity.
Change-Id: I3f05098310a25984f10fff1e68c573329002b500
under owner_less it's equivalent to insert(), only sometimes a little
bit faster because it does not construct a weak_ptr if the goal is in
the set already. this small difference in performance does not matter
here and c++23 will make insert transparent anyway, so we can drop it
Change-Id: I7cbd7d6e0daa95d67145ec58183162f6c4743b15
*accidentally* overriding a function is almost guaranteed to be an
error. overriding a function without labeling it as such is merely
bad style, but bad style that makes the code harder to understand.
Change-Id: Ic0594f3d1604ab6b3c1a75cb5facc246effe45f0
Due to a leftover from a previous version where the buffer was allocated on the
stack, the change introduced in commit 4ec87742a1
accidentally passes the size of a pointer as the size of the buffer to the
decompressor. Since the former is much smaller (usually 8 bytes instead of 64
kilobytes), this is safe, but leads to considerable overhead; most notably, due
to excessive progress reports, which happen for each chunk. Pass the proper
buffer size instead.
Change-Id: If4bf472d33e21587acb5235a2d99e3cb10914633
SimpleLogger is not fully thread-safe, and all loggers that wrap it are
also not safe accordingly. this does not affect much, but in rare cases
it can cause interleaving of messages on stderr when used with the json
or raw log formats. the fix applied here is a bit of a hack, but fixing
this properly requires rearchitecting the logger infrastructure. nested
loggers are not the most natural abstraction here, and it is biting us.
Change-Id: Ifbf34fe1e85c60e73b59faee50e7411c7b5e7c12
If useChroot = false, and user namespaces aren't available for some
reason (e.g. within a Docker container), this fixes a pointless warning
being emitted, as we would never attempt to use them even if they were
available.
Change-Id: Ibcee91c088edd2cd19e70218d5a5802bff8f537b
This removes a *whole load* of variables from scope and enforces thread
boundaries with the type system.
There is not much change of significance in here, so the things to watch
out for while reviewing it are primarily that the destructor ordering
may have changed inadvertently, I think.
Change-Id: I3cd87e6d5a08dfcf368637407251db22a8906316
this is cursed. deeply and profoundly cursed. under NO CIRCUMSTANCES
must protocol serializer helpers be applied to temporaries! doing so
will inevitably cause dangling references and cause the entire thing
to crash. we need to do this even so to get rid of boost coroutines,
and likewise to encapsulate the serializers we suffer today at least
a little bit to allow a gradual migration to an actual IPC protocol.
(this isn't a problem that's unique to generators. c++ coroutines in
general cannot safely take references to arbitrary temporaries since
c++ does not have a lifetime system that can make this safe. -sigh-)
Change-Id: I2921ba451e04d86798752d140885d3c5cc08e146
this doesn't have a test because this code path is only reached by
clients that predate 2.4, and we really should not be caring about
those any more right now. even the test suite doesn't, and the few
tests that might care are disabled because they will not even work
Change-Id: Id9eb190065138fedb2c7d90c328ff9eb9d97385b