Commit graph

4321 commits

Author SHA1 Message Date
7ff60b7445 libstore: move Goal::exitCode to WorkResult
the field is simply duplicated between the two, and now that we can
return WorkResults from Worker::run we no longer need both of them.

Change-Id: I82fc47d050b39b7bb7d1656445630d271f6c9830
2024-10-05 20:17:20 +00:00
fc6291e46d libstore: return goal results from Worker::run()
this will be needed to move all interesting result fields out of Goal
proper and into WorkResult. once that is done we can treat goals as a
totally internal construct of the worker mechanism, which also allows
us to fully stop exposing unclear intermediate state to Worker users.

Change-Id: I98d7778a4b5b2590b7b070bdfc164a22a0ef7190
2024-10-05 20:12:13 +00:00
40f154c0ed libstore: remove Worker::topGoals
since we now propagate goal exceptions properly we no longer need to
check topGoals for a reason to abort early. any early abort reasons,
whether by exception or a clean top goal failure, can now be handled
by inspecting the goal result in the main loop. this greatly reduces
goal-to-goal interactions that do not happen at the main loop level.

since the underscore-free name is now available for use as variables
we'll migrate to that where we currently use `_topGoals` for locals.

Change-Id: I5727c5ea7799647c0a69ab76975b1a03a6558aa6
2024-10-05 19:53:30 +00:00
f389a54079 libstore: propagate goal exceptions using promises
drop childException since it's no longer needed. also makes
waitForInput, childFinished, and childTerminated redundant.

Change-Id: I05d88ffd323c5b5c909ac21056162f69ffb0eb9f
2024-10-05 19:44:47 +00:00
7ef4466018 libstore: have goals promise WorkResults, not void
Change-Id: Idd218ec1572eda84dc47accc0dcd8a954d36f098
2024-10-05 19:06:59 +00:00
a9f2aab226 libstore: extract Worker::goalFinished specifics
there's no reason to have the worker set information on goals that the
goals themselves return from their entry point. doing this in the goal
`work()` function is much cleaner, and a prerequisite to removing more
implicit strong shared references to goals that are currently running.

Change-Id: Ibb3e953ab8482a6a21ce2ed659d5023a991e7923
2024-10-05 19:06:59 +00:00
99edc2ae38 libstore: check for interrupts in parallel promise
this simplifies the worker loop, and lets us remove it entirely later.
note that ideally only one promise waiting for interrupts should exist
in the entire system. not one per event loop, one per *process*. extra
interrupt waiters make interrupt response nondeterministic and as such
aren't great for user experience. if anything wants to react to aborts
caused by explicit interruptions, or anything else, those things would
be much better served using RAII guards such as Finally (or KJ_DEFER).

Change-Id: I41d035ff40172d536e098153c7375b0972110d51
2024-10-05 19:06:59 +00:00
896a123605 libstore: remove Goal::StillAlive
this was a triumph. i'm making a note here: huge success. it's hard to
overstate my satisfaction! i'm not even angry. i'm being so sincere ri

actually, no. we *are* angry. this was one dumbass odyssey. nobody has
asked for this. but not doing it would have locked us into old, broken
protocols forever or (possibly worse) forced us to write our own async
framework building on the old did-you-mean-continuations in Worker. if
we had done that we'd be locked into ever more, and ever more complex,
manual state management all over the place. this just could not stand.

Change-Id: I43a6de1035febff59d2eff83be9ad52af4659871
2024-10-05 18:21:02 +00:00
86b213e632 Merge "Split ignoreException to avoid suppressing CTRL-C" into main 2024-10-05 17:33:00 +00:00
5b1715e633 libstore: forbid addWantedGoals when finished
due to event loop scheduling behavior it's possible for a derivation
goal to fully finish (having seen all paths it was asked to create),
but to not notify the worker of this in time to prevent another goal
asking the recently-finished goal for more outputs. if this happened
the finished goal would ignore the request for more outputs since it
considered itself fully done, and the delayed result reporting would
cause the requesting goal to assume its request had been honored. if
the requested goal had finished *properly* the worker would recreate
it instead of asking for more outputs, and this would succeed. it is
thus safe to always recreate goals once they are done, so we now do.

Change-Id: Ifedd69ca153372c623abe9a9b49cd1523588814f
2024-10-04 17:49:57 +00:00
ee0c195eba
Split ignoreException to avoid suppressing CTRL-C
This splits `ignoreException` into `ignoreExceptionExceptInterrupt`
(which ignores all exceptions except `Interrupt`, which indicates a
SIGINT/CTRL-C) and `ignoreExceptionInDestructor` (which ignores all
exceptions, so that destructors do not throw exceptions).

This prevents many cases where Nix ignores CTRL-C entirely.
See: https://github.com/NixOS/nix/issues/7245

Upstream-PR: https://github.com/NixOS/nix/pull/11618
Change-Id: Ie7d2467eedbe840d1b9fa2e88a4e88e4ab26a87b
2024-10-01 15:49:56 -07:00
7752927660 libstore: turn DerivationGoal::work into *one* promise
Change-Id: Ic2f7bc2bd6a1879ad614e4be81a7214f64eb0e85
2024-10-01 11:55:47 +00:00
3edc272341 libstore: turn DrvOutputSubstitutionGoal::work into *one* promise
Change-Id: I2d4dcedff0a278d2d8f3d264a9186dfb399275e2
2024-10-01 11:55:42 +00:00
9b05636937 libstore: make PathSubstitutionGoal::work *one* promise
Change-Id: I38cfe8c7059251b581f1013c4213804f36b985ea
2024-10-01 11:55:36 +00:00
9889c79fe3 libstore: turn Worker::updateStatistics into a promise
we'll now loop to update displayed statistics, and use this loop to
limit the update rate to 50 times per second. we could have updated
much more frequently before this (once per iteration of `runImpl`),
much faster than would ever be useful in practice. aggressive stats
updates can even impede progress due to terminal or network delays.

Change-Id: Ifba755a2569f73c919b1fbb06a142c0951395d6d
2024-10-01 11:55:29 +00:00
732de75f67 libstore: remove Worker::wakeUp()
Worker::run() is now entirely based on the kj event loop and promises,
so we need not handle awakeness of goals manually any more. every goal
can instead, once it has finished a partial work call, defer itself to
being called again in the next iteration of the loop. same end effect.

Change-Id: I320eee2fa60bcebaabd74d1323fa96d1402c1d15
2024-10-01 13:55:03 +02:00
d5db0b1abc libstore: turn periodic gc attempt into a promise
notably we will check whether we want to do GC at all only once during
startup, and we'll only attempt GC every ten seconds rather than every
time a goal has finished a partial work call. this shouldn't cause any
problems in practice since relying on auto-gc is not deterministic and
stores in which builds can fill all remaining free space in merely ten
seconds are severely troubled even when gargage collection runs a lot.

Change-Id: I1175a56bf7f4e531f8be90157ad88750ff2ddec4
2024-10-01 11:36:45 +00:00
b0c7c1ec66 libstore: turn Worker::run() main loop into a promise
Change-Id: Ib112ea9a3e67d5cb3d7d0ded30bbd25c96262470
2024-10-01 11:36:45 +00:00
d31310bf59 libstore: turn waitForInput into a promise
Change-Id: I8355d8d3f6c43a812990c1912b048e5735b07f7b
2024-10-01 11:36:45 +00:00
8e05cc1e6c Revert "libstore: remove worker removeGoal"
Revert submission 1946

Reason for revert: regression in building (found via bisection)

Reported by users:
> error: path '/nix/store/04ca5xwvasz6s3jg0k7njz6rzi0d225w-jq-1.7.1-dev' does not exist in the store

Reverted changes: /q/submissionid:1946

Change-Id: I6f1a4b2f7d7ef5ca430e477fc32bca62fd97036b
2024-10-01 11:07:57 +00:00
aa33c34c9b libstore: merge ContinueImmediately and StillAlive
nothing needs to signal being still active but not actively pollable,
only that immediate polling for the next goal work phase is in order.

Change-Id: Ia43c1015e94ba4f5f6b9cb92943da608c4a01555
2024-09-29 15:29:56 +00:00
ccd2862666 libstore: remove worker removeGoal
this was immensely inefficient on large caches, as can exist when many
derivations are buildable simultaneously. since we have smart pointers
to goals we can do cache maintenance in goal deleters instead, and use
the exact iterators instead of doing a linear search. this *does* rely
on goals being deleted to remove them from the cache, which isn't true
for toplevel goals. those would have previously been removed when done
in all cases, removing the cache entry when keep-going is set. this is
arguably incorrect since it might result in those goals being retried,
although that could only happen with dynamic derivations or the likes.
(luckily dynamic derivations not complete enough to allow this at all)

Change-Id: I8e750b868393588c33e4829333d370f2c509ce99
2024-09-29 15:29:56 +00:00
47ddd11933 libstore: extract a real makeGoalCommon
makeDerivationGoalCommon had the right idea, but it didn't quite go far
enough. let's do the rest and remove the remaining factory duplication.

Change-Id: I1fe32446bdfb501e81df56226fd962f85720725b
2024-09-29 15:07:30 +00:00
7f4f86795c libstore: remove Goal::key
this was a debugging aid from day one that should not have any impact on
build semantics, and if it *does* have an impact on build semantics then
build semantics are seriously broken. keeping the order imposed by these
keys will be impossible once we let a real event loop schedule our jobs.

Change-Id: I5c313324e1f213ab6453d82f41ae5e59de809a5b
2024-09-29 14:29:14 +00:00
a5240b23ab libstore: make non-cache goal pointers strong
without circular references we do not need weak goal pointers except for
caches, which should not prevent goal destructors running. caches though
cannot create circular references even when they keep strong references.
if we removed goals from caches when their work() is fully finished, not
when their destructors are run, we could keep strong pointers in caches.
since we do not gain much from this we keep those pointers weak for now.

Change-Id: I1d4a6850ff5e264443c90eb4531da89f5e97a3a0
2024-09-29 14:29:14 +00:00
8fb642b6e0 libstore: remove Goal::WaitForWorld
have DerivationGoal and its subclasses produce a wrapper promise for
their intermediate results instead, and return this wrapper promise.
Worker already handles promises that do not complete immediately, so
we do not have to duplicate this into an entire result type variant.

Change-Id: Iae8dbf63cfc742afda4d415922a29ac5a3f39348
2024-09-29 14:29:14 +00:00
1a52e4f755 libstore: fix build tests
the new event loop could very occasionally notice that a dependency of
some goal has failed, process the failure, cause the depending goal to
fail accordingly, and in the doing of the latter two steps let further
dependencies that previously have not been reported as failed do their
reporting anyway. in such cases a goal could fail with "1 dependencies
failed", but more than one dependency failure message was shown. we'll
now report the correct number of failed dependency goals in all cases.

Change-Id: I5aa95dcb2db4de4fd5fee8acbf5db833531d81a8
2024-09-29 13:17:15 +00:00
3f7519526f libstore: have makeLocalDerivationGoal return unique_ptrs
these can be unique rather than shared because shared_ptr has a
converting constructor. preparatory refactor for something else
and not necessary on its own, and the extra allocations we must
do for shared_ptr control blocks isn't usually relevant anyway.

Change-Id: I5391715545240c6ec8e83a031206edafdfc6462f
2024-09-29 12:09:24 +00:00
ae5d8dae1b libstore: turn Goal::WaitForGoals into a promise
also gets rid of explicit strong references to dependencies of any goal,
and weak references to dependers as well. those are now only held within
promises representing goal completion and thus independent of the goal's
relation to each other. the weak references to dependers was only needed
for notifications, and that's much better handled entirely by kj itself.

Change-Id: I00d06df9090f8d6336ee4bb0c1313a7052fb016b
2024-09-27 16:40:27 +02:00
852da07b67 libstore: replace Goal::WaitForSlot with semaphores
now that we have an event loop in the worker we can use it and its
magical execution suspending properties to replace the slot counts
we managed explicitly with semaphores and raii tokens. technically
this would not have needed an event loop base to be doable, but it
is a whole lot easier to wait for a token to be available if there
is a callback mechanism ready for use that doesn't require a whole
damn dedicated abstract method in Goal to work, and specific calls
to that dedicated method strewn all over the worker implementation

Change-Id: I1da7cf386d94e2bbf2dba9b53ff51dbce6a0cff7
2024-09-27 16:40:27 +02:00
bf32085d63 libstore: simplify Worker::waitForInput
with waitForAWhile turned into promised the core functionality of
waitForInput is now merely to let gc run every so often if needed

Change-Id: I68da342bbc1d67653901cf4502dabfa5bc947628
2024-09-27 16:40:26 +02:00
cd1ceffb0e libstore: make waiting for a while a promise
this simplifies waitForInput quite a lot, and at the same time makes
polling less thundering-herd-y. it even fixes early polling wakeups!

Change-Id: I6dfa62ce91729b8880342117d71af5ae33366414
2024-09-27 16:39:33 +02:00
0478949c72 libstore: turn builder output processing into event loop
this removes the rather janky did-you-mean-async poll loop we had so
far. sadly kj does not play well with pty file descriptors, so we do
have to add our own async input stream that does not eat pty EIO and
turns it into an exception. that's still a *lot* better than the old
code, and using a real even loop makes everything else easier later.

Change-Id: Idd7e0428c59758602cc530bcad224cd2fed4c15e
2024-09-27 16:38:16 +02:00
14dc84ed03 Merge changes Iaa2e0e9d,Ia973420f into main
* changes:
  Fix passing custom CA files into the builtin:fetchurl sandbox
  [security] builtin:fetchurl: Enable TLS verification
2024-09-26 20:53:46 +00:00
37b22dae04 Fix passing custom CA files into the builtin:fetchurl sandbox
Without this, verifying TLS certificates would fail on macOS, as well
as any system that doesn't have a certificate file at /etc/ssl/certs/ca-certificates.crt,
which includes e.g. Fedora.

Change-Id: Iaa2e0e9db3747645b5482c82e3e0e4e8f229f5f9
2024-09-26 15:25:28 +00:00
Eelco Dolstra
c1631b0a39 [security] builtin:fetchurl: Enable TLS verification
This is better for privacy and to avoid leaking netrc credentials in a
MITM attack, but also the assumption that we check the hash no longer
holds in some cases (in particular for impure derivations).

Partially reverts 5db358d4d7.

(cherry picked from commit c04bc17a5a0fdcb725a11ef6541f94730112e7b6)
(cherry picked from commit f2f47fa725fc87bfb536de171a2ea81f2789c9fb)
(cherry picked from commit 7b39cd631e0d3c3d238015c6f450c59bbc9cbc5b)

Upstream-PR: https://github.com/NixOS/nix/pull/11585

Change-Id: Ia973420f6098113da05a594d48394ce1fe41fbb9
2024-09-25 18:40:58 -07:00
19e0ce2c03 main: log stack traces for std::terminate
These stack traces kind of suck for the reasons mentioned on the
CppTrace page here (no symbols for inline functions is a major one):
https://github.com/jeremy-rifkin/cpptrace

I would consider using CppTrace if it were packaged, but to be honest, I
think that the more reasonable option is actually to move entirely to
out-of-process crash handling and symbolization.

The reason for this is that if you want to generate anything of
substance on SIGSEGV or really any deadly signal, you are stuck in
async-signal-safe land, which is not a place to be trying to run a
symbolizer. LLVM does it anyway, probably carefully, and chromium *can*
do it on debug builds but in general uses crashpad:
https://source.chromium.org/chromium/chromium/src/+/main:base/debug/stack_trace_posix.cc;l=974;drc=82dff63dbf9db05e9274e11d9128af7b9f51ceaa;bpv=1;bpt=1

However, some stack traces are better than *no* stack traces when we get
mystery exceptions falling out the bottom of the program. I've also
promoted the path for "mystery exceptions falling out the bottom of the
program" to hard crash and generate a core dump because although there's
been some months since the last one of these, these are nonetheless
always *atrociously* diagnosed.

We can't improve the crash handling further until either we use Crashpad
(which involves more C++ deps, no thanks) or we put in the ostensibly
work in progress Rust minidump infrastructure, in which case we need to
finish full support for Rust in libutil first.

Sample report:

Lix crashed. This is a bug. We would appreciate if you report it at https://git.lix.systems/lix-project/lix/issues with the following information included:

Exception: std::runtime_error: lol
Stack trace:
 0# nix::printStackTrace() in /home/jade/lix/lix3/build/src/nix/../libutil/liblixutil.so
 1# 0x000073C9862331F2 in /home/jade/lix/lix3/build/src/nix/../libmain/liblixmain.so
 2# 0x000073C985F2E21A in /nix/store/p44qan69linp3ii0xrviypsw2j4qdcp2-gcc-13.2.0-lib/lib/libstdc++.so.6
 3# 0x000073C985F2E285 in /nix/store/p44qan69linp3ii0xrviypsw2j4qdcp2-gcc-13.2.0-lib/lib/libstdc++.so.6
 4# nix::handleExceptions(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::function<void ()>) in /home/jade/lix/lix3/build/src/nix/../libmain/liblixmain.so
 5# 0x00005CF65B6B048B in /home/jade/lix/lix3/build/src/nix/nix
 6# 0x000073C985C8810E in /nix/store/dbcw19dshdwnxdv5q2g6wldj6syyvq7l-glibc-2.39-52/lib/libc.so.6
 7# __libc_start_main in /nix/store/dbcw19dshdwnxdv5q2g6wldj6syyvq7l-glibc-2.39-52/lib/libc.so.6
 8# 0x00005CF65B610335 in /home/jade/lix/lix3/build/src/nix/nix

Change-Id: I1a9f6d349b617fd7145a37159b78ecb9382cb4e9
2024-09-25 14:03:45 -07:00
5f298f74c9 Merge "local-store: make extended attribute handling more robust" into main 2024-09-21 07:55:13 +00:00
789b19a0cf util: fix brotli decompression of empty input
This caused an infinite loop before since it would just keep asking the
underlying source for more data.

In practice this happened because an HTTP server served a
response to a HEAD request (for which curl will not retrieve any body or
call our write callback function) with Content-Encoding: br, leading to
decompressing nothing at all and going into an infinite loop.

This adds a test to make sure none of our compression methods do that
again, as well as just patching the HTTP client to never feed empty data
into a compression algorithm (since they absolutely have the right to
throw CompressionError on unexpectedly-short streams!).

Reported on Matrix: https://matrix.to/#/!lymvtcwDJ7ZA9Npq:lix.systems/$8BWQR_zKxCQDJ40C5NnDo4bQPId3pZ_aoDj2ANP7Itc?via=lix.systems&via=matrix.org&via=tchncs.de

Change-Id: I027566e280f0f569fdb8df40e5ecbf46c211dad1
2024-09-18 15:37:29 -07:00
5246cea6c8 Merge "store: add a hint on how to fix Lix installs broken by macOS Sequoia" into main 2024-09-14 19:28:24 +00:00
8f88590d13 Merge changes Ia1481da4,Ifca1d74d into main
* changes:
  archive: refactor bad mutable-state API in the NAR parse listener
  archive: rename ParseSink to NARParseVisitor
2024-09-14 19:26:08 +00:00
3f07c65510
local-store: make extended attribute handling more robust
* Move the extended attribute deletion after the hardlink sanity check. We
  shouldn't be removing extended attributes on random files.
* Make the entity owner-writable before attempting to remove extended
  attributes, since this operation usually requires write access on the file,
  and we shouldn't fail xattr deletion on a file that has been made unwritable
  by the builder or a previous canonicalisation pass.

Fixes: lix-project/lix#507
Change-Id: I7e6ccb71649185764cd5210f4a4794ee174afea6
2024-09-14 10:36:22 +02:00
b7fc37b015 store: add a hint on how to fix Lix installs broken by macOS Sequoia
This is not a detailed diagnosis, and it's not worth writing one, tbh.
This error basically never happens in normal operation, so diagnosing it
by changing the error on macOS is good enough.

Relevant: lix-project/lix-installer#24
Relevant: lix-project/lix-installer#18
Relevant: lix-project/lix#521

Change-Id: I03701f917d116575c72a97502b8e1617679447f2
2024-09-14 07:31:30 +00:00
ca1dc3f70b archive: refactor bad mutable-state API in the NAR parse listener
Remove the mutable state stuff that assumes that one file is being
written a time. It's true that we don't write multiple files
interleaved, but that mutable state is evil.

Change-Id: Ia1481da48255d901e4b09a9b783e7af44fae8cff
2024-09-13 17:11:43 -07:00
81c2e0ac8e archive: rename ParseSink to NARParseVisitor
- Rename the listener to not be called a "sink". If it were a "sink" it
  would be eating bytes and conform with any of the Nix sink stuff
  (maybe FileHandle should be a Sink itself! but that's a later CL's
  problem). This is a parser listener.
- Move the RetrieveRegularNARSink thing into store-api.cc, which is its
  only usage, and fix it to actually do what it is stated to do: crash
  if its invariants are violated.

  It's, of course, used to erm, unpack single-file NAR files, generated
  via a horrible contraption of sources and sinks that looks like a
  plumbing blueprint. Refactoring that is a future task.
- Add a description of the invariants of NARParseVisitor in preparation
  of refactoring it.

Change-Id: Ifca1d74d2947204a1f66349772e54dad0743e944
2024-09-11 01:10:49 -07:00
8f7ab26f96 Merge changes If8ec210f,I6e2851b2 into main
* changes:
  libfetchers: serialise accept-flake-config properly
  libstore: declare SandboxMode JSON serialisation in the header
2024-09-09 16:14:23 +00:00
f2a49032a6 libstore: turn Worker in a kj event loop user
using a proper event loop basis we no longer have to worry about most of
the intricacies of poll(), or platform-dependent replacements for it. we
may even be able to use the event loop and its promise system for all of
our scheduling in the future. we don't do any real async processing yet,
this is just preparation to separate the first such change from the huge
api design difference with the async framework we chose (kj from capnp):

kj::Promise, unlike std::future, doesn't return exceptions unmangled. it
instead wraps any non-kj exception into a kj exception, erasing all type
information and preserving mostly the what() string in the process. this
makes sense in the capnp rpc use case where unrestricted exception types
can't be transferred, and since it moves error handling styles closer to
a world we'd actually like there's no harm in doing it only here for now

Change-Id: I20f888de74d525fb2db36ca30ebba4bcfe9cc838
2024-09-08 01:57:48 +00:00
d7c37324bb
libstore: declare SandboxMode JSON serialisation in the header
The JSON serialisation should be declared in the header so that all translation
units can see it when needed, even though it seems that it has not been used
anywhere else so far. Unfortunately, this means we cannot use the
NLOHMANN_JSON_SERIALIZE_ENUM convenience macro, since it uses a slightly
different signature, but the code is not too bad either.

Change-Id: I6e2851b250e0b53114d2fecb8011ff1ea9379d0f
2024-09-02 18:50:14 +02:00
b7b1b9723f
Clarify that diff-hook no longer needs to be an absolute path
See: https://gerrit.lix.systems/c/lix/+/1864
Change-Id: Ic70bfe42b261a83f2cb68b8f102833b739b8e03a
2024-09-01 15:20:09 -07:00
02eb07cfd5 Merge changes I5566a985,I88cf53d3 into main
* changes:
  Support relative and `~/` paths in config settings
  Thread `ApplyConfigOptions` through config parsing
2024-09-01 22:06:36 +00:00