lix/src
Jade Lovelace b3fb8d9822 daemon: fix a crash bug "FATAL: exception not rethrown"
This is caused by pthread_cancel effectively throwing a
not-specifically-identifiable C++ exception into the targeted thread,
which, if it is not rethrown, terminates the process entirely.

This is rather "impolite" behaviour, we would say. But thread
cancellation is *always* busted, and we should simply not use it where
unnecessary. It's particularly unnecessary when what we *actually* need
it for is, err, interrupting a poll(2).

That can in turn be achieved by simply having the poll watch one more
file descriptor, namely, the read end of a pipe, to which we write a
character when we need to stop the thread.
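
The approach described above can be sketched roughly as follows. This is
not the actual MonitorFdHup code from this change: the class name
HupWatcher, its members and the hupSeen() accessor are hypothetical and
exist only for illustration. It assumes a POSIX system with poll(2) and
pipe(2).

```cpp
#include <poll.h>
#include <unistd.h>

#include <atomic>
#include <cerrno>
#include <stdexcept>
#include <thread>

// Hypothetical helper (NOT the Lix MonitorFdHup class): watches `fd` for a
// hang-up in a background thread and can be stopped without pthread_cancel.
class HupWatcher
{
    int stopPipe[2];                  // [0]: read end polled by the thread, [1]: write end
    std::atomic<bool> sawHup{false};
    std::thread worker;

public:
    explicit HupWatcher(int fd)
    {
        if (pipe(stopPipe) == -1)
            throw std::runtime_error("pipe() failed");

        worker = std::thread([this, fd] {
            for (;;) {
                struct pollfd fds[2] = {};
                fds[0].fd = fd;
                fds[0].events = 0;        // POSIX reports POLLHUP even if not requested
                fds[1].fd = stopPipe[0];
                fds[1].events = POLLIN;   // readable once a stop is requested

                if (poll(fds, 2, -1) == -1) {
                    if (errno == EINTR) continue;
                    return;               // unexpected error: give up quietly
                }
                if (fds[1].revents) return;   // stop requested via the pipe
                if (fds[0].revents) {
                    // Record whether the event on the watched fd was a hang-up.
                    sawHup = (fds[0].revents & POLLHUP) != 0;
                    return;
                }
            }
        });
    }

    ~HupWatcher()
    {
        // Wake the thread by writing one byte; no thread cancellation involved.
        char c = 'x';
        ssize_t r = write(stopPipe[1], &c, 1);
        (void) r;
        worker.join();
        close(stopPipe[0]);
        close(stopPipe[1]);
    }

    bool hupSeen() const { return sawHup.load(); }
};
```

Because the thread leaves its loop by an ordinary return, nothing is
unwound through it from outside, so there is no exception that could go
"not rethrown".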

While looking at this code, we also investigated whether any of the
poll() madness is required, or was even *ever* required. Curiously we
found in the XNU kernel source code that the thing about needing to
listen to POLLHUP is probably *correct*, but switching it to POLLRDNORM
should not have made any difference at all. We've left a FIXME to look
into that further because what's written here is super janky.
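
For reference, POSIX specifies that POLLHUP is reported in revents whether
or not it was requested in events. A small sketch that checks this on a
pipe, assuming Linux-like behaviour, is below. The helper name pollOnce is
hypothetical, and whether XNU behaves the same way for the descriptors the
daemon watches is exactly what the FIXME leaves open.

```cpp
#include <poll.h>
#include <unistd.h>

#include <stdexcept>

// Hypothetical helper: poll a single fd once and return its revents
// (0 on timeout). Throws if poll(2) itself fails.
short pollOnce(int fd, short events, int timeoutMs)
{
    struct pollfd pfd = {};
    pfd.fd = fd;
    pfd.events = events;
    int n = poll(&pfd, 1, timeoutMs);
    if (n == -1)
        throw std::runtime_error("poll() failed");
    return n == 0 ? 0 : pfd.revents;
}
```

Polling the read end of a pipe with events set to 0 returns nothing while
a writer is still open, and is expected to return POLLHUP once the last
writer has been closed.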

94d3b45284/bsd/kern/sys_generic.c (L1751-L1758)

This is the crash on some Hydra machines:

Thread 1 (Thread 0x7f56b77776c0 (LWP 955542) (Exiting)):
0  0x00007f56b8e9b7dc in __pthread_kill_implementation () from /nix/store/m71p7f0nymb19yn1dascklyya2i96jfw-glibc-2.39-52/lib/libc.so.6
1  0x00007f56b8e49516 in raise () from /nix/store/m71p7f0nymb19yn1dascklyya2i96jfw-glibc-2.39-52/lib/libc.so.6
2  0x00007f56b8e31935 in abort () from /nix/store/m71p7f0nymb19yn1dascklyya2i96jfw-glibc-2.39-52/lib/libc.so.6
3  0x00007f56b8e327f3 in __libc_message_impl.cold () from /nix/store/m71p7f0nymb19yn1dascklyya2i96jfw-glibc-2.39-52/lib/libc.so.6
4  0x00007f56b8e8e8e9 in __libc_fatal () from /nix/store/m71p7f0nymb19yn1dascklyya2i96jfw-glibc-2.39-52/lib/libc.so.6
5  0x00007f56b8ea23c4 in unwind_cleanup () from /nix/store/m71p7f0nymb19yn1dascklyya2i96jfw-glibc-2.39-52/lib/libc.so.6
6  0x00007f56b9d2a1b8 in nix::triggerInterrupt() [clone .cold] () from /nix/store/sahgw550p621m9dy1pd7whl9c5g1g0p7-lix-2.90.0-rc1/lib/liblixutil.so
7  0x00007f56b990ac9d in std::thread::_State_impl<std::thread::_Invoker<std::tuple<nix::MonitorFdHup::MonitorFdHup(int)::{lambda()#1}> > >::_M_run() () from /nix/store/sahgw550p621m9dy1pd7whl9c5g1g0p7-lix-2.90.0-rc1/lib/liblixstore.so
8  0x00007f56b90e86d3 in execute_native_thread_routine () from /nix/store/c6r62m84hywf4i6qq1h28f13zv38yqyp-gcc-13.3.0-lib/lib/libstdc++.so.6
9  0x00007f56b8e99a42 in start_thread () from /nix/store/m71p7f0nymb19yn1dascklyya2i96jfw-glibc-2.39-52/lib/libc.so.6
10 0x00007f56b8f1905c in clone3 () from /nix/store/m71p7f0nymb19yn1dascklyya2i96jfw-glibc-2.39-52/lib/libc.so.6

As for testing, we've started a daemon with this change and verified it
deals with HUPs correctly on x86_64-linux, but we don't think we can
easily test the destructor behaviour without reproducing whatever Hydra
was doing that broke.

Change-Id: I29c7de0425674494b6e43c075810126c3ff77363
2024-07-13 00:59:33 +02:00
build-remote build-remote: truncate+hash store URI used in lockfile paths 2024-05-31 12:18:24 +00:00
libcmd language: cleanly ban integer overflows 2024-07-13 00:59:33 +02:00
libexpr language: cleanly ban integer overflows 2024-07-13 00:59:33 +02:00
libfetchers libutil: return sources from runProgram2 2024-07-06 12:36:36 +02:00
libmain libmain: clear display attributes in the multiline progress bar 2024-07-08 19:08:23 +02:00
libstore language: cleanly ban integer overflows 2024-07-13 00:59:33 +02:00
libutil daemon: fix a crash bug "FATAL: exception not rethrown" 2024-07-13 00:59:33 +02:00
nix libutil: turn HashModuloSink into a free function 2024-07-06 12:36:37 +02:00
nix-build tree-wide: unify progress bar inactive and paused states 2024-07-01 18:19:34 +02:00
nix-channel util.{hh,cc}: Split out users.{hh,cc} 2024-05-29 11:01:34 +02:00
nix-collect-garbage Fix dry-run flag for nix-collect-garbage 2024-07-09 13:55:05 +00:00
nix-copy-closure Merge pull request #9277 from keszybz/file-permissions 2024-03-04 05:26:17 +01:00
nix-env Use std::strong_ordering for version comparison 2024-07-12 16:48:28 +02:00
nix-instantiate libexpr: pass Exprs as references, not pointers 2024-06-17 19:46:44 +00:00
nix-store libstore: convert dumpPath to a generator 2024-07-05 22:28:16 +00:00
pch build-time: remove 20% more by PCH'ing C++ stdlib 2024-05-30 21:54:21 +00:00
resolve-system-dependencies remove the autoconf+Make buildsystem 2024-05-07 17:04:30 -06:00
lix-base.pc.in packaging: rename nixexpr -> lixexpr and so on 2024-05-23 16:45:23 -06:00
meson.build meson: implement functional tests 2024-03-27 18:37:50 -06:00