Eelco Dolstra
bcaad1c934
openConnection(): Don't throw exceptions in forked child
On hydra.nixos.org the queue runner had child processes that were
stuck handling an exception:
Thread 1 (Thread 0x7f501f7fe640 (LWP 1413473) "bld~v54h5zkhmb3"):
#0 futex_wait (private=0, expected=2, futex_word=0x7f50c27969b0 <_rtld_local+2480>) at ../sysdeps/nptl/futex-internal.h:146
#1 __lll_lock_wait (futex=0x7f50c27969b0 <_rtld_local+2480>, private=0) at lowlevellock.c:52
#2 0x00007f50c21eaee4 in __GI___pthread_mutex_lock (mutex=0x7f50c27969b0 <_rtld_local+2480>) at ../nptl/pthread_mutex_lock.c:115
#3 0x00007f50c1854bef in __GI___dl_iterate_phdr (callback=0x7f50c190c020 <_Unwind_IteratePhdrCallback>, data=0x7f501f7fb040) at dl-iteratephdr.c:40
#4 0x00007f50c190d2d1 in _Unwind_Find_FDE () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1
#5 0x00007f50c19099b3 in uw_frame_state_for () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1
#6 0x00007f50c190ab90 in uw_init_context_1 () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1
#7 0x00007f50c190b08e in _Unwind_RaiseException () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1
#8 0x00007f50c1b02ab7 in __cxa_throw () from /nix/store/dd8swlwhpdhn6bv219562vyxhi8278hs-gcc-10.3.0-lib/lib/libstdc++.so.6
#9 0x00007f50c1d01abe in nix::parseURL (url="root@cb893012.packethost.net") at src/libutil/url.cc:53
#10 0x0000000000484f55 in extraStoreArgs (machine="root@cb893012.packethost.net") at build-remote.cc:35
#11 operator() (__closure=0x7f4fe9fe0420) at build-remote.cc:79
...
Maybe the fork happened while another thread was holding some global
stack unwinding lock
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71744). Anyway, since
the hanging child inherits all file descriptors to SSH clients,
shutting down remote builds (via 'child.to = -1' in
State::buildRemote()) doesn't work and 'child.pid.wait()' hangs
forever.
So let's not do any significant work between fork and exec.
2022-03-30 22:39:48 +02:00
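The rule stated in bcaad1c934 ("not do any significant work between fork and exec") can be illustrated with a minimal sketch. This is not Hydra's actual openConnection() code: `runChild` is a hypothetical helper, and the example assumes a POSIX system. Everything that may allocate or throw (here, building the argv array) is done in the parent before fork(), so the child only calls execv() and _exit().

```cpp
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not part of Hydra): run the program named by args[0]
// (a full path) with the given arguments. Returns the child's exit status,
// or -1 if the child could not be started or did not exit normally.
int runChild(const std::vector<std::string> & args)
{
    if (args.empty()) return -1;

    // All work that can allocate or throw happens before fork().
    std::vector<char *> argv;
    for (auto & a : args) argv.push_back(const_cast<char *>(a.c_str()));
    argv.push_back(nullptr);

    pid_t pid = fork();
    if (pid == -1) return -1;

    if (pid == 0) {
        // Child: no exceptions, no allocation, no parsing. Just exec.
        execv(argv[0], argv.data());
        // exec failed: leave immediately without unwinding the stack
        // or running destructors.
        _exit(127);
    }

    // Parent: wait for the child and report how it exited.
    int status = 0;
    if (waitpid(pid, &status, 0) == -1) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this shape, a failure such as an unparsable machine URL would surface in the parent before the fork, where throwing an exception is safe, instead of in a child that may have inherited a locked mutex from another thread.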
Graham Christensen
bbb0998699
Merge pull request #1188 from DeterminateSystems/nix-2.7
Nix 2.7
2022-03-29 15:49:37 -04:00
ajs124
089da272c7
fix build against nix 2.7.0
fix build after such commits as df552ff53e68dff8ca360adbdbea214ece1d08ee
and e862833ec662c1bffbe31b9a229147de391e801a
2022-03-29 15:38:24 -04:00
ajs124
c64c5f0a7e
hydra-queue-runner: rename build-result.hh to hydra-build-result.hh
2022-03-29 15:34:29 -04:00
Graham Christensen
4368ff5d5b
flake.lock: Add
Flake lock file updates:
• Added input 'nix':
'github:NixOS/nix/ffe155abd36366a870482625543f9bf924a58281' (2022-03-07)
• Added input 'nix/lowdown-src':
'github:kristapsdz/lowdown/d2c2b44ff6c27b936ec27358a2653caaef8f73b8' (2021-10-06)
• Added input 'nix/nixpkgs':
'github:NixOS/nixpkgs/82891b5e2c2359d7e58d08849e4c89511ab94234' (2021-09-28)
• Added input 'nix/nixpkgs-regression':
'github:NixOS/nixpkgs/215d4d0fd80ca5163643b03a33fde804a29cc1e2' (2022-01-24)
• Added input 'nixpkgs':
follows 'nix/nixpkgs'
2022-03-29 15:33:08 -04:00
Graham Christensen
98da457e16
nix: 2.7.0
2022-03-29 15:31:11 -04:00
Graham Christensen
20a8437094
flake.nix: set nix to 2.6.0
2022-03-29 15:29:33 -04:00
Graham Christensen
fd3690a0c1
flake.lock: Update
Flake lock file updates:
• Updated input 'nix':
'github:NixOS/nix/a6ba313a0aac3b6e2fef434cb42d190a0849238e' (2021-08-10)
→ 'github:NixOS/nix/a1cd7e58606a41fcf62bf8637804cf8306f17f62' (2022-01-24)
• Updated input 'nix/lowdown-src':
'github:kristapsdz/lowdown/148f9b2f586c41b7e36e73009db43ea68c7a1a4d' (2021-04-03)
→ 'github:kristapsdz/lowdown/d2c2b44ff6c27b936ec27358a2653caaef8f73b8' (2021-10-06)
• Updated input 'nix/nixpkgs':
'github:NixOS/nixpkgs/f77036342e2b690c61c97202bf48f2ce13acc022' (2021-06-28)
→ 'github:NixOS/nixpkgs/82891b5e2c2359d7e58d08849e4c89511ab94234' (2021-09-28)
• Added input 'nix/nixpkgs-regression':
'github:NixOS/nixpkgs/215d4d0fd80ca5163643b03a33fde804a29cc1e2' (2022-01-24)
2022-03-29 15:29:23 -04:00
Graham Christensen
3b048ed136
Revert "Revert "Use `copyClosure` instead of `computeFSClosure` + `copyPaths`""
This reverts commit 8e3ada2afc.
2022-03-29 15:28:47 -04:00
Cole Helbling
9c1f36c47c
t/lib/HydraTestContext: set queue runner port to 0
This makes the exposer choose a random, available port.
2022-03-29 11:41:23 -07:00
Cole Helbling
4789eba92c
hydra-queue-runner: split metrics functionality into its own function
2022-03-29 10:55:28 -07:00
Cole Helbling
928b3b8268
hydra-queue-runner: fix priority of flag over config file
2022-03-29 10:42:07 -07:00
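As a rough illustration of the precedence that 928b3b8268 describes (a command-line flag wins over the config file), here is a minimal sketch. `resolveMetricsPort` and its parameters are hypothetical names, not Hydra's actual code. It uses `std::optional` for "not set" because 0 is itself a meaningful port value (it requests a random port).

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper (not part of Hydra): pick the metrics port.
// Precedence: command-line flag, then config file, then the fallback.
uint16_t resolveMetricsPort(std::optional<uint16_t> flag,
                            std::optional<uint16_t> config,
                            uint16_t fallback)
{
    if (flag) return *flag;     // an explicit flag always wins, even if it is 0
    if (config) return *config; // otherwise use the config file value
    return fallback;            // neither was given
}
```

For example, `resolveMetricsPort(0, 9000, 9198)` yields 0, so a test that passes port 0 on the command line is not overridden by a configured port. The numbers here are arbitrary.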
Cole Helbling
5ddb9a98ca
fixup! hydra-queue-runner: log message before and after exporter is started
2022-03-29 08:47:41 -07:00
Cole Helbling
905a7a7beb
hydra-queue-runner: read metrics port from `queue_runner_metrics_port` config
2022-03-29 08:46:43 -07:00
Cole Helbling
9cdc5aceed
hydra-queue-runner: log message before and after exporter is started
This way, if something goes wrong between the two, it's easier to narrow
down where the issue lies.
2022-03-29 08:41:19 -07:00
Théophane Hufschmitt
6e571e26ff
Build the resolved derivation and not the original one
2022-03-29 17:05:30 +02:00
Théophane Hufschmitt
92b627ac1b
Remove an accidental re-indenting of a comment
Co-authored-by: Eelco Dolstra <edolstra@gmail.com>
2022-03-29 17:04:19 +02:00
Théophane Hufschmitt
b430d41afd
Use the `BuildOptions` more eagerly
2022-03-29 17:04:19 +02:00
Théophane Hufschmitt
fd0ae78eba
Factor out the copying from the build store
2022-03-29 17:04:19 +02:00
Théophane Hufschmitt
a778a89f04
Factor out the `queryPathInfos` part of the build
2022-03-29 17:04:19 +02:00
Théophane Hufschmitt
365776f5d7
Factor out the building part
2022-03-29 17:04:19 +02:00
Théophane Hufschmitt
9f1b911625
Factor more stuff out
2022-03-29 17:04:17 +02:00
Théophane Hufschmitt
2f494b7834
Factor out the creation of the log file
2022-03-29 16:52:59 +02:00
Théophane Hufschmitt
5db8642224
Factor out a struct representing a connection to a machine
2022-03-29 16:52:59 +02:00
Graham Christensen
78ef4ae9a5
Merge pull request #1187 from DeterminateSystems/revert-to-nix-2.4pre20210810_a6ba313
Revert "Build against Nix 2.5.1" - build against nix-2.4pre20210810_a…
2022-03-29 09:35:15 -04:00
Graham Christensen
dc709422a6
Revert "Build against Nix 2.5.1" - build against nix-2.4pre20210810_a6ba313
This reverts commit 921e27d6c0.
2022-03-29 09:24:51 -04:00
Graham Christensen
47c7170c52
Merge pull request #1185 from DeterminateSystems/revert-from-nix-2.6
Revert to Nix 2.5.1
2022-03-28 14:45:15 -04:00
Cole Helbling
921e27d6c0
Build against Nix 2.5.1
2022-03-28 11:36:14 -07:00
Cole Helbling
2ba83a5cba
t/jobs/empty-dir-builder: provide output for nix log
2022-03-28 09:54:02 -07:00
Cole Helbling
127a644595
Revert "Update Nix to 2.6"
This reverts commit 5ae26aa760.
2022-03-28 09:54:02 -07:00
Cole Helbling
8e3ada2afc
Revert "Use `copyClosure` instead of `computeFSClosure` + `copyPaths`"
This reverts commit f14c583ce5.
2022-03-28 09:54:02 -07:00
Eelco Dolstra
962bf36939
Merge pull request #1162 from obsidiansystems/less-ref
Make `copyClosureTo` take a regular C++ ref to the store
2022-03-23 16:25:59 +01:00
Eelco Dolstra
3390415905
Merge pull request #1125 from obsidiansystems/simplify--copyClosure
Use `copyClosure` instead of `computeFSClosure` + `copyPaths`
2022-03-23 12:49:22 +01:00
Cole Helbling
8503a7917b
fixup! hydra-queue-runner: make registry member of State, configurable metrics port
2022-03-22 13:38:13 -07:00
Graham Christensen
01fb23ddf6
Merge pull request #1178 from DeterminateSystems/hydra-update-gc-roots/network-traffic
hydra-update-gc-roots: allow cached refs to the build's jobset
2022-03-21 09:05:54 -04:00
Graham Christensen
e5393c2cf8
fixup: make id non-ambiguous
2022-03-19 23:56:47 -04:00
Graham Christensen
137be3452e
Reduce the jobset cols on the remaining two queries
2022-03-19 23:56:47 -04:00
Graham Christensen
f353a7ac41
update-gc-roots: try subselecting the jobset table
2022-03-19 23:56:47 -04:00
Graham Christensen
145667cb53
hydra-update-gc-roots: allow cached refs to the build's jobset
Re-executing this search_related on every access turned out to
create very problematic performance. If a jobset had a lot of
error output stored in the jobset, and there were many hundreds
or thousands of active jobs, this could easily cause >1Gbps of
network traffic.
2022-03-19 23:56:47 -04:00
Graham Christensen
22026da4f8
Merge pull request #1176 from DeterminateSystems/broken-constituent
Broken constituents: emit useful log messages on evaluation errors on constituents
2022-03-19 14:55:17 -04:00
Graham Christensen
a582e4c485
HydraTestContext: add \n's to various dies
2022-03-19 14:46:53 -04:00
Graham Christensen
074a2f96bf
hydra-eval-jobset: emit a useful error if constituents errored
2022-03-19 14:37:12 -04:00
Graham Christensen
0c51de6334
hydra-evaluate-jobset: assert it logs errored constituents properly
2022-03-19 14:35:30 -04:00
Graham Christensen
25f6bae847
HydraTestContext: make it easy to create a jobset without evaluating
2022-03-19 14:34:43 -04:00
Cole Helbling
b0c17112c9
flake: update to nixos-unstable-small
https://github.com/NixOS/nixpkgs/pull/163695 was merged, so no longer
need to use my commit!
2022-03-18 11:10:57 -07:00
Cole Helbling
c0f826b92d
hydra-queue-runner: get the listening port from the exposer itself
Otherwise, when the port is randomly chosen (e.g. by specifying no port,
or a port of 0), it will just show that the port is 0 and not the port
that is actually serving the metrics.
2022-03-14 08:41:45 -07:00
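The behaviour behind c0f826b92d can be shown at the socket level: binding to port 0 lets the kernel choose a free port, and the port actually in use has to be read back from the socket. The sketch below uses plain POSIX sockets rather than the prometheus-cpp exposer, and `boundPort` is a hypothetical helper, not Hydra code.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstdint>

// Hypothetical helper (not part of Hydra): bind a TCP socket to
// 127.0.0.1:<requested> and return the port the kernel actually assigned,
// or 0 on failure. The socket is closed again before returning.
uint16_t boundPort(uint16_t requested)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1) return 0;

    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(requested); // 0 means "pick any free port"

    uint16_t actual = 0;
    socklen_t len = sizeof(addr);
    // After bind(), getsockname() reports the real port, not the requested 0.
    if (bind(fd, reinterpret_cast<sockaddr *>(&addr), sizeof(addr)) == 0
        && getsockname(fd, reinterpret_cast<sockaddr *>(&addr), &len) == 0)
        actual = ntohs(addr.sin_port);

    close(fd);
    return actual;
}
```

Logging the requested value would print 0 here, which is why the commit asks the exposer itself for the port it is listening on.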
Cole Helbling
52a29d43e6
hydra-queue-runner: make registry member of State, configurable metrics port
Thanks to the updated prometheus-cpp library, specifying a port of 0
will cause it to pick a random (available) port -- ideal for tests.
2022-03-11 11:58:10 -08:00
Cole Helbling
6e6475d860
flake: replace aliases with their proper names
Newer Nixpkgs have added a throw for these aliases.
2022-03-11 11:58:10 -08:00
Cole Helbling
a0cb73579d
flake: update newNixpkgs for newer prometheus-cpp
2022-03-11 11:58:10 -08:00
Cole Helbling
3bf31bd6a6
hydra-queue-runner: add simple "up" exporter
There are probably better ways to achieve this (and will likely need to
be refactored a bit to support further metrics).
2022-03-10 12:36:58 -08:00