Commit graph

4156 commits

Author SHA1 Message Date
a8b590014b
Fix email notifications for jobsets w/git-inputs
I started to wonder quite recently why Hydra doesn't send email
notifications anymore to me. I saw the following issue in the log of
`hydra-notify.service`:

    May 22 11:57:29 hydra 9bik0bxyxbrklhx6lqwifd6af8kj84va-hydra-notify[1887289]: fatal: unsafe repository ('/var/lib/hydra/scm/git/3e70c16c266ef70dc4198705a688acccf71e932878f178277c9ac47d133cc663' is owned by someone else)
    May 22 11:57:29 hydra 9bik0bxyxbrklhx6lqwifd6af8kj84va-hydra-notify[1887289]: To add an exception for this directory, call:
    May 22 11:57:29 hydra 9bik0bxyxbrklhx6lqwifd6af8kj84va-hydra-notify[1887289]:         git config --global --add safe.directory /var/lib/hydra/scm/git/3e70c16c266ef70dc4198705a688acccf71e932878f178277c9ac47d133cc663
    May 22 11:57:29 hydra 9bik0bxyxbrklhx6lqwifd6af8kj84va-hydra-notify[1886654]: error running build_finished hooks: command `git log --pretty=format:%H%x09%an%x09%ae%x09%at b0c30a7557685d25a8ab3f34fdb775e66db0bc4c..eaf28389fcebc2beca13a802f79b2cca6e9ca309 --git-dir=.git' failed with e>

This is also a problem because of Git's fix for CVE-2022-24765[1], so I
applied the same fix as for Nix[2], by using `--git-dir` which skips the
code-path for the ownership-check[3].

[1] https://lore.kernel.org/git/xmqqv8veb5i6.fsf@gitster.g/
[2] https://github.com/NixOS/nix/pull/6440
[3] To quote `git(1)`:
    > Specifying the location of the ".git" directory using this option
    > (or GIT_DIR environment variable) turns off the repository
    > discovery that tries to find a directory with ".git" subdirectory
2022-05-22 14:14:14 +02:00
Janne Heß
27023b514a
Merge pull request #1210 from ulrikstrid/ulrikstrid-fix-github-pr
GithubPulls: Don't fail on missing `Link`
2022-05-18 17:48:50 +02:00
Ulrik Strid
3c71be5b5b GithubPulls: Don't fail on missing Link 2022-05-18 08:14:00 +02:00
Graham Christensen
7c133a98f8
Merge pull request #1203 from DeterminateSystems/ds-33/wait-child-on-eof
hydra-eval-jobs: Report forked worker process status information on exception
2022-05-02 15:35:12 -04:00
Kayla Firestack
065039beba feat(t/evaluator/evaluate-oom): comment intentions 2022-05-02 15:26:26 -04:00
Kayla Firestack
87f610e7c1 fix(t/evaluator/evaluate-oom): use test_context to get path to ./t/jobs instead of relative paths 2022-05-02 15:14:46 -04:00
Kayla Firestack
013a1dcabc fix(t/evaluator/evaluate-oom): check that the exit value of the systemd-run check is zero. Rework skip messages 2022-05-02 15:13:59 -04:00
Kayla Firestack
e917d9e546 fix(t/evaluator/evaluate-oom): convert systemd-run presence check to eval, fix indentaion, show relationships between flags and commands with indentation 2022-05-02 14:40:13 -04:00
Kayla Firestack
01ec004108 feat(t/evaluator/evaluate-oom-job): skip test if systemd-run is not present 2022-05-02 14:08:50 -04:00
Kayla Firestack
2c909c038f feat(t/evaluator/hydra-eval-jobs): add basic evaluation test for hydra-eval-jobs 2022-05-02 13:50:57 -04:00
Kayla Firestack
90769ab5ad feat(t/jobs): add test job to cause an OOM 2022-05-02 13:49:32 -04:00
Kayla Firestack
2cdd7974de fix(hydra-eval-jobs): fix typo 2022-04-29 13:06:16 -04:00
Kayla Firestack
62cdbc4138 feat(hydra-eval-jobs.cc): add check_pid_status_nonblocking to catch handler 2022-04-21 10:55:51 -04:00
Kayla Firestack
cb4fa0000f fix(hydra-eval-jobs.cc): add function to report pid status 2022-04-21 10:55:51 -04:00
Graham Christensen
5c90edd19f
Merge pull request #1103 from DeterminateSystems/runcommand/dynamic
Dynamic RunCommand
2022-04-19 10:09:47 -04:00
dependabot[bot]
a179f0be61
build(deps): bump cachix/install-nix-action from 16 to 17
Bumps [cachix/install-nix-action](https://github.com/cachix/install-nix-action) from 16 to 17.
- [Release notes](https://github.com/cachix/install-nix-action/releases)
- [Commits](https://github.com/cachix/install-nix-action/compare/v16...v17)

---
updated-dependencies:
- dependency-name: cachix/install-nix-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-04-11 15:26:03 +00:00
Graham Christensen
c44d9d9e91
Merge pull request #1194 from fricklerhandwerk/contributing
add contributor's guide with architecture notes
2022-04-07 20:35:06 -04:00
Graham Christensen
05ce23aaa2
Merge pull request #1197 from DeterminateSystems/civetweb-ipv6-smile
flake: add ipv6 support to civetweb
2022-04-07 14:37:32 -04:00
Cole Helbling
be6077d2bb doc/manual: demonstrate ipv6 metrics address for queue-runner 2022-04-07 11:29:18 -07:00
Cole Helbling
3f303b479c flake: add ipv6 support to civetweb 2022-04-07 11:29:18 -07:00
Graham Christensen
6d9fac2577
Merge pull request #1196 from DeterminateSystems/fixup-metrics-addr-option-name
doc/manual: fixup configuration option name
2022-04-07 14:29:14 -04:00
Cole Helbling
ae690d6602 doc/manual: fixup configuration option name
Oops.
2022-04-07 10:40:50 -07:00
Graham Christensen
e1965250b5
Merge pull request #1173 from DeterminateSystems/queue-runner-exporter
hydra-queue-runner metrics
2022-04-07 12:27:33 -04:00
Cole Helbling
f8dc48f171
hydra-queue-runner: fixup: remove extraneous newline 2022-04-06 17:53:11 -07:00
Graham Christensen
59ac96a99c Track the number of steps created 2022-04-06 20:23:02 -04:00
Graham Christensen
1c12c5882f hydra queue runner: instrument the process of loading new builds with prom 2022-04-06 20:18:29 -04:00
Graham Christensen
5de08d412e queue metrics: refactor the metrics into a struct 2022-04-06 20:00:30 -04:00
Graham Christensen
46f52b4c4e bring back the working version Cole made 2022-04-06 15:49:38 -04:00
Cole Helbling
5bff730f2c WIP: I love it when they delete the assignment operator :) 2022-04-06 11:41:40 -07:00
Cole Helbling
15e8fa8aff doc/manual: document queue-runner prometheus exporter configuration 2022-04-06 11:41:40 -07:00
Cole Helbling
edf3c348f2 hydra-queue-runner: make entire address configurable 2022-04-06 10:59:45 -07:00
Cole Helbling
33bc60b83c hydra-queue-runner: move exporter back to State::run
It's (arguably) better than risking pinning the thread at 100% due to
the busy `while` loop.
2022-04-06 10:49:14 -07:00
fricklerhandwerk
0803634a41 add architecture notes
meeting notes from @edolstra giving a one-hour tour of the code
2022-04-06 09:01:10 +02:00
Eelco Dolstra
8cbf065468
Merge pull request #1195 from NixOS/update-nix
Update to Nix 2.8.0pre20220404_e496241
2022-04-05 20:12:56 +02:00
Eelco Dolstra
71a036ed00 Update to Nix master
Flake lock file updates:

• Updated input 'nix':
    'github:NixOS/nix/ec90fc4d1f42db3c5e3c74dc186487d10a28c221' (2022-04-05)
  → 'github:NixOS/nix/5fe4fe823c193cbb7bfa05a468de91eeab09058d' (2022-04-05)
• Updated input 'nix/nixpkgs':
    'github:NixOS/nixpkgs/82891b5e2c2359d7e58d08849e4c89511ab94234' (2021-09-28)
  → 'github:NixOS/nixpkgs/530a53dcbc9437363471167a5e4762c5fcfa34a1' (2022-02-19)
2022-04-05 17:31:30 +02:00
Cole Helbling
8c5636fe18
hydra-queue-runner: use port 9198 by default
Co-authored-by: Graham Christensen <graham@grahamc.com>
2022-04-02 17:32:14 -07:00
Graham Christensen
3c181463f0
Merge pull request #1191 from ncfavier/nixos-search
Prepare for nixos-search integration
2022-03-31 16:27:43 -04:00
Naïm Favier
5e3374cb86
Prepare for nixos-search integration 2022-03-31 12:55:15 +02:00
Graham Christensen
84f54cb011
Merge pull request #1189 from NixOS/fix-hang
openConnection(): Don't throw exceptions in forked child
2022-03-30 17:16:02 -04:00
Eelco Dolstra
bcaad1c934 openConnection(): Don't throw exceptions in forked child
On hydra.nixos.org the queue runner had child processes that were
stuck handling an exception:

  Thread 1 (Thread 0x7f501f7fe640 (LWP 1413473) "bld~v54h5zkhmb3"):
  #0  futex_wait (private=0, expected=2, futex_word=0x7f50c27969b0 <_rtld_local+2480>) at ../sysdeps/nptl/futex-internal.h:146
  #1  __lll_lock_wait (futex=0x7f50c27969b0 <_rtld_local+2480>, private=0) at lowlevellock.c:52
  #2  0x00007f50c21eaee4 in __GI___pthread_mutex_lock (mutex=0x7f50c27969b0 <_rtld_local+2480>) at ../nptl/pthread_mutex_lock.c:115
  #3  0x00007f50c1854bef in __GI___dl_iterate_phdr (callback=0x7f50c190c020 <_Unwind_IteratePhdrCallback>, data=0x7f501f7fb040) at dl-iteratephdr.c:40
  #4  0x00007f50c190d2d1 in _Unwind_Find_FDE () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1
  #5  0x00007f50c19099b3 in uw_frame_state_for () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1
  #6  0x00007f50c190ab90 in uw_init_context_1 () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1
  #7  0x00007f50c190b08e in _Unwind_RaiseException () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1
  #8  0x00007f50c1b02ab7 in __cxa_throw () from /nix/store/dd8swlwhpdhn6bv219562vyxhi8278hs-gcc-10.3.0-lib/lib/libstdc++.so.6
  #9  0x00007f50c1d01abe in nix::parseURL (url="root@cb893012.packethost.net") at src/libutil/url.cc:53
  #10 0x0000000000484f55 in extraStoreArgs (machine="root@cb893012.packethost.net") at build-remote.cc:35
  #11 operator() (__closure=0x7f4fe9fe0420) at build-remote.cc:79
  ...

Maybe the fork happened while another thread was holding some global
stack unwinding lock
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71744). Anyway, since
the hanging child inherits all file descriptors to SSH clients,
shutting down remote builds (via 'child.to = -1' in
State::buildRemote()) doesn't work and 'child.pid.wait()' hangs
forever.

So let's not do any significant work between fork and exec.
2022-03-30 22:39:48 +02:00
Graham Christensen
bbb0998699
Merge pull request #1188 from DeterminateSystems/nix-2.7
Nix 2.7
2022-03-29 15:49:37 -04:00
ajs124
089da272c7 fix build against nix 2.7.0
fix build after such commits as df552ff53e68dff8ca360adbdbea214ece1d08ee
and e862833ec662c1bffbe31b9a229147de391e801a
2022-03-29 15:38:24 -04:00
ajs124
c64c5f0a7e hydra-queue-runner: rename build-result.hh to hydra-build-result.hh 2022-03-29 15:34:29 -04:00
Graham Christensen
4368ff5d5b flake.lock: Add
Flake lock file updates:

• Added input 'nix':
    'github:NixOS/nix/ffe155abd36366a870482625543f9bf924a58281' (2022-03-07)
• Added input 'nix/lowdown-src':
    'github:kristapsdz/lowdown/d2c2b44ff6c27b936ec27358a2653caaef8f73b8' (2021-10-06)
• Added input 'nix/nixpkgs':
    'github:NixOS/nixpkgs/82891b5e2c2359d7e58d08849e4c89511ab94234' (2021-09-28)
• Added input 'nix/nixpkgs-regression':
    'github:NixOS/nixpkgs/215d4d0fd80ca5163643b03a33fde804a29cc1e2' (2022-01-24)
• Added input 'nixpkgs':
    follows 'nix/nixpkgs'
2022-03-29 15:33:08 -04:00
Graham Christensen
98da457e16 nix: 2.7.0 2022-03-29 15:31:11 -04:00
Graham Christensen
20a8437094 flake.nix: set nix to 2.6.0 2022-03-29 15:29:33 -04:00
Graham Christensen
fd3690a0c1 flake.lock: Update
Flake lock file updates:

• Updated input 'nix':
    'github:NixOS/nix/a6ba313a0aac3b6e2fef434cb42d190a0849238e' (2021-08-10)
  → 'github:NixOS/nix/a1cd7e58606a41fcf62bf8637804cf8306f17f62' (2022-01-24)
• Updated input 'nix/lowdown-src':
    'github:kristapsdz/lowdown/148f9b2f586c41b7e36e73009db43ea68c7a1a4d' (2021-04-03)
  → 'github:kristapsdz/lowdown/d2c2b44ff6c27b936ec27358a2653caaef8f73b8' (2021-10-06)
• Updated input 'nix/nixpkgs':
    'github:NixOS/nixpkgs/f77036342e2b690c61c97202bf48f2ce13acc022' (2021-06-28)
  → 'github:NixOS/nixpkgs/82891b5e2c2359d7e58d08849e4c89511ab94234' (2021-09-28)
• Added input 'nix/nixpkgs-regression':
    'github:NixOS/nixpkgs/215d4d0fd80ca5163643b03a33fde804a29cc1e2' (2022-01-24)
2022-03-29 15:29:23 -04:00
Graham Christensen
3b048ed136 Revert "Revert "Use copyClosure instead of computeFSClosure + copyPaths""
This reverts commit 8e3ada2afc.
2022-03-29 15:28:47 -04:00
Cole Helbling
9c1f36c47c t/lib/HydraTestContext: set queue runner port to 0
This makes the exposer choose a random, available port.
2022-03-29 11:41:23 -07:00
Cole Helbling
4789eba92c hydra-queue-runer: split metrics functionality into its own function 2022-03-29 10:55:28 -07:00