Commit graph

2904 commits

Author SHA1 Message Date
Cole Helbling 33bc60b83c hydra-queue-runner: move exporter back to State::run
It's (arguably) better than risking pinning the thread at 100% due to
the busy `while` loop.
2022-04-06 10:49:14 -07:00
Eelco Dolstra 71a036ed00 Update to Nix master
Flake lock file updates:

• Updated input 'nix':
    'github:NixOS/nix/ec90fc4d1f42db3c5e3c74dc186487d10a28c221' (2022-04-05)
  → 'github:NixOS/nix/5fe4fe823c193cbb7bfa05a468de91eeab09058d' (2022-04-05)
• Updated input 'nix/nixpkgs':
    'github:NixOS/nixpkgs/82891b5e2c2359d7e58d08849e4c89511ab94234' (2021-09-28)
  → 'github:NixOS/nixpkgs/530a53dcbc9437363471167a5e4762c5fcfa34a1' (2022-02-19)
2022-04-05 17:31:30 +02:00
Cole Helbling 8c5636fe18
hydra-queue-runner: use port 9198 by default
Co-authored-by: Graham Christensen <graham@grahamc.com>
2022-04-02 17:32:14 -07:00
Eelco Dolstra bcaad1c934 openConnection(): Don't throw exceptions in forked child
On hydra.nixos.org the queue runner had child processes that were
stuck handling an exception:

  Thread 1 (Thread 0x7f501f7fe640 (LWP 1413473) "bld~v54h5zkhmb3"):
  #0  futex_wait (private=0, expected=2, futex_word=0x7f50c27969b0 <_rtld_local+2480>) at ../sysdeps/nptl/futex-internal.h:146
  #1  __lll_lock_wait (futex=0x7f50c27969b0 <_rtld_local+2480>, private=0) at lowlevellock.c:52
  #2  0x00007f50c21eaee4 in __GI___pthread_mutex_lock (mutex=0x7f50c27969b0 <_rtld_local+2480>) at ../nptl/pthread_mutex_lock.c:115
  #3  0x00007f50c1854bef in __GI___dl_iterate_phdr (callback=0x7f50c190c020 <_Unwind_IteratePhdrCallback>, data=0x7f501f7fb040) at dl-iteratephdr.c:40
  #4  0x00007f50c190d2d1 in _Unwind_Find_FDE () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1
  #5  0x00007f50c19099b3 in uw_frame_state_for () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1
  #6  0x00007f50c190ab90 in uw_init_context_1 () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1
  #7  0x00007f50c190b08e in _Unwind_RaiseException () from /nix/store/65hafbsx91127farbmyyv4r5ifgjdg43-glibc-2.33-117/lib/libgcc_s.so.1
  #8  0x00007f50c1b02ab7 in __cxa_throw () from /nix/store/dd8swlwhpdhn6bv219562vyxhi8278hs-gcc-10.3.0-lib/lib/libstdc++.so.6
  #9  0x00007f50c1d01abe in nix::parseURL (url="root@cb893012.packethost.net") at src/libutil/url.cc:53
  #10 0x0000000000484f55 in extraStoreArgs (machine="root@cb893012.packethost.net") at build-remote.cc:35
  #11 operator() (__closure=0x7f4fe9fe0420) at build-remote.cc:79
  ...

Maybe the fork happened while another thread was holding some global
stack unwinding lock
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71744). Anyway, since
the hanging child inherits all file descriptors to SSH clients,
shutting down remote builds (via 'child.to = -1' in
State::buildRemote()) doesn't work and 'child.pid.wait()' hangs
forever.

So let's not do any significant work between fork and exec.
2022-03-30 22:39:48 +02:00
ajs124 089da272c7 fix build against nix 2.7.0
fix build after such commits as df552ff53e68dff8ca360adbdbea214ece1d08ee
and e862833ec662c1bffbe31b9a229147de391e801a
2022-03-29 15:38:24 -04:00
ajs124 c64c5f0a7e hydra-queue-runner: rename build-result.hh to hydra-build-result.hh 2022-03-29 15:34:29 -04:00
Graham Christensen 3b048ed136 Revert "Revert "Use copyClosure instead of computeFSClosure + copyPaths""
This reverts commit 8e3ada2afc.
2022-03-29 15:28:47 -04:00
Cole Helbling 4789eba92c hydra-queue-runer: split metrics functionality into its own function 2022-03-29 10:55:28 -07:00
Cole Helbling 928b3b8268 hydra-queue-runner: fix priority of flag over config file 2022-03-29 10:42:07 -07:00
Cole Helbling 5ddb9a98ca fixup! hydra-queue-runner: log message before and after exporter is started 2022-03-29 08:47:41 -07:00
Cole Helbling 905a7a7beb hydra-queue-runner: read metrics port from queue_runner_metrics_port config 2022-03-29 08:46:43 -07:00
Cole Helbling 9cdc5aceed hydra-queue-runner: log message before and after exporter is started
This way, if something goes wrong between the two, it's easier to narrow
down where the issue lies.
2022-03-29 08:41:19 -07:00
Cole Helbling 8e3ada2afc Revert "Use copyClosure instead of computeFSClosure + copyPaths"
This reverts commit f14c583ce5.
2022-03-28 09:54:02 -07:00
Eelco Dolstra 962bf36939
Merge pull request #1162 from obsidiansystems/less-ref
Make `copyClosureTo` take a regular C++ ref to the store
2022-03-23 16:25:59 +01:00
Eelco Dolstra 3390415905
Merge pull request #1125 from obsidiansystems/simplify--copyClosure
Use `copyClosure` instead of `computeFSClosure` + `copyPaths`
2022-03-23 12:49:22 +01:00
Cole Helbling 8503a7917b fixup! hydra-queue-runner: make registry member of State, configurable metrics port 2022-03-22 13:38:13 -07:00
Graham Christensen e5393c2cf8 fixup: make id non-ambiguous 2022-03-19 23:56:47 -04:00
Graham Christensen 137be3452e Reduce the jobset cols on the remaining two queries 2022-03-19 23:56:47 -04:00
Graham Christensen f353a7ac41 update-gc-roots: try subselecting the jobset table 2022-03-19 23:56:47 -04:00
Graham Christensen 145667cb53 hydra-update-gc-roots: allow cached refs to the build's jobset
Re-executing this search_related on every access turned out to
create very problematic performance. If a jobset had a lot of
error output stored in the jobset, and there were many hundreds
or thousands of active jobs, this could easily cause >1Gbps of
network traffic.
2022-03-19 23:56:47 -04:00
Graham Christensen a582e4c485 HydraTestContext: add \n's to various dies 2022-03-19 14:46:53 -04:00
Graham Christensen 074a2f96bf hydra-eval-jobset: emit a useful error if constituents errored 2022-03-19 14:37:12 -04:00
Cole Helbling c0f826b92d hydra-queue-runner: get the listening port from the exposer itself
Otherwise, when the port is randomly chosen (e.g. by specifying no port,
or a port of 0), it will just show that the port is 0 and not the port
that is actually serving the metrics.
2022-03-14 08:41:45 -07:00
Cole Helbling 52a29d43e6 hydra-queue-runner: make registry member of State, configurable metrics port
Thanks to the updated prometheus-cpp library, specifying a port of 0
will cause it to pick a random (available) port -- ideal for tests.
2022-03-11 11:58:10 -08:00
Cole Helbling 3bf31bd6a6 hydra-queue-runner: add simple "up" exporter
There are probably better ways to achieve this (and will likely need to
be refactored a bit to support further metrics).
2022-03-10 12:36:58 -08:00
Graham Christensen 9316544abf
src/hydra-eval-jobs/hydra-eval-jobs.cc: .get<std::string> for drvPath
Co-authored-by: Kayla Fire <firestack@users.noreply.github.com>
2022-02-21 12:41:21 -05:00
Graham Christensen 290e0653ad hydra-eval-jobs: GC root aggregate jobs 2022-02-20 12:28:40 -05:00
John Ericson 445bba337b Make copyClosureTo take a regular C++ ref to the store
This is syntactically lighter wait, and demonstates there are no weird
dynamic lifetimes involved, just regular passing reference to callee
which it only borrows for the duration of the call.
2022-02-20 17:22:43 +00:00
John Ericson f14c583ce5 Use copyClosure instead of computeFSClosure + copyPaths
It is more terse, and in the future it is possible `copyClosure` will
become more sophisticated.
2022-02-19 11:59:17 -05:00
Graham Christensen 4c41ca08e1
Merge pull request #1155 from helsinki-systems/fix/graph-readability
build-graphs: Fix readability in dark mode
2022-02-14 11:27:37 -05:00
ajs124 1c84676527
Fit more content on screen 2022-02-13 18:33:37 +01:00
Janne Heß 6d146deaf0
build-graphs: Fix readability in dark mode 2022-02-13 14:00:17 +01:00
Graham Christensen 27ddde1e9e dynamic runcommand: print a notice on the build page if it is disabled 2022-02-11 15:04:54 -05:00
Cole Helbling a22a8fa62d AddBuilds: reject declarative jobsets with dynamic runcommand enabled if disabled elsewhere 2022-02-11 14:35:52 -05:00
Cole Helbling 928ba9e854 Controller/{Jobset,Project}: error when enabling dynamic runcommand but it's disabled elsewhere 2022-02-11 14:35:52 -05:00
Cole Helbling d680c209fe edit-project.tt: disable when disabled by server
Also add a tooltip describing why it's disabled, to make it easier to
chase down.
2022-02-11 14:35:52 -05:00
Cole Helbling 6053e5fd4b edit-jobset.tt: disable when disabled by project and server
Also add a tooltip describing why it's disabled, to make it easier to
chase down.
2022-02-11 14:35:52 -05:00
Cole Helbling dfd3a67424 project.tt: more info on why Dynamic RunCommand is disabled 2022-02-11 14:35:52 -05:00
Cole Helbling 3f4f183792 jobset.tt: more info on why Dynamic RunCommand is disabled 2022-02-11 14:35:52 -05:00
Graham Christensen 71c06f2ce7 LDAP normalization errors: note that the error came while normalizing the roles. 2022-02-11 10:55:27 -05:00
Graham Christensen f07fb7d279 LDAP support: include BC support for the YAML based loading
Includes a refactoring of the configuration loader.
2022-02-11 10:49:38 -05:00
Janne Heß 61d74a7194 Redo LDAP config in the main configuration and add role mappings 2022-02-11 10:49:38 -05:00
Graham Christensen d0bc0d0eda
Merge pull request #1152 from DeterminateSystems/parallel-tests
Parallel tests, fix a hydra-queue-runner race condition
2022-02-10 12:11:20 -05:00
Graham Christensen 4acaf9c8b0 hydra-queue-runner: don't dispatch until the machines parser has completed one run
Periodically, I have seen tests fail because of out of order queue runner behavior:

    checking the queue for builds > 0...
    loading build 1 (tests:basic:empty_dir)
    aborting unsupported build step '...-empty-dir.drv' (type 'x86_64-linux')
    marking build 1 as failed
    adding new machine ‘localhost’

This patch should prevent the dispatcher from running before any machines are
made available.
2022-02-10 10:54:30 -05:00
Graham Christensen 9ae7c8bddc Hydra::Helper::Exec add an expectOkay which dies with stdout / stderr on exit 2022-02-09 20:56:10 -05:00
Graham Christensen 845e6d4760 captureStdoutStderr*: move to Hydra::Helper::Exec which helps avoid some environment variable fixation problems 2022-02-09 14:28:50 -05:00
Graham Christensen 517dce285a eval_added event: change interface to traceID\tjobsetID\tevaluationID
I was not going to break the interface until I noticed
the current implementation uses the string literal \t.
2022-02-08 09:51:35 -05:00
Graham Christensen d512e6220f eval_failed event: change interface to traceID\tjobsetID
I was not going to break the interface until I noticed the other eval_* events used literal \ts
2022-02-08 09:51:35 -05:00
Graham Christensen 2597fa8c11 eval_cached event: change interface to traceID\tjobsetID\tevaluationID
I was not going to break the interface until I noticed
the current implementation uses the string literal \t.
2022-02-08 09:51:35 -05:00
Graham Christensen c30f084f32 eval_started event: change interface to traceID\tjobsetID
I was not going to break the interface until I noticed
the current implementation uses the string literal \t.
2022-02-08 09:51:35 -05:00
Graham Christensen 8a18326f2b Sort notification classes / events 2022-02-07 16:08:27 -05:00
Graham Christensen d8b56f022d RunCommand: print a warning if the hook isn't run because the project / jobset doens't have it enabled 2022-02-01 10:58:54 -05:00
Graham Christensen 3aa2393091 Jobsets: add a supportsDynamicRunCommand which also checks the project's dynamic runcommand support 2022-02-01 10:58:54 -05:00
Graham Christensen daa6864a58 Project result: add a supportsDynamicRunCommand helper 2022-02-01 10:58:54 -05:00
Graham Christensen bc1630bd27 fixup! RunCommand: Add a WIP execution of dynamic commands 2022-02-01 10:58:54 -05:00
Graham Christensen 8a96f07f58 Project: enable enabling dynamic runcommand per project 2022-02-01 10:58:54 -05:00
Graham Christensen 1affb1cfb1 jobset API: expose and check the enable_dynamic_run_command 2022-02-01 10:58:54 -05:00
Graham Christensen 726ea80e99 HTTP/Jobset: support setting / reading enable_dynamic_run_command 2022-02-01 10:58:54 -05:00
Graham Christensen 1802bd0113 Declarative Jobs: add support for the enable_dynamic_run_command flag 2022-02-01 10:58:54 -05:00
Graham Christensen 0810f5debc finish making the dynamic hooks only run on project & jobset agreement 2022-02-01 10:58:54 -05:00
Graham Christensen aef11685a0 regenerate schema files after adding the flag to the projects 2022-02-01 10:58:54 -05:00
Graham Christensen 85a53694c8 sql: add enable_dynamic_run_command to the Project as well 2022-02-01 10:58:54 -05:00
Graham Christensen a9bfabd672 sql: add a migration for enable_dynamic_run_command 2022-02-01 10:58:23 -05:00
Graham Christensen 3cce0c5ef6 Only run dynamic runcommand hooks if the jobset enables them 2022-02-01 10:57:30 -05:00
Graham Christensen 97a1d2d1d4 Jobsets: add enable_dynamic_run_command 2022-02-01 10:57:30 -05:00
Graham Christensen 216d8bee35 DynamicRunCommand: don't run if the build failed 2022-02-01 10:57:30 -05:00
Graham Christensen 1a30a0c2f1 Dynamic RunCommand: validate that the job's out exists, is a file (or points to a file) which is executable. 2022-02-01 10:57:30 -05:00
Graham Christensen e7f68045f4 DynamicRunCommand: pull out the function determining if a build is
eligible for execution under dynamic run commands.
2022-02-01 10:57:30 -05:00
Graham Christensen e56c49333f RunCommand: Add a WIP execution of dynamic commands
This in-progress feature will run a dynamically generated set of
buildFinished hooks, which must be nested under the `runCommandHook.*`
attribute set. This implementation is not very good, with some to-dos:

1. Only run if the build succeeded
2. Verify the output is named $out and that it is an executable file
   (or a symlink to a file)
3. Require the jobset itself have a flag enabling the feature, since
   this feature can be a bit dangerous if various people of different
   trust levels can create the jobs.
2022-02-01 10:57:30 -05:00
Graham Christensen ea311a0eb4 RunCommand: enable the plugin if dynamicruncommand is set 2022-02-01 10:57:30 -05:00
Graham Christensen 85b842e0ac
Merge pull request #1137 from DeterminateSystems/runcommand-logs
Store and display the output of RunCommands
2022-01-31 16:26:31 -05:00
Cole Helbling b57345ba1f hydra.sql: add IndexRunCommandLogsOnBuildID index 2022-01-31 12:56:34 -08:00
Cole Helbling d0b6329aa8 sql/upgrade-81: remove unnecessary comment 2022-01-31 12:55:36 -08:00
Cole Helbling 8c67e32480 RunCommand: ensure we reset the umask 2022-01-31 12:55:36 -08:00
Cole Helbling 34e4c119f4 build.tt: don't duplicate RunCommandLog buttons 2022-01-31 11:40:16 -08:00
Cole Helbling 61189ecca9 Helper/Nix: constructRunCommandLogPath: verify uuid is valid
This shouldn't be possible normally, but it is possible to:

    $db->resultset('RunCommandLogs')->new({ uuid => "../etc/passwd" });

if you have access to the `$db`.
2022-01-31 08:58:33 -08:00
Cole Helbling e381751564 Helper/Nix: constructRunCommandLogPath: return undef in case of an error
This allows us to give a web request to an invalid UUID a 404.
2022-01-31 08:58:33 -08:00
Cole Helbling 8eab7b8543 Helper/Nix: constructRunCommandLogPath: take RunCommandLog as input
This way we ensure that it actually exists in the database, rather than
blindly trusting user-generated input.
2022-01-31 08:58:33 -08:00
Cole Helbling 61914d56c6 runcommand-log.tt: escape the command 2022-01-31 08:58:33 -08:00
Cole Helbling 71bbb042db build.tt: link to the pretty, raw, and tail versions of the log
Also split it out to a new div -- there are now 3 lines per
RunCommandLog -- the first saying when it started, the second saying how
long it ran for (or has been running), and the third with the buttons
for the pretty, raw, and tail versions of the log.
2022-01-31 08:58:33 -08:00
Cole Helbling 3594ba942a Controller/Build: use showLog in view_runcommandlog
This also adds the `runcommandlog` object to the stash so that we can
access its uuid as well as command run in order to display more useful
and specific information on the webpage.
2022-01-31 08:58:33 -08:00
Cole Helbling 1d0076408b Controller/Build: pass log_uri to showLog in place of drvPath
This way, we can reuse the `showLog` sub for other things, such as
`view_runcommandlog` (which doesn't have a drvPath attached).
2022-01-31 08:58:33 -08:00
Cole Helbling ff390e89a6 Controller/Build: remove unused parameter from showLog 2022-01-31 08:58:33 -08:00
Cole Helbling fc3cf4ecb2 RunCommandLogs: identify and access via uuid
Using a sha1 of the command combined with the build ID is not a
particularly good or unique identifier:

* A build could fail, be restarted, and then succeed -- assuming no
configuration changes, the sha1 hash of the command as well as the build
ID will be the same. This would lead to an overwritten log file.

* Allowing user input to influence filenames is not the best of ideas.
2022-01-31 08:58:33 -08:00
Graham Christensen dcb0c1425c RunCommandLogs: set a UUID automatically 2022-01-31 08:58:33 -08:00
Graham Christensen cf49a05ff5 RunCommandLogs: add a uuid to each log entry 2022-01-31 08:58:33 -08:00
Graham Christensen 94ed9ed7ff
Merge pull request #1136 from DeterminateSystems/github-status-cached-evals
GithubStatus: try pushing statuses for cached buildqueued/buildfinished events
2022-01-31 09:11:37 -05:00
Cole Helbling 244300c1ad RunCommand: remove unused and problematic imports
Since breaking the filename construction out to a helper function,
Hydra::Model::DB is no longer used. Importing Hydra::Helper::Nix,
however, has the potential to break tests, so just use the functions we
need without importing the entire module.
2022-01-28 13:07:11 -08:00
Cole Helbling fdf6f4d3da RunCommand: use IPC::Run3::run3 instead
run3 just seems to do better handling for what we want to do, and
requires less deep-reaching changes to this plugin to get it to play
nice, as IPC::Run::run would.
2022-01-28 13:07:11 -08:00
Cole Helbling 3432cd7636 build.tt: split runcommand logic across multiple lines
Helps with readability.
2022-01-28 13:07:11 -08:00
Cole Helbling 1554750acc RunCommand: use make_path over mkdir
This will make all necessary parent directories a la `mkdir -p`.
2022-01-28 13:03:15 -08:00
Cole Helbling bf3c46ed43 RunCommand: use IPC::Run to spawn the command
This allows `logPath`s with spaces and other characters that might
otherwise cause problems inside a `system()` call.
2022-01-28 13:03:15 -08:00
Cole Helbling bb16f4fb10 RunCommand: set umask when creating log paths
This uses the somewhat restrictive umask of 0027 so that people outside
the user or group cannot read the files. This also helps to inhibit
TOCTOU where someone else has a handle to our file before we chmod it
and after we close it.
2022-01-28 13:03:15 -08:00
Cole Helbling 5d3912962b RunCommand: use helper functions to ensure filenames and paths are the same
Otherwise, it's possible someone updates the format in one place but not
the others, leading to broken or incorrect functionality.
2022-01-28 13:03:15 -08:00
Cole Helbling 14090fbb86 runcommand-log.tt: init 2022-01-28 13:03:15 -08:00
Janne Heß 796ce165d4 RunCommand: Allow displaying command output 2022-01-28 13:03:15 -08:00
Janne Heß 4cb5e6cd94 RunCommand: Capture the output of the commands 2022-01-28 13:00:17 -08:00
Graham Christensen ef362e92d1 GithubStatus: try pushing statuses for cached buildqueued/buildfinished events 2022-01-25 12:42:28 -05:00
Graham Christensen f6e86efc9f
Merge pull request #1091 from Ma27/ssh-remote-store-location
hydra-queue-runner: support store URIs declaring an alternate store location
2022-01-24 14:10:54 -05:00
Graham Christensen 3a4ea6e563
Merge pull request #1124 from obsidiansystems/simplify--closure-of-path-set
simplify, `computeFSClosure` can take a set now
2022-01-24 14:09:35 -05:00