Commit graph

1726 commits

Author SHA1 Message Date
Eelco Dolstra 06d75699a7 Fix restarting a build 2015-07-10 16:56:53 +02:00
Eelco Dolstra 7f865a30d5 hydra-evaluator: Fix input change check
Because inputs were processed in random order by inputsToArgs, the
inputs hash could be different every time, leading to unnecessary
re-evaluations.
2015-07-10 16:44:06 +02:00
Eelco Dolstra 3e7bbec40b hydra-evaluator: Send statistics to statsd 2015-07-10 16:40:50 +02:00
Eelco Dolstra 5919e911db Don't show how long a machine has been idle
Without an index on (machine, stoptime desc), this requires a
sequential scan. And adding a whole index for this seems
overkill. (Possibly the queue runner could maintain this info more
efficiently.)
2015-07-10 15:41:57 +02:00
Eelco Dolstra 3bb9e17e5c IndexJobsetEvalsOnJobsetId: Only index jobsets with new builds 2015-07-10 15:41:57 +02:00
Eelco Dolstra b09f7e0989 Add page showing latest build steps 2015-07-10 15:41:57 +02:00
Eelco Dolstra 0da08df4eb Stream logs if possible and remove size limit 2015-07-08 19:05:17 +02:00
Eelco Dolstra d8acaf2181 Index BuildSteps on propagatedFrom
This significantly speeds up deleting Builds, since it removes the
need for a sequential scan on BuildSteps.
2015-07-08 12:12:44 +02:00
Eelco Dolstra bbee81efae Use triggers for all notifications on Builds table changes 2015-07-08 12:05:32 +02:00
Eelco Dolstra 89fb723ace Notify the queue runner when a build is deleted 2015-07-08 11:43:35 +02:00
Eelco Dolstra 95c4294560 Allow cancelling builds marked as busy
Note that if there are active build *steps*, this won't cancel them.
2015-07-07 14:08:46 +02:00
Eelco Dolstra 35b7c4f82b Allow only 1 thread to send a closure to a given machine at the same time
This prevents a race where multiple threads see that machine X is
missing path P, and start sending it concurrently. Nix handles this
correctly, but it's still wasteful (especially for the case where P ==
GHC).

A more refined scheme would be to have per machine, per path locks.
2015-07-07 14:06:48 +02:00
Eelco Dolstra 16696a4aee Namespace cleanup 2015-07-07 10:29:43 +02:00
Eelco Dolstra 63745b8e25 Move buildRemote() into State 2015-07-07 10:25:33 +02:00
Eelco Dolstra df29527531 Refactor 2015-07-07 10:17:21 +02:00
Eelco Dolstra dd4f6e695e Merge branch 'master' into build-ng 2015-07-06 17:17:51 +02:00
Eelco Dolstra ccf6e6062c Store full Mercurial revision hashes 2015-07-06 17:17:17 +02:00
Eelco Dolstra 309ef5baa9 Merge branch 'master' into build-ng 2015-07-06 15:57:09 +02:00
Eelco Dolstra b85e9ef1cd Support using Git revisions as branch names 2015-07-06 15:56:24 +02:00
Eelco Dolstra b03de925cb Allow a jobset to be created from an evaluation
Fixes #150.
2015-07-06 15:56:20 +02:00
Eelco Dolstra dffb629b8a Unify Hydra's NixOS module with the one used for hydra.nixos.org
In particular, the queue runner and web server now run under different
UIDs.
2015-07-02 01:01:44 +02:00
Eelco Dolstra 3e0f5f664a GitInput plugin: Don't clone during getCommits
This doesn't work if hydra-queue-runner has no write access to the scm
directory, and in any case races with the evaluator.
2015-07-02 00:44:40 +02:00
Eelco Dolstra ae52fc7f61 Remove display of queue runner log file (it no longer exists) 2015-07-02 00:18:33 +02:00
Eelco Dolstra e35b704d80 Drop the 5 minute minimum interval between triggered evals 2015-07-01 14:45:39 +02:00
Eelco Dolstra 85a1ce99c9 Only include Persona JS when Persona is enabled 2015-07-01 14:24:18 +02:00
Eelco Dolstra 3c665dac82 Remove superfluous HYDRA_LOGO environment variable 2015-07-01 11:34:19 +02:00
Eelco Dolstra 7e6135a8c6 Don't repeat links to build step logs
Hydra only stores the last log for a particular derivation, so only
show log links for the last one.
2015-06-30 00:27:31 +02:00
Eelco Dolstra 2ece42b2b9 Support preferLocalBuild
Derivations with "preferLocalBuild = true" can now be executed on
specific machines (typically localhost) by setting the mandary system
features field to include "local". For example:

  localhost x86_64-linux,i686-linux - 10 100 - local

says that "localhost" can *only* do builds with "preferLocalBuild =
true". The speed factor of 100 will make the machine almost always win
over other machines.
2015-06-30 00:20:19 +02:00
Eelco Dolstra 008d610467 getQueuedBuilds(): Don't catch errors while loading a build from the queue
Otherwise we never recover from reset daemon connections, e.g.

  hydra-queue-runner[16106]: while loading build 599369: cannot start daemon worker: reading from file: Connection reset by peer
  hydra-queue-runner[16106]: while loading build 599236: writing to file: Broken pipe
  ...

The error is now handled queueMonitor(), causing the next call to
queueMonitorLoop() to create a new connection.
2015-06-26 21:06:35 +02:00
Eelco Dolstra f5e5a1b96e Don't wake up the queue runner for cached evals 2015-06-26 20:59:14 +02:00
Eelco Dolstra 401f5bdce2 Add a unit for hydra-send-stats 2015-06-26 15:24:12 +02:00
Eelco Dolstra 9a041f9a36 Restart builds failed due to unsupported system type 2015-06-26 11:28:38 +02:00
Eelco Dolstra 2f4676bd97 JSONObject doesn't handle 64-bit integers 2015-06-25 16:59:48 +02:00
Eelco Dolstra c54a04688e Fix email sender address when notification_sender is not set 2015-06-25 16:49:01 +02:00
Eelco Dolstra c6fcce3b3b Moar stats 2015-06-25 16:47:39 +02:00
Eelco Dolstra 18a3c3ff1c Update "make check" for the new queue runner
Also, if the machines file contains an entry for localhost, then run
"nix-store --serve" directly, without going through SSH.
2015-06-25 16:47:39 +02:00
Eelco Dolstra 32210905d8 Automatically reload $NIX_REMOTE_SYSTEMS when it changes
Otherwise, you'd have to restart the queue runner to add or remove
machines.
2015-06-25 16:47:25 +02:00
Eelco Dolstra 1a0e1eb5a0 More stats 2015-06-24 13:19:27 +02:00
Eelco Dolstra 3f8891b6ff Fix incorrect debug message 2015-06-23 17:53:15 +02:00
Eelco Dolstra 62219adaf3 Send queue runner stats to statsd
This is currently done by a separate program that periodically
calls "hydra-queue-runner --status". Eventually, I'll do this
in the queue runner directly.

Fixes #220.
2015-06-23 14:56:43 +02:00
Eelco Dolstra af5cbe97aa createStep(): Cache finished derivations
This gets rid of a lot of redundant calls to readDerivation().
2015-06-23 03:25:31 +02:00
Eelco Dolstra 681f63a382 Typo 2015-06-23 02:15:11 +02:00
Eelco Dolstra 524ee295e0 Fix sending notifications in the successful case 2015-06-23 02:13:06 +02:00
Eelco Dolstra 4db7c51b5c Rate-limit the number of threads copying closures at the same time
Having a hundred threads doing I/O at the same time is bad on magnetic
disks because of the excessive disk seeks. So allow only 4 threads to
copy closures in parallel.
2015-06-23 01:49:14 +02:00
Eelco Dolstra a317d24b29 hydra-queue-runner: Send build notifications
Since our notification plugins are written in Perl, sending
notification from C++ requires a small Perl helper named
‘hydra-notify’.
2015-06-23 00:14:49 +02:00
Eelco Dolstra 5312e1209b Keep per-machine stats 2015-06-22 17:11:17 +02:00
Eelco Dolstra d06366e7cf Remove obsolete comment 2015-06-22 16:59:50 +02:00
Eelco Dolstra e069ee960e Doh 2015-06-22 16:58:40 +02:00
Eelco Dolstra e32ee3d5b9 Remove hydra-build and the old hydra-queue-runner 2015-06-22 15:43:15 +02:00
Eelco Dolstra 41ba7418e2 hydra-queue-runner: More stats 2015-06-22 15:34:33 +02:00
Eelco Dolstra 62b53a0a47 Guard against concurrent invocations of hydra-queue-runner 2015-06-22 14:24:03 +02:00
Eelco Dolstra fbd7c02217 Periodically dump/log status 2015-06-22 14:15:43 +02:00
Eelco Dolstra 4f4141e1db Add command ‘hydra-queue-runner --status’ to show current status 2015-06-22 14:06:44 +02:00
Eelco Dolstra 44a2b74f5a Keep track of the number of build steps that are being built
(As opposed to being in the closure copying stage.)
2015-06-22 11:23:00 +02:00
Eelco Dolstra fed71d3fe9 Move "created" field into Step::State 2015-06-22 11:07:52 +02:00
Eelco Dolstra 90a08db241 hydra-queue-runner: Fix assertion failure 2015-06-22 10:59:07 +02:00
Eelco Dolstra d744362e4a hydra-queue-runner: Fix segfault sorting machines by load
While sorting machines by load, the load of a machine
(machine->currentJobs) can be changed by other threads. If that
happens, the comparator is no longer a proper ordering, in which case
std::sort() can segfault. So we now make a copy of currentJobs before
sorting.
2015-06-21 16:21:42 +02:00
Eelco Dolstra a0eff6fc15 Fix machine selection 2015-06-19 17:45:26 +02:00
Eelco Dolstra 81abb6e166 Improve parsing of hydra-build-products 2015-06-19 17:20:20 +02:00
Eelco Dolstra e13477bdf2 Robustness 2015-06-19 16:35:49 +02:00
Eelco Dolstra f196967c43 Don't create a propagated build step to the same build 2015-06-19 15:33:37 +02:00
Eelco Dolstra 7afc61691b Doh 2015-06-19 15:27:49 +02:00
Eelco Dolstra 133d298e26 Asynchronously compress build logs 2015-06-19 15:06:12 +02:00
Eelco Dolstra 8e408048e2 Create build step for non-top-level cached failures
This fixes the missing build step on failures like

  http://hydra.nixos.org/build/23222231
2015-06-19 11:33:15 +02:00
Eelco Dolstra 77c8bfd392 Improve logging for aborts 2015-06-19 10:37:22 +02:00
Eelco Dolstra 8db1ae2855 Less verbosity 2015-06-18 17:43:13 +02:00
Eelco Dolstra 89b629eeb1 Fix finishing steps that are not top-level of any build 2015-06-18 17:37:35 +02:00
Eelco Dolstra 9cdbff2fdf Handle concurrent finishing of the same build
There is a slight possibility that the queue monitor and a builder
thread simultaneously decide to mark a build as finished. That's fine,
as long as we ensure the DB update is idempotent (as ensured by doing
"update Builds set finished = 1 ... where finished = 0").
2015-06-18 17:12:51 +02:00
Eelco Dolstra 948473c909 Fix race between the queue monitor and the builder threads 2015-06-18 16:30:28 +02:00
Eelco Dolstra 9c03b11ca8 Simplify retry handling 2015-06-18 14:51:50 +02:00
Eelco Dolstra e039f5f840 Create failed build steps for cached failures 2015-06-18 04:35:37 +02:00
Eelco Dolstra 92ea800cfb Set finishedInDB in a few more places 2015-06-18 04:19:21 +02:00
Eelco Dolstra 47367451c7 hydra-queue-runner: Set isCachedBuild 2015-06-18 03:28:58 +02:00
Eelco Dolstra 8257812d0a Acquire exclusive table lock earlier 2015-06-18 02:44:29 +02:00
Eelco Dolstra 69be3cfe93 hydra-queue-runner: Handle status queries on the main thread
Doing it on the queue monitor thread was problematic because
processing the queue can take a while.
2015-06-18 01:57:01 +02:00
Eelco Dolstra a40ca6b76e hydra-queue-runner: Improve dispatcher
We now take the machine speed factor into account, just like
build-remote.pl.
2015-06-18 01:52:20 +02:00
Eelco Dolstra 3855131185 hydra-queue-runner: Improve SSH flags 2015-06-18 00:50:48 +02:00
Eelco Dolstra f57d0b0c54 hydra-queue-runner: Maintain count of active build steps 2015-06-18 00:24:56 +02:00
Eelco Dolstra 59dae60558 hydra-queue-runner: More stats 2015-06-17 22:38:12 +02:00
Eelco Dolstra ec8e8edc86 hydra-queue-runner: Handle $HYDRA_DBI 2015-06-17 22:11:01 +02:00
Eelco Dolstra 4d9c74335d Add forgotten file 2015-06-17 21:39:28 +02:00
Eelco Dolstra ce9e859a9c hydra-queue-runner: Implement --unlock 2015-06-17 21:35:20 +02:00
Eelco Dolstra ca48818b30 Fix remote building 2015-06-17 17:28:59 +02:00
Eelco Dolstra 11be780948 Handle failure with output 2015-06-17 17:11:42 +02:00
Eelco Dolstra b1a75c7f63 getQueuedBuilds(): Handle dependent builds first
If a build A depends on a derivation that is the top-level derivation
of some build B, then we should process B before A (meaning we
shouldn't make the derivation runnable before B has been
added). Otherwise, the derivation will be "accounted" to A rather than
B (so the build step will show up in the wrong build).
2015-06-17 14:46:02 +02:00
Eelco Dolstra c6d504edbb Handle SSH hosts without a @ 2015-06-17 13:49:18 +02:00
Eelco Dolstra 745efce828 hydra-queue-runner: Implement timeouts
Also, keep track of timeouts in the database as a distinct build
status.
2015-06-17 13:32:33 +02:00
Eelco Dolstra 2da4987bc2 Don't lock the CPU 2015-06-17 11:48:38 +02:00
Eelco Dolstra b91a616520 Automatically retry aborted builds
Aborted builds are now put back on the runnable queue and retried
after a certain time interval (currently 60 seconds for the first
retry, then tripled on each subsequent retry).
2015-06-17 11:45:20 +02:00
Eelco Dolstra e02654b3a0 Prefer cached failure over unsupported system type 2015-06-16 18:00:39 +02:00
Eelco Dolstra a984c0badc Merge branch 'master' into build-ng 2015-06-15 18:21:07 +02:00
Eelco Dolstra 42e7301c08 Add status dump facility
Doing

  $ psql hydra -c 'notify dump_status'

will cause hydra-queue-runner to dump some internal status info on
stderr.
2015-06-15 18:20:14 +02:00
Eelco Dolstra dd104f14ea Make the queue monitor more robust, and better debug output 2015-06-15 16:54:52 +02:00
Eelco Dolstra 147eb4fd15 Support requiredSystemFeatures 2015-06-15 16:33:50 +02:00
Eelco Dolstra 508ab7f8a2 Tweak build steps 2015-06-15 15:48:05 +02:00
Eelco Dolstra 21aaa0596b Fail builds with previously failed steps early 2015-06-15 15:31:42 +02:00
Eelco Dolstra c00bf7cd1a Check non-runnable steps for unsupported system type 2015-06-15 15:13:03 +02:00
Eelco Dolstra 5019fceb20 Add a error type for "unsupported system type" 2015-06-15 15:07:04 +02:00
Eelco Dolstra 541fbd62cc Immediately abort builds that require an unsupported system type 2015-06-15 14:51:49 +02:00
Eelco Dolstra f9cd5adae8 Queue monitor: Get only the fields we need 2015-06-11 18:09:50 +02:00