Commit graph

114 commits

Author SHA1 Message Date
Eelco Dolstra
62b53a0a47 Guard against concurrent invocations of hydra-queue-runner 2015-06-22 14:24:03 +02:00
Eelco Dolstra
fbd7c02217 Periodically dump/log status 2015-06-22 14:15:43 +02:00
Eelco Dolstra
4f4141e1db Add command ‘hydra-queue-runner --status’ to show current status 2015-06-22 14:06:44 +02:00
Eelco Dolstra
44a2b74f5a Keep track of the number of build steps that are being built
(As opposed to being in the closure copying stage.)
2015-06-22 11:23:00 +02:00
Eelco Dolstra
fed71d3fe9 Move "created" field into Step::State 2015-06-22 11:07:52 +02:00
Eelco Dolstra
90a08db241 hydra-queue-runner: Fix assertion failure 2015-06-22 10:59:07 +02:00
Eelco Dolstra
d744362e4a hydra-queue-runner: Fix segfault sorting machines by load
While sorting machines by load, the load of a machine
(machine->currentJobs) can be changed by other threads. If that
happens, the comparator is no longer a proper ordering, in which case
std::sort() can segfault. So we now make a copy of currentJobs before
sorting.
2015-06-21 16:21:42 +02:00
Eelco Dolstra
a0eff6fc15 Fix machine selection 2015-06-19 17:45:26 +02:00
Eelco Dolstra
81abb6e166 Improve parsing of hydra-build-products 2015-06-19 17:20:20 +02:00
Eelco Dolstra
e13477bdf2 Robustness 2015-06-19 16:35:49 +02:00
Eelco Dolstra
f196967c43 Don't create a propagated build step to the same build 2015-06-19 15:33:37 +02:00
Eelco Dolstra
7afc61691b Doh 2015-06-19 15:27:49 +02:00
Eelco Dolstra
133d298e26 Asynchronously compress build logs 2015-06-19 15:06:12 +02:00
Eelco Dolstra
8e408048e2 Create build step for non-top-level cached failures
This fixes the missing build step on failures like

  http://hydra.nixos.org/build/23222231
2015-06-19 11:33:15 +02:00
Eelco Dolstra
77c8bfd392 Improve logging for aborts 2015-06-19 10:37:22 +02:00
Eelco Dolstra
8db1ae2855 Less verbosity 2015-06-18 17:43:13 +02:00
Eelco Dolstra
89b629eeb1 Fix finishing steps that are not top-level of any build 2015-06-18 17:37:35 +02:00
Eelco Dolstra
9cdbff2fdf Handle concurrent finishing of the same build
There is a slight possibility that the queue monitor and a builder
thread simultaneously decide to mark a build as finished. That's fine,
as long as we ensure the DB update is idempotent (as ensured by doing
"update Builds set finished = 1 ... where finished = 0").
2015-06-18 17:12:51 +02:00
Eelco Dolstra
948473c909 Fix race between the queue monitor and the builder threads 2015-06-18 16:30:28 +02:00
Eelco Dolstra
9c03b11ca8 Simplify retry handling 2015-06-18 14:51:50 +02:00
Eelco Dolstra
e039f5f840 Create failed build steps for cached failures 2015-06-18 04:35:37 +02:00
Eelco Dolstra
92ea800cfb Set finishedInDB in a few more places 2015-06-18 04:19:21 +02:00
Eelco Dolstra
47367451c7 hydra-queue-runner: Set isCachedBuild 2015-06-18 03:28:58 +02:00
Eelco Dolstra
8257812d0a Acquire exclusive table lock earlier 2015-06-18 02:44:29 +02:00
Eelco Dolstra
69be3cfe93 hydra-queue-runner: Handle status queries on the main thread
Doing it on the queue monitor thread was problematic because
processing the queue can take a while.
2015-06-18 01:57:01 +02:00
Eelco Dolstra
a40ca6b76e hydra-queue-runner: Improve dispatcher
We now take the machine speed factor into account, just like
build-remote.pl.
2015-06-18 01:52:20 +02:00
Eelco Dolstra
3855131185 hydra-queue-runner: Improve SSH flags 2015-06-18 00:50:48 +02:00
Eelco Dolstra
f57d0b0c54 hydra-queue-runner: Maintain count of active build steps 2015-06-18 00:24:56 +02:00
Eelco Dolstra
59dae60558 hydra-queue-runner: More stats 2015-06-17 22:38:12 +02:00
Eelco Dolstra
ec8e8edc86 hydra-queue-runner: Handle $HYDRA_DBI 2015-06-17 22:11:01 +02:00
Eelco Dolstra
ce9e859a9c hydra-queue-runner: Implement --unlock 2015-06-17 21:35:20 +02:00
Eelco Dolstra
ca48818b30 Fix remote building 2015-06-17 17:28:59 +02:00
Eelco Dolstra
11be780948 Handle failure with output 2015-06-17 17:11:42 +02:00
Eelco Dolstra
b1a75c7f63 getQueuedBuilds(): Handle dependent builds first
If a build A depends on a derivation that is the top-level derivation
of some build B, then we should process B before A (meaning we
shouldn't make the derivation runnable before B has been
added). Otherwise, the derivation will be "accounted" to A rather than
B (so the build step will show up in the wrong build).
2015-06-17 14:46:02 +02:00
Eelco Dolstra
745efce828 hydra-queue-runner: Implement timeouts
Also, keep track of timeouts in the database as a distinct build
status.
2015-06-17 13:32:33 +02:00
Eelco Dolstra
2da4987bc2 Don't lock the CPU 2015-06-17 11:48:38 +02:00
Eelco Dolstra
b91a616520 Automatically retry aborted builds
Aborted builds are now put back on the runnable queue and retried
after a certain time interval (currently 60 seconds for the first
retry, then tripled on each subsequent retry).
2015-06-17 11:45:20 +02:00
Eelco Dolstra
e02654b3a0 Prefer cached failure over unsupported system type 2015-06-16 18:00:39 +02:00
Eelco Dolstra
42e7301c08 Add status dump facility
Doing

  $ psql hydra -c 'notify dump_status'

will cause hydra-queue-runner to dump some internal status info on
stderr.
2015-06-15 18:20:14 +02:00
Eelco Dolstra
dd104f14ea Make the queue monitor more robust, and better debug output 2015-06-15 16:54:52 +02:00
Eelco Dolstra
147eb4fd15 Support requiredSystemFeatures 2015-06-15 16:33:50 +02:00
Eelco Dolstra
508ab7f8a2 Tweak build steps 2015-06-15 15:48:05 +02:00
Eelco Dolstra
21aaa0596b Fail builds with previously failed steps early 2015-06-15 15:31:42 +02:00
Eelco Dolstra
c00bf7cd1a Check non-runnable steps for unsupported system type 2015-06-15 15:13:03 +02:00
Eelco Dolstra
5019fceb20 Add a error type for "unsupported system type" 2015-06-15 15:07:04 +02:00
Eelco Dolstra
541fbd62cc Immediately abort builds that require an unsupported system type 2015-06-15 14:51:49 +02:00
Eelco Dolstra
f9cd5adae8 Queue monitor: Get only the fields we need 2015-06-11 18:09:50 +02:00
Eelco Dolstra
c974fb893b Support cancelling builds 2015-06-11 18:07:45 +02:00
Eelco Dolstra
c08883966c Use PostgreSQL notifications for queue events
Hydra-queue-runner now no longer polls the queue periodically, but
instead sleeps until it receives a notification from PostgreSQL about
a change to the queue (build added, build cancelled or build
restarted).

Also, for the "build added" case, we now only check for builds with an
ID greater than the previous greatest ID. This is much more efficient
if the queue is large.
2015-06-11 17:41:59 +02:00
Eelco Dolstra
d72a88b562 Don't try to handle SIGINT
It just makes things unnecessarily complicated. We can just exit
without cleaning anything up, since the only thing to do is unmark
builds and build steps as busy. But we can do that by having systemd
call "hydra-queue-runner --unlock" from ExecStopPost.
2015-06-10 15:55:46 +02:00
Eelco Dolstra
a4fb93c119 Lock builds for a shorter amount of time 2015-06-10 15:36:21 +02:00
Eelco Dolstra
6d738a31bf Keep track of failed paths in the Hydra database
I.e. don't use Nix's failed paths feature anymore. Easier to keep
everything in one place.
2015-06-10 14:57:16 +02:00
Eelco Dolstra
c68036f8b0 Pass ssh key 2015-06-10 14:57:07 +02:00
Eelco Dolstra
7dd1f0097e Finish copyClosure 2015-06-09 16:03:41 +02:00
Eelco Dolstra
c93aa92563 Create BuildSteps race-free
If multiple threads create a step for the same build, they could get
the same "max(stepnr)" and allocate conflicting new step numbers. So
lock the BuildSteps table while doing this. We could use a different
isolation level, but this is easier.
2015-06-09 15:03:20 +02:00
Eelco Dolstra
61d4060522 Record the machine used for a build step 2015-06-09 14:57:49 +02:00
Eelco Dolstra
ca1fbdd058 Mark builds as busy 2015-06-09 14:31:43 +02:00
Eelco Dolstra
8b12ac1f6d Basic remote building
This removes the need for Nix's build-remote.pl.

Build logs are now written to $HYDRA_DATA/build-logs because
hydra-queue-runner doesn't have write permission to /nix/var/log.
2015-06-09 14:21:21 +02:00
Eelco Dolstra
3a6cb2f270 Implement a database connection pool 2015-05-29 20:55:13 +02:00
Eelco Dolstra
214b95706c On SIGINT, shut down the builder threads
Note that they don't get interrupted at the moment (so on SIGINT, any
running builds will need to finish first).
2015-05-29 20:02:15 +02:00
Eelco Dolstra
e778821940 Make concurrency more robust 2015-05-29 17:14:20 +02:00
Eelco Dolstra
8640e30787 Very basic multi-threaded queue runner 2015-05-29 01:31:12 +02:00
Eelco Dolstra
604fdb908f Pass null values to libpqxx properly 2015-05-28 19:06:17 +02:00
Eelco Dolstra
dc446c3980 Start of single-process hydra-queue-runner 2015-05-28 17:39:29 +02:00