Commit graph

2236 commits

Author SHA1 Message Date
Eelco Dolstra 077ed3f571 Periodically clear orphaned build steps
These are build steps that remain "busy" in the database even though
they have finished, because they couldn't be updated (e.g. due to a
PostgreSQL connection problem). To prevent them from showing up as
busy in the "Machine status" page, we now periodically purge them.
2016-04-13 16:30:52 +02:00
Eelco Dolstra f3f661bac1 Reuse build products / metrics stored in the database
Previously, if the queue monitor thread encountered a build that Hydra
had already built, it downloaded the output paths from the binary
cache, just to determine the build products and metrics. This is very
inefficient. In particular, when doing something like merging
nixpkgs:staging into nixpkgs:master, the queue monitor thread will be
locked up for a long time fetching files from S3, causing the build
farm to be mostly idle.

Of course this is entirely unnecessary, since the build
products/metrics are already in the Hydra database. So now we just
look up a previous build with the same output path, and copy the
products/metrics.
2016-04-13 16:30:52 +02:00
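
The lookup described in f3f661bac1 can be illustrated with a minimal Python sketch. It is not Hydra's actual code (the queue runner is written in C++ and queries PostgreSQL). The `Build` record, its field names, and `reuse_products` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Build:
    # Hypothetical record; field names are assumptions, not Hydra's schema.
    build_id: int
    out_path: str
    products: list = field(default_factory=list)
    metrics: dict = field(default_factory=dict)


def reuse_products(new_build, finished_builds):
    """Copy products/metrics from a finished build with the same output path.

    Returns True if a matching earlier build was found, False otherwise.
    """
    for old in finished_builds:
        if old.out_path == new_build.out_path:
            # Copy instead of re-deriving the data from the binary cache.
            new_build.products = list(old.products)
            new_build.metrics = dict(old.metrics)
            return True
    return False
```

When no earlier build matches, the caller would presumably fall back to the old behaviour of inspecting the output paths directly.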
Eelco Dolstra a7755678fe Drop unused BuildProducts.description column 2016-04-13 16:30:52 +02:00
Eelco Dolstra 8c7edb1005 Fix narrowing conversion 2016-04-13 16:30:52 +02:00
Eelco Dolstra 00c78440b1 Disambiguate "marking build as succeeded" message 2016-04-13 16:30:52 +02:00
Eelco Dolstra ad834343b5 Fix build against current Nix master 2016-04-13 16:30:52 +02:00
Eelco Dolstra 49f94bac3a Merge pull request #291 from elitak/typo
typo
2016-04-06 21:07:21 +02:00
Eric Litak 17035661f2 typo 2016-04-06 11:28:32 -07:00
Eelco Dolstra 68b636c1f2 Merge pull request #290 from svanderburg/master
Fix problem with delegating builds to localhost due to nix-store not being in the PATH
2016-04-06 16:21:07 +02:00
Sander van der Burg cbd2e3a50d Fix problem with delegating builds to localhost due to nix-store not being in the PATH 2016-04-06 14:16:04 +00:00
Eelco Dolstra 74dfcc84e9 Make NIX_REMOTE_SYSTEMS configurable 2016-03-25 15:41:38 +01:00
Eelco Dolstra ed88bbaac0 Set Vary to Accept
Otherwise, the browser may mix up HTML and JSON responses if it has
requested both. For example, hitting the back button to return to a
job metric page will show a JSON response, because that was the last
thing the browser fetched for that URL.

This requires Catalyst::Action::Rest >= 1.20.
2016-03-25 14:48:12 +01:00
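
To see why ed88bbaac0 matters, consider a toy model of how a cache picks its key. This is only a sketch of the general HTTP `Vary` mechanism, not browser or Catalyst code. `cache_key` and the example URL are hypothetical.

```python
def cache_key(url, request_headers, vary):
    """Build a cache key from the URL plus the request headers named in Vary.

    request_headers uses lower-case header names; vary is the value of the
    response's Vary header ("" if the header is absent).
    """
    varied = tuple(
        request_headers.get(name.strip().lower(), "")
        for name in vary.split(",")
        if name.strip()
    )
    return (url, varied)


url = "/job/example/metric/foo"  # hypothetical URL
html = {"accept": "text/html"}
json_ = {"accept": "application/json"}

# Without Vary, both representations share one cache entry.
same = cache_key(url, html, "") == cache_key(url, json_, "")
# With "Vary: Accept", they are stored separately.
different = cache_key(url, html, "Accept") != cache_key(url, json_, "Accept")
```

With an empty `Vary`, the HTML and JSON responses collide under one key, which matches the back-button symptom described above.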
Eelco Dolstra 32adc53070 Add tooltips to metrics showing the exact value of the data point 2016-03-25 14:32:36 +01:00
Eelco Dolstra 3e2911803d Add link to metrics 2016-03-25 13:57:17 +01:00
Eelco Dolstra dab16fb26b Lazy load the metrics tab 2016-03-25 13:49:06 +01:00
Eelco Dolstra 7a72f64e5e Move chart code to common.js 2016-03-25 13:33:10 +01:00
Eelco Dolstra dc2010eafc Fix rendering of metrics with dots in their name 2016-03-25 13:24:43 +01:00
Eelco Dolstra ef63dd77e3 Fix metric alignment 2016-03-25 12:08:18 +01:00
Eelco Dolstra 759bd38ef2 Sort metrics by name 2016-03-25 11:56:25 +01:00
Eelco Dolstra 32fa392146 Fix hydra-queue-runner PATH 2016-03-23 12:35:55 +01:00
Eelco Dolstra aa4c1fb1ab Fix version 2016-03-22 17:26:50 +01:00
Eelco Dolstra 6fc4dc4e27 /queue-summary: Show number of queued builds by system type 2016-03-22 17:03:26 +01:00
Eelco Dolstra aba2356932 Restore path in nix-shell 2016-03-22 16:59:05 +01:00
Eelco Dolstra ddc9f3cc6a Temporarily disable machines on any exception, not just connection failures 2016-03-22 16:54:40 +01:00
Eelco Dolstra 0aecd65e59 /queue-runner-status: Include info about temporarily disabled machines 2016-03-22 16:54:06 +01:00
Eelco Dolstra 1332463b02 Don't wrap C++ programs 2016-03-22 13:35:09 +01:00
Eelco Dolstra 4dfbe5c642 Don't pollute the source directory 2016-03-22 13:19:00 +01:00
Eelco Dolstra e624652dd8 Use patched aws-sdk-cpp 2016-03-22 13:11:30 +01:00
Eelco Dolstra a727643286 inNixShell considered harmful 2016-03-22 13:10:37 +01:00
Eelco Dolstra 74426e6820 Simplify running nix-shell
This also removes building a separate source tarball or building a PDF
manual since it's unlikely anybody cares.
2016-03-22 12:53:28 +01:00
Eelco Dolstra ac23bd1539 Revert "Apply IndexBuildsOnJobFinishedId to unfinished builds only"
This reverts commit 1de5ce7a0e.
2016-03-16 17:04:20 +01:00
Eelco Dolstra 7d8bf1b0f2 Shorten host names 2016-03-16 15:23:56 +01:00
Eelco Dolstra d5cffd4bc7 Make "Running builds" and "Machine status" pages faster 2016-03-16 15:19:18 +01:00
Eelco Dolstra 1de5ce7a0e Apply IndexBuildsOnJobFinishedId to unfinished builds only 2016-03-16 15:17:10 +01:00
Eelco Dolstra 520c8a5826 Use faster query to determine number of running builds
The previous query

  select count(*) from builds b left join buildsteps s on s.build = b.id where busy = 1 and finished = 0

is suddenly taking several minutes. Probably PostgreSQL decided to use
a suboptimal query plan.
2016-03-16 13:41:43 +01:00
Eelco Dolstra 405a43c171 Queue summary: Make rows clickable 2016-03-10 16:48:06 +01:00
Eelco Dolstra 5535bc28ca Tweak 2016-03-10 16:46:15 +01:00
Eelco Dolstra 60e7930d2b Bump memory limit a bit 2016-03-10 16:46:01 +01:00
Eelco Dolstra 75e7b35477 Fix retry of transient failures 2016-03-10 16:44:26 +01:00
Eelco Dolstra de71d5b622 Fix showing machine name for aborted build steps 2016-03-10 16:42:36 +01:00
Eelco Dolstra 33da40f272 Doh 2016-03-09 17:31:57 +01:00
Eelco Dolstra 4b9c76e502 hydra-queue-runner: Ensure regular status dumps 2016-03-09 17:11:34 +01:00
Eelco Dolstra 4151be7e69 Make the output size limit configurable
The maximum output size per build step (as the sum of the NARs of each
output) can be set via hydra.conf, e.g.

  max-output-size = 1000000000

The default is 2 GiB.

Also refactored the build error / status handling a bit.
2016-03-09 17:00:09 +01:00
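
The check described in 4151be7e69 can be sketched as follows. This is an illustration in Python, not the queue runner's implementation. The parser is deliberately simplified (it only understands plain `key = value` lines), and treating "over the limit" as strictly greater than the limit is an assumption.

```python
DEFAULT_MAX_OUTPUT_SIZE = 2 * 1024**3  # 2 GiB, the default named in the commit


def parse_max_output_size(conf_text):
    """Return max-output-size from hydra.conf-style text, or the default."""
    for line in conf_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "max-output-size":
            return int(value.strip())
    return DEFAULT_MAX_OUTPUT_SIZE


def exceeds_limit(nar_sizes, limit):
    """True if the summed NAR sizes of a step's outputs exceed the limit.

    Assumption: the comparison is strict (sum > limit).
    """
    return sum(nar_sizes) > limit
```

For example, two outputs of 600 MB and 500 MB would exceed a limit of `1000000000` under this sketch, even though neither does on its own.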
Eelco Dolstra dc790c5f7e Fix bad format string 2016-03-09 16:59:35 +01:00
Eelco Dolstra 80ff78b1b6 Unify build and step status codes
Also remove the obsolete status code 5 from the database.
2016-03-09 15:30:43 +01:00
Eelco Dolstra 9127f5bbc3 hydra-queue-runner: Limit memory usage
When using a binary cache store, the queue runner receives NARs from
the build machines, compresses them, and uploads them to the
cache. However, keeping multiple large NARs in memory can cause the
queue runner to run out of memory. This can happen for instance when
it's processing multiple ISO images concurrently.

The fix is to use a TokenServer to prevent the builder threads from
storing more than a certain total size of NARs concurrently (at the
moment, this is hard-coded at 4 GiB). Builder threads that cause the
limit to be exceeded will block until other threads have finished.

The 4 GiB limit does not include certain other allocations, such as
for xz compression or for FSAccessor::readFile(). But since these are
unlikely to be more than the size of the NARs and hydra.nixos.org has
32 GiB RAM, it should be fine.
2016-03-09 14:30:13 +01:00
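
The idea behind the TokenServer in 9127f5bbc3 can be sketched in Python with a condition variable. This is a minimal illustration, not Hydra's C++ class. The method names, the `in_use` property, and the choice to reject requests larger than the whole budget are all assumptions made for the example.

```python
import threading


class TokenServer:
    """Hand out up to max_tokens units (e.g. bytes of NAR data) at a time."""

    def __init__(self, max_tokens):
        self._max = max_tokens
        self._in_use = 0
        self._cond = threading.Condition()

    @property
    def in_use(self):
        with self._cond:
            return self._in_use

    def acquire(self, n):
        # Assumption: a request that can never fit is an error, so that the
        # caller does not block forever.
        if n > self._max:
            raise ValueError("request exceeds total token budget")
        with self._cond:
            # Block while granting n tokens would exceed the limit.
            while self._in_use + n > self._max:
                self._cond.wait()
            self._in_use += n

    def release(self, n):
        with self._cond:
            self._in_use -= n
            # Wake all waiters; each re-checks whether its request now fits.
            self._cond.notify_all()
```

A builder thread would call `acquire(nar_size)` before holding a NAR in memory and `release(nar_size)` once the upload has finished, so the total held at any time stays within the budget.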
Eelco Dolstra 49a4639377 Add a more concise queue page
The old page didn't scale very well if you have 150K builds in the
queue; in fact, it tended to make browsers hang. The new one just
shows, for each jobset, the number of queued builds. The actual builds
can be seen by going to the corresponding jobset page and looking at
the evals.
2016-03-08 19:44:51 +01:00
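
The aggregation behind the concise queue page in 49a4639377 amounts to counting queued builds per jobset. A rough Python sketch follows. The real page runs a database query. The `(project, jobset)` tuple representation and `queue_summary` are hypothetical.

```python
from collections import Counter


def queue_summary(queued_builds):
    """Count queued builds per jobset.

    queued_builds is an iterable of (project, jobset) pairs, one per build.
    Returns a list of ((project, jobset), count), largest count first.
    """
    return Counter(queued_builds).most_common()
```

The page then needs one row per jobset rather than one per build, which stays small even with a very large queue.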
Eelco Dolstra b77a43b83d Get rid of "will retry" messages after "maybe cancelling..." 2016-03-08 13:09:39 +01:00
Eelco Dolstra 718fef29ef Keep track of time required to load builds 2016-03-08 13:09:29 +01:00
Eelco Dolstra 2feb17c681 Some more logging 2016-03-08 13:08:07 +01:00