Commit graph

2252 commits

Author SHA1 Message Date
Eelco Dolstra
520c8a5826 Use faster query to determine number of running builds
The previous query

  select count(*) from builds b left join buildsteps s on s.build = b.id where busy = 1 and finished = 0

is suddenly taking several minutes. Probably PostgreSQL decided to use
a suboptimal query plan.
2016-03-16 13:41:43 +01:00
Eelco Dolstra
405a43c171 Queue summary: Make rows clickable 2016-03-10 16:48:06 +01:00
Eelco Dolstra
5535bc28ca Tweak 2016-03-10 16:46:15 +01:00
Eelco Dolstra
60e7930d2b Bump memory limit a bit 2016-03-10 16:46:01 +01:00
Eelco Dolstra
75e7b35477 Fix retry of transient failures 2016-03-10 16:44:26 +01:00
Eelco Dolstra
de71d5b622 Fix showing machine name for aborted build steps 2016-03-10 16:42:36 +01:00
Eelco Dolstra
33da40f272 Doh 2016-03-09 17:31:57 +01:00
Eelco Dolstra
4b9c76e502 hydra-queue-runner: Ensure regular status dumps 2016-03-09 17:11:34 +01:00
Eelco Dolstra
4151be7e69 Make the output size limit configurable
The maximum output size per build step (as the sum of the NARs of each
output) can be set via hydra.conf, e.g.

  max-output-size = 1000000000

The default is 2 GiB.

Also refactored the build error / status handling a bit.
2016-03-09 17:00:09 +01:00
Eelco Dolstra
dc790c5f7e Fix bad format string 2016-03-09 16:59:35 +01:00
Eelco Dolstra
80ff78b1b6 Unify build and step status codes
Also remove the obsolete status code 5 from the database.
2016-03-09 15:30:43 +01:00
Eelco Dolstra
9127f5bbc3 hydra-queue-runner: Limit memory usage
When using a binary cache store, the queue runner receives NARs from
the build machines, compresses them, and uploads them to the
cache. However, keeping multiple large NARs in memory can cause the
queue runner to run out of memory. This can happen for instance when
it's processing multiple ISO images concurrently.

The fix is to use a TokenServer to prevent the builder threads to
store more than a certain total size of NARs concurrently (at the
moment, this is hard-coded at 4 GiB). Builder threads that cause the
limit to be exceeded will block until other threads have finished.

The 4 GiB limit does not include certain other allocations, such as
for xz compression or for FSAccessor::readFile(). But since these are
unlikely to be more than the size of the NARs and hydra.nixos.org has
32 GiB RAM, it should be fine.
2016-03-09 14:30:13 +01:00
Eelco Dolstra
49a4639377 Add a more concise queue page
The old page didn't scale very well if you have 150K builds in the
queue, in fact it tended to make browsers hang. The new one just
shows, for each jobset, the number of queued builds. The actual builds
can be seen by going to the corresponding jobset page and looking at
the evals.
2016-03-08 19:44:51 +01:00
Eelco Dolstra
b77a43b83d Get rid of "will retry" messages after "maybe cancelling..." 2016-03-08 13:09:39 +01:00
Eelco Dolstra
718fef29ef Keep track of time required to load builds 2016-03-08 13:09:29 +01:00
Eelco Dolstra
2feb17c681 Some more logging 2016-03-08 13:08:07 +01:00
Eelco Dolstra
45b237453a hydra-queue-runner: Recycle finishedDrvs
This should prevent the queue monitor thread from looking up the same
derivations over and over again.
2016-03-08 11:52:13 +01:00
Eelco Dolstra
2ab8e9a1e0 hydra-queue-runner: Fix handling of missing derivations
This barfed with 'queue monitor: ERROR: column "errormsg" of relation
"builds" does not exist' due to the removal of the errorMsg column.
2016-03-07 19:05:24 +01:00
Eelco Dolstra
e7ce225558 Fix build 2016-03-04 17:51:32 +01:00
Eelco Dolstra
76104accda Return unique store paths 2016-03-03 11:32:30 +01:00
Eelco Dolstra
86a2d6471c Fix a boost format string abort 2016-03-02 20:06:48 +01:00
Eelco Dolstra
e7655fdcbc Fix latest-finished 2016-03-02 18:06:20 +01:00
Eelco Dolstra
232ca8fea2 Fix build 2016-03-02 17:05:07 +01:00
Eelco Dolstra
e45bbfbef0 Fix .nixpkg channel uri
Fixes #274.
2016-03-02 15:38:40 +01:00
Eelco Dolstra
8b4f90b0d4 .nixpkgs: Drop obsolete manifest URI 2016-03-02 15:24:23 +01:00
Eelco Dolstra
ec82bc2517 Add /eval/NNN/store-paths action to return store paths in an eval
Needed by the NixOS channel scripts since we can no longer get a
MANIFEST from Hydra.
2016-03-02 15:17:22 +01:00
Eelco Dolstra
a74251af2b Disable channels on binary cached based Hydra instances 2016-03-02 15:08:53 +01:00
Eelco Dolstra
f09b92e289 Remove another obsolete SSL variable 2016-03-02 15:03:54 +01:00
Eelco Dolstra
2d6b585cb3 Merge branch 'slack-plugin' of https://github.com/shlevy/hydra 2016-03-02 15:03:03 +01:00
Shea Levy
6eb1c5bf19 Remove PERL_LWP_SSL_CA_FILE.
Real fix is NixOS/nixpkgs@bd7f379a3f
2016-03-02 09:02:48 -05:00
Eelco Dolstra
b98a061c24 Add some instrumentation to keep track of dispatcher cost 2016-03-02 14:18:39 +01:00
Eelco Dolstra
6beee0ab49 Fix segfault sorting runnable steps
Same problem as d744362e4a.

    at /nix/store/ksvsbr7pg4z69bv6fbbc8h7x7rm2104m-gcc-4.9.3/include/c++/4.9.3/bits/predefined_ops.h:166
    __last@entry=..., __comp=...) at /nix/store/ksvsbr7pg4z69bv6fbbc8h7x7rm2104m-gcc-4.9.3/include/c++/4.9.3/bits/stl_algo.h:1827
    __comp=...) at /nix/store/ksvsbr7pg4z69bv6fbbc8h7x7rm2104m-gcc-4.9.3/include/c++/4.9.3/bits/stl_algo.h:4717
2016-03-02 13:59:24 +01:00
Shea Levy
0f5937503e SlackNotification: Use bigger images 2016-03-01 11:25:18 -05:00
Shea Levy
006ac1fc03 Add slack plugin.
Respects <slack> blocks in the hydra config, with attributes:

* jobs: a regexp matching the job name (in the format project:jobset:job)
* url: The URL to a slack incoming webhook
* force: If true, always send messages. Otherwise, only when the build status changes

Multiple <slack> blocks are allowed
2016-02-29 14:48:36 -05:00
Eelco Dolstra
bc958c508b Merge branch 'binary-cache' 2016-02-29 18:29:07 +01:00
Eelco Dolstra
7cd08c7c46 Warn if PostgreSQL appears stalled 2016-02-29 15:10:30 +01:00
Eelco Dolstra
922dc541c2 Add log message 2016-02-29 11:58:06 +01:00
Eelco Dolstra
ad035b5227 hydra-queue-runner: Enable core dumps 2016-02-28 14:09:04 +01:00
Eelco Dolstra
610a8d67ae Better AWS error messages 2016-02-26 22:40:27 +01:00
Eelco Dolstra
1693354506 Remove unnecessary call to hydra-queue-runner --unlock 2016-02-26 21:45:59 +01:00
Eelco Dolstra
1a055e7e9e Reduce severity level of some message 2016-02-26 21:31:08 +01:00
Eelco Dolstra
6bb860fd6e Add FIXME 2016-02-26 21:15:05 +01:00
Eelco Dolstra
e8cdfe5171 hydra-server: Don't barf if the binary cache public key can't be read 2016-02-26 21:14:40 +01:00
Eelco Dolstra
53ca41ef9f Use US standard S3 region 2016-02-26 20:57:47 +01:00
Eelco Dolstra
c635f5d0ea Fix Makefile.am 2016-02-26 19:54:55 +01:00
Eelco Dolstra
b1ce76c2b4 Fix test
nix-support/failed is supposed to be a file, not a directory.
2016-02-26 19:54:32 +01:00
Eelco Dolstra
07e5fc5618 Hackery to make downloads work when using a binary cache 2016-02-26 17:28:26 +01:00
Eelco Dolstra
b00bdefa98 Fix hydra-server signing 2016-02-26 17:28:16 +01:00
Eelco Dolstra
9de336de7c Proxy local binary caches via hydra-server 2016-02-26 17:27:30 +01:00
Eelco Dolstra
b9afaadfb3 Keep better bytesReceived/bytesSent stats 2016-02-26 16:17:05 +01:00