hydra

Author	SHA1	Message	Date
Eelco Dolstra	33da40f272	Doh	2016-03-09 17:31:57 +01:00
Eelco Dolstra	4b9c76e502	hydra-queue-runner: Ensure regular status dumps	2016-03-09 17:11:34 +01:00
Eelco Dolstra	4151be7e69	Make the output size limit configurable. The maximum output size per build step (as the sum of the NARs of each output) can be set via hydra.conf, e.g. `max-output-size = 1000000000`. The default is 2 GiB. Also refactored the build error / status handling a bit.	2016-03-09 17:00:09 +01:00
Eelco Dolstra	80ff78b1b6	Unify build and step status codes. Also remove the obsolete status code 5 from the database.	2016-03-09 15:30:43 +01:00
Eelco Dolstra	9127f5bbc3	hydra-queue-runner: Limit memory usage. When using a binary cache store, the queue runner receives NARs from the build machines, compresses them, and uploads them to the cache. However, keeping multiple large NARs in memory can cause the queue runner to run out of memory. This can happen for instance when it's processing multiple ISO images concurrently. The fix is to use a TokenServer to prevent the builder threads from storing more than a certain total size of NARs concurrently (at the moment, this is hard-coded at 4 GiB). Builder threads that cause the limit to be exceeded will block until other threads have finished. The 4 GiB limit does not include certain other allocations, such as for xz compression or for FSAccessor::readFile(). But since these are unlikely to be more than the size of the NARs and hydra.nixos.org has 32 GiB RAM, it should be fine.	2016-03-09 14:30:13 +01:00
Eelco Dolstra	b77a43b83d	Get rid of "will retry" messages after "maybe cancelling..."	2016-03-08 13:09:39 +01:00
Eelco Dolstra	718fef29ef	Keep track of time required to load builds	2016-03-08 13:09:29 +01:00
Eelco Dolstra	b98a061c24	Add some instrumentation to keep track of dispatcher cost	2016-03-02 14:18:39 +01:00
Eelco Dolstra	6beee0ab49	Fix segfault sorting runnable steps. Same problem as `d744362e4a`. Partial backtrace (frame headers missing): at /nix/store/ksvsbr7pg4z69bv6fbbc8h7x7rm2104m-gcc-4.9.3/include/c++/4.9.3/bits/predefined_ops.h:166; at /nix/store/ksvsbr7pg4z69bv6fbbc8h7x7rm2104m-gcc-4.9.3/include/c++/4.9.3/bits/stl_algo.h:1827; at /nix/store/ksvsbr7pg4z69bv6fbbc8h7x7rm2104m-gcc-4.9.3/include/c++/4.9.3/bits/stl_algo.h:4717	2016-03-02 13:59:24 +01:00
Eelco Dolstra	7cd08c7c46	Warn if PostgreSQL appears stalled	2016-02-29 15:10:30 +01:00
Eelco Dolstra	6d741d2ffa	Prevent download of NARs we just uploaded	2016-02-26 15:21:44 +01:00
Eelco Dolstra	8321a3eb27	Sync with Nix	2016-02-24 14:04:31 +01:00
Eelco Dolstra	6c3ae36648	hydra-queue-runner: Get store mode configuration from hydra.conf. To use the local Nix store (default): `store_mode = direct`. To use a local binary cache: `store_mode = local-binary-cache` and `binary_cache_dir = /var/lib/hydra/binary-cache`. To use an S3 bucket: `store_mode = s3-binary-cache` and `binary_cache_s3_bucket = my-nix-bucket`. Also, respect binary_cache_{secret,public}_key_file for signing the binary cache.	2016-02-22 17:23:06 +01:00
Eelco Dolstra	88a05763cc	Pool local store connections	2016-02-20 00:04:08 +01:00
Eelco Dolstra	dc4a00347d	Use a single BinaryCacheStore for all threads. This will make it easier to do caching / keep stats. Also, we won't have S3Client's connection pooling if we create multiple S3Client instances.	2016-02-18 17:31:19 +01:00
Eelco Dolstra	ce5790285a	Merge remote-tracking branch 'origin/master' into binary-cache	2016-02-17 11:54:59 +01:00
Eelco Dolstra	d7a123fcd4	Keep track of the time we spend copying to/from build machines	2016-02-17 10:30:23 +01:00
Eelco Dolstra	2d0dd7fb49	hydra-queue-runner: Write directly to a binary cache	2016-02-15 21:10:29 +01:00
Eelco Dolstra	92d8b59361	Process Nix API changes	2016-02-11 15:59:47 +01:00
Eelco Dolstra	c087472c71	Remove superfluous "has" function	2015-11-02 14:29:12 +01:00
Eelco Dolstra	53c80d9526	getQueuedBuilds(): Periodically stop to handle priority bumps. Previously, priority bumps could take a long time to get noticed if getQueuedBuilds() was busy processing zillions of queue additions. (This was made worse by the reintroduction of substitute checking.)	2015-10-22 17:00:46 +02:00
Eelco Dolstra	71bf7e02d5	Use nix::willBuildLocally()	2015-10-21 15:44:29 +02:00
Eelco Dolstra	8e8e31ce86	Re-implement log size limits. The old queue runner already had this. However, we now store "log limit exceeded" as a separate status code in the database.	2015-10-06 17:35:08 +02:00
Eelco Dolstra	82504fe010	hydra-queue-runner: Use substitutes. This allows Hydra to use binaries from available binary caches. It makes the queue monitor thread quite a bit slower, so if you don't want to use binary caches, it's better to add "--option build-use-substitutes false" to the hydra-queue-runner invocation. Fixed #243.	2015-10-05 14:57:44 +02:00
Eelco Dolstra	7e954aff03	Keep machine stats even when a machine is removed from the machines file. This is important for the Hydra provisioner, since it needs to be able to see whether a disabled machine still has jobs running on it.	2015-09-02 13:31:47 +02:00
Eelco Dolstra	2a7fbd57cc	Allow the machines file to specify host public keys. It's easier for the Hydra provisioner to put host public keys in the machines file than to separately manage the known_hosts file (especially when the provisioner runs on a different machine).	2015-08-26 13:43:02 +02:00
Eelco Dolstra	7aa52517e9	Support multiple machines files. This is primarily useful for the Hydra provisioner, which can write its machines to a file other than /etc/nix/machines.	2015-08-25 15:34:53 +02:00
Eelco Dolstra	092d60735b	Keep track of wait time per system type, i.e., how much time the currently runnable steps per system type have been waiting. This is useful for deciding whether to provision more machines.	2015-08-17 15:45:44 +02:00
Eelco Dolstra	ea1eb2e3fb	Keep track of requiredSystemFeatures in the machine stats. For example, steps that require the "kvm" feature may require a different kind of machine to be provisioned. This can also be used to require performance-sensitive tests to run on a particular kind of machine, e.g., by setting requiredSystemFeatures to something like "ec2-i2.8xlarge".	2015-08-17 14:37:57 +02:00
Eelco Dolstra	d571e44b86	Keep stats for the Hydra auto scaler. "hydra-queue-runner --status" now prints how many runnable and running build steps exist for each machine type. This allows additional machines to be provisioned based on the Hydra load.	2015-08-17 13:50:41 +02:00
Eelco Dolstra	d4759c1da2	hydra-queue-runner: Detect changes to the scheduling shares	2015-08-12 13:17:56 +02:00
Eelco Dolstra	576dc0c120	For completeness, re-implement meta.schedulingPriority	2015-08-12 12:05:43 +02:00
Eelco Dolstra	97f11baa8d	Revive jobset scheduling (i.e., taking the jobset scheduling share into account)	2015-08-11 01:31:56 +02:00
Eelco Dolstra	eb13007fe6	Allow a build to be bumped to the front of the queue via the web interface. Builds now have a "Bump up" action. This will cause the queue runner to prioritise the steps of the build above all other steps.	2015-08-10 16:19:47 +02:00
Eelco Dolstra	27182c7c1d	Start steps in order of ascending build ID	2015-08-10 16:19:47 +02:00
Eelco Dolstra	593850b956	Fix potential race in dispatcher wakeup	2015-08-10 12:54:55 +02:00
Eelco Dolstra	6a1c950e94	Unindent	2015-08-10 11:33:22 +02:00
Eelco Dolstra	4d26546d3c	Add support for tracking custom metrics. Builds can now emit metrics that Hydra will store in its database and render as time series via flot charts. Typical applications are to keep track of performance indicators, coverage percentages, artifact sizes, and so on. For example, a coverage build can emit the coverage percentage as follows: `echo "lineCoverage $pct %" > $out/nix-support/hydra-metrics`. Graphs of all metrics for a job can be seen at http://.../job/<project>/<jobset>/<job>#tabs-charts and specific metrics are also visible at http://.../job/<project>/<jobset>/<job>/metric/<metric>. The latter URL also allows getting the data in JSON format (e.g. via "curl -H 'Accept: application/json'").	2015-07-31 00:57:30 +02:00
Eelco Dolstra	c18fb0ad74	Temporarily disable machines after a connection failure	2015-07-21 15:58:47 +02:00
Eelco Dolstra	7e026d35f7	Split hydra-queue-runner.cc more	2015-07-21 15:14:17 +02:00
Eelco Dolstra	5370be9f52	hydra-queue-runner: Use cmdBuildDerivation. See `1511aa9f48` and `eda2f36c2a`.	2015-07-21 01:54:24 +02:00
Eelco Dolstra	3ded87329d	Keep track of how many threads are waiting	2015-07-10 19:10:14 +02:00
Eelco Dolstra	35b7c4f82b	Allow only 1 thread to send a closure to a given machine at the same time. This prevents a race where multiple threads see that machine X is missing path P, and start sending it concurrently. Nix handles this correctly, but it's still wasteful (especially for the case where P == GHC). A more refined scheme would be to have per-machine, per-path locks.	2015-07-07 14:06:48 +02:00
Eelco Dolstra	16696a4aee	Namespace cleanup	2015-07-07 10:29:43 +02:00
Eelco Dolstra	63745b8e25	Move buildRemote() into State	2015-07-07 10:25:33 +02:00
Eelco Dolstra	df29527531	Refactor	2015-07-07 10:17:21 +02:00

46 commits
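
Commit 9127f5bbc3 above describes capping the total size of NARs that builder threads hold in memory, with threads that would exceed the cap blocking until others finish. A minimal C++ sketch of that idea follows. It is not the TokenServer class from the Hydra or Nix sources: the names `ByteBudget`, `Reservation` and `runBuilders` are hypothetical, and the rule for admitting a request larger than the whole cap is an assumption made here to avoid deadlock.

```cpp
#include <cstddef>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical byte budget shared by all builder threads.
class ByteBudget {
public:
    explicit ByteBudget(std::size_t capacity) : capacity_(capacity) {}

    // Block until `n` bytes fit under the cap, then reserve them.
    // Assumption: a request larger than the cap is admitted only when
    // nothing else is reserved, so it can never wait forever.
    void acquire(std::size_t n) {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [&] { return inUse_ == 0 || inUse_ + n <= capacity_; });
        inUse_ += n;
        if (inUse_ > peak_) peak_ = inUse_;
    }

    // Return `n` bytes to the budget and wake any blocked threads.
    void release(std::size_t n) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            inUse_ -= n;
        }
        cv_.notify_all();
    }

    std::size_t inUse() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return inUse_;
    }

    // Highest number of bytes ever reserved at once.
    std::size_t peak() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return peak_;
    }

private:
    const std::size_t capacity_;
    mutable std::mutex mutex_;
    std::condition_variable cv_;
    std::size_t inUse_ = 0;
    std::size_t peak_ = 0;
};

// RAII guard: reserves on construction, releases on destruction.
class Reservation {
public:
    Reservation(ByteBudget& budget, std::size_t n) : budget_(budget), n_(n) {
        budget_.acquire(n_);
    }
    ~Reservation() { budget_.release(n_); }
    Reservation(const Reservation&) = delete;
    Reservation& operator=(const Reservation&) = delete;

private:
    ByteBudget& budget_;
    std::size_t n_;
};

// Simulate one builder thread per NAR size; each holds its reservation
// briefly, as if compressing and uploading. Returns the observed peak.
std::size_t runBuilders(ByteBudget& budget, const std::vector<std::size_t>& narSizes) {
    std::vector<std::thread> threads;
    for (std::size_t size : narSizes) {
        threads.emplace_back([&budget, size] {
            Reservation reservation(budget, size);
            std::this_thread::yield();  // stand-in for the real work
        });
    }
    for (auto& t : threads) t.join();
    return budget.peak();
}
```

With a cap of 4 units and NAR sizes of 3, 2, 1, 4 and 2, no more than 4 units are ever reserved at once, whatever order the threads run in. As the commit message notes, a cap of this kind does not account for other allocations such as compression buffers.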