Pick the jobset that has used the smallest fraction of its share,
rather than the jobset furthest below its share in absolute terms.
This gives jobsets with a small share a quicker start (but they
will also run out of their share quicker).
Each jobset now has a "scheduling share" that determines how much of
the build farm's time it is entitled to. For instance, if a jobset
has 100 shares and the total number of shares of all jobsets is 1000,
it's entitled to 10% of the build farm's time. When there is a free
build slot for a given system type, the queue runner will select the
jobset that is furthest below its scheduling share over a certain time
window (currently, the last day). Withing that jobset, it will pick
the build with the highest priority.
So meta.schedulingPriority now only determines the order of builds
within a jobset, not between jobsets. This makes it much easier to
prioritise one jobset over another (e.g. nixpkgs:trunk over
nixpkgs:stdenv).
In your hydra config, you can add an arbitrary number of <s3config>
sections, with the following options:
* name (required): Bucket name
* jobs (required): A regex to match job names (in project:jobset:job
format) that should be backed up to this bucket
* compression_type: bzip2 (default), xz, or none
* prefix: String to prepend to all hydra-created s3 keys (if this is
meant to represent a directory, you should include the trailing slash,
e.g. "cache/"). Default "".
After each build with an output (i.e. successful or failed-with-output
builds), the output path and its closure are uploaded to the bucket as
.nar files, with corresponding .narinfos to enable use as a binary
cache.
This plugin requires that s3 credentials be available. It uses
Net::Amazon::S3, which as of this commit the nixpkgs version can
retrieve s3 credentials from the AWS_ACCESS_KEY_ID and
AWS_SECRET_ACCESS_KEY environment variables, or from ec2 instance
metadata when using an IAM role.
This commit also adds a hydra-s3-backup-collect-garbage program, which
uses hydra's gc roots directory to determine which paths are live, and
then deletes all files except nix-cache-info and any .nar or .narinfo
files corresponding to live paths. hydra-s3-backup-collect-garbage
respects the prefix configuration option, so it won't delete anything
outside of the hierarchy you give it, and it has the same credential
requirements as the plugin. Probably a timer unit running the garbage
collection periodically should be added to hydra-module.nix
Note that two of the added tests fail, due to a bug in the interaction
between Net::Amazon::S3 and fake-s3. Those behaviors work against real
s3 though, so I'm committing this even with the broken tests.
Signed-off-by: Shea Levy <shea@shealevy.com>
We now keep *all* unfinished evaluations of a jobset, in addition to
the <keepnr> most recent finished evaluations.
The main motivation is to ensure that mirror-{nixos,nixpkgs} work
properly: if building an evaluation takes too long, some of its builds
may already have been garbage-collected by the time the others finish.
We had Postgres barfing with this error:
DBIx::Class::Storage::DBI::_dbh_execute(): DBI Exception: DBD::Pg::st execute failed: ERROR: stack depth limit exceeded
because the ‘drvpath => [ @dependentDrvs ]’ in failDependents can
cause a query of unbounded size. (In this specific case there was a
failure of Bison, which has > 10000 dependent derivations.) So now we
just get all scheduled builds from the DB.
Due to the fixed-output derivation hashing scheme, there can be
multiple derivations of the same output path. But build logs are
indexed by derivation path. Thus, we may not be able to find the
log of a build or build step using its derivation. So as a fallback,
Hydra now looks for other derivations with the same output paths.