Each jobset now has a "scheduling share" that determines how much of
the build farm's time it is entitled to. For instance, if a jobset
has 100 shares and the total number of shares of all jobsets is 1000,
it's entitled to 10% of the build farm's time. When there is a free
build slot for a given system type, the queue runner will select the
jobset that is furthest below its scheduling share over a certain time
window (currently, the last day). Within that jobset, it will pick
the build with the highest priority.
So meta.schedulingPriority now only determines the order of builds
within a jobset, not between jobsets. This makes it much easier to
prioritise one jobset over another (e.g. nixpkgs:trunk over
nixpkgs:stdenv).
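The selection rule described above can be sketched roughly as follows. This is only an illustration, not the queue runner's actual code: the Jobset class, the seconds_used field and the way "furthest below its share" is measured (entitled fraction minus used fraction) are assumptions, and the queued builds are assumed to be already filtered by system type.

```python
from dataclasses import dataclass, field

@dataclass
class Jobset:
    name: str
    shares: int
    seconds_used: float  # build time consumed within the time window (assumed metric)
    queued: list = field(default_factory=list)  # (priority, build name) pairs

def pick_next_build(jobsets):
    """Pick the jobset furthest below its share, then its highest-priority build."""
    total_shares = sum(j.shares for j in jobsets)
    total_used = sum(j.seconds_used for j in jobsets)
    candidates = [j for j in jobsets if j.queued]
    if not candidates:
        return None

    def deficit(j):
        # Entitled fraction minus the fraction actually used (hypothetical measure).
        entitled = j.shares / total_shares
        used = j.seconds_used / total_used if total_used else 0.0
        return entitled - used

    jobset = max(candidates, key=deficit)
    _priority, build = max(jobset.queued)  # highest priority within the jobset
    return jobset.name, build

jobsets = [
    Jobset("nixpkgs:trunk", 900, 50000.0, [(0, "hello"), (10, "firefox")]),
    Jobset("nixpkgs:stdenv", 100, 30000.0, [(5, "gcc")]),
]
choice = pick_next_build(jobsets)
```

Here nixpkgs:trunk is entitled to 90% of the time but has used only 62.5%, so it is selected even though nixpkgs:stdenv has a queued build too.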
For presentation purposes, we need to know what builds are part of an
aggregate build. So at evaluation time, look at the "members"
attribute, find the corresponding builds in the eval, and create a
mapping in the AggregateMembers table.
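As a rough sketch of that mapping step: the function below pairs an aggregate build with the builds of its members that exist in the same evaluation. How members are matched to builds is not stated above; matching by job name, the function name and the build IDs are all assumptions made for illustration.

```python
def aggregate_member_rows(aggregate_id, members, eval_builds):
    """Return (aggregate, member) build-id pairs for members found in the eval.

    eval_builds maps a job name to the build id it has in this evaluation
    (matching by job name is an assumption of this sketch).
    """
    return [(aggregate_id, eval_builds[m]) for m in members if m in eval_builds]

eval_builds = {"tested": 100, "tests.installer.simple": 101, "isoMinimal": 102}
rows = aggregate_member_rows(
    100, ["tests.installer.simple", "isoMinimal", "missing"], eval_builds)
```

Members that have no corresponding build in the evaluation are simply skipped in this sketch.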
The NrBuilds table tracks the value of ‘select count(*) from Builds
where finished = 0’, keeping it up to date via a trigger. This is
necessary to make the /all page fast, since otherwise it needs to do a
sequential scan on the Builds table.
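The trigger-maintained counter can be illustrated with a small, self-contained sketch. It uses SQLite through Python rather than Hydra's actual PostgreSQL schema, and the column names (what, nr), the 'unfinished' key and the trigger names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Builds (id INTEGER PRIMARY KEY, finished INTEGER NOT NULL DEFAULT 0);
CREATE TABLE NrBuilds (what TEXT PRIMARY KEY, nr INTEGER NOT NULL);
INSERT INTO NrBuilds VALUES ('unfinished', 0);

-- Each trigger adjusts the counter by the change in "finished = 0" rows.
CREATE TRIGGER NrBuildsInsert AFTER INSERT ON Builds BEGIN
  UPDATE NrBuilds SET nr = nr + (NEW.finished = 0) WHERE what = 'unfinished';
END;
CREATE TRIGGER NrBuildsUpdate AFTER UPDATE OF finished ON Builds BEGIN
  UPDATE NrBuilds SET nr = nr + (NEW.finished = 0) - (OLD.finished = 0)
  WHERE what = 'unfinished';
END;
CREATE TRIGGER NrBuildsDelete AFTER DELETE ON Builds BEGIN
  UPDATE NrBuilds SET nr = nr - (OLD.finished = 0) WHERE what = 'unfinished';
END;
""")

def cached_count(conn):
    # Cheap lookup of the trigger-maintained value.
    return conn.execute(
        "SELECT nr FROM NrBuilds WHERE what = 'unfinished'").fetchone()[0]

def scanned_count(conn):
    # The query the counter replaces; needs a scan of Builds.
    return conn.execute(
        "SELECT count(*) FROM Builds WHERE finished = 0").fetchone()[0]

conn.executemany("INSERT INTO Builds (finished) VALUES (?)", [(0,), (0,), (0,), (1,)])
conn.execute("UPDATE Builds SET finished = 1 WHERE id = 1")
conn.execute("DELETE FROM Builds WHERE id = 2")
```

After these statements both functions report the same number, but cached_count reads a single row instead of scanning Builds.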
The catalyst-action-rest branch from shlevy/hydra was an exploration of
using Catalyst::Action::REST to create a JSON API for hydra. This commit
merges in the best bits from that experiment, with the goal that further
API endpoints can be added incrementally.
In addition to migrating more endpoints, there is potential for
improvement in what's already been done:
* The web interface can be updated to use the same non-GET endpoints as
the JSON interface (using x-tunneled-method) instead of having a
separate endpoint
* The web rendering should use the $c->stash->{resource} data structure
where applicable rather than putting the same data in two places in
the stash
* Which columns to render for each endpoint is a completely debatable
question
* Hydra::Component::ToJSON should turn has_many relations that have
strings as their primary keys into objects instead of arrays
Fixes NixOS/hydra#98
Signed-off-by: Shea Levy <shea@shealevy.com>
Previously, for scheduled builds, "timestamp" contained the time the
build was added to the queue, while for finished builds, it was the
time the build finished. Now it's always the former.
This allows checking a jobset (say) at most once a day. It's also
possible to disable polling by setting the interval to 0. This is
useful for jobsets that use push notification or are manually
evaluated.
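A minimal sketch of the polling decision, assuming the interval and timestamps are in seconds and that a jobset that was never checked is always due (the function and parameter names are hypothetical):

```python
def should_poll(last_checked, check_interval, now):
    """Decide whether a jobset is due for a check."""
    if check_interval == 0:
        return False  # polling disabled; rely on push or manual evaluation
    if last_checked is None:
        return True   # never checked before (assumption of this sketch)
    return now - last_checked >= check_interval

DAY = 24 * 60 * 60
```

For example, should_poll(0, DAY, DAY - 1) is false, while should_poll(0, DAY, DAY) is true.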
This reverts commit 71d020735b.
Unfortunately there are still some cases where we need to set Hydra's
concurrency separately. (Ideally, Hydra would start *all* queued
builds in parallel and let Nix take care of everything...)
External machines can now notify Hydra that it should check a
repository by sending a GET or POST request to /api/push, providing a
list of jobsets to be checked and/or a list of repository URLs. In
the latter case, all jobsets that have any of the specified
repositories as an input will be checked.
For instance, you can configure GitHub or BitBucket to send a request
to the URL
http://hydra.example.org/api/push?repos=git://github.com/NixOS/nixpkgs.git
to trigger evaluation of all jobsets that have
git://github.com/NixOS/nixpkgs.git as an input, or to the URL
http://hydra.example.org/api/push?jobsets=patchelf:trunk,nixpkgs:trunk
to trigger evaluation of just the specified jobsets.
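A small helper for building such URLs might look like the sketch below. The helper name is hypothetical, and joining several repository URLs with commas is an assumption (the example above only shows a comma-separated jobsets list). The sketch only constructs the URL; it does not send a request.

```python
from urllib.parse import urlencode

def push_url(base, jobsets=(), repos=()):
    """Build an /api/push URL for the given jobsets and/or repository URLs."""
    params = {}
    if jobsets:
        params["jobsets"] = ",".join(jobsets)
    if repos:
        params["repos"] = ",".join(repos)  # comma-separated: an assumption
    # Keep ':', '/' and ',' unescaped so the URL matches the form shown above.
    return f"{base}/api/push?{urlencode(params, safe=':/,')}"

by_jobset = push_url("http://hydra.example.org",
                     jobsets=["patchelf:trunk", "nixpkgs:trunk"])
by_repo = push_url("http://hydra.example.org",
                   repos=["git://github.com/NixOS/nixpkgs.git"])
```

The resulting string can then be passed to any HTTP client, for instance urllib.request.urlopen.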
It's pointless to store these, since Nix knows where the logs are.
Also handle (in fact require) Nix's new log storage scheme. Also some
cleanups in the build page.
* Don't use isCurrent anymore; instead look up builds in the previous
jobset evaluation. (The isCurrent field is still maintained because
it's still used in some other places.)
* To determine whether to perform an evaluation, compare the hash of
the current inputs with the inputs of the previous jobset
evaluation, rather than checking if there was ever an evaluation
with those inputs. This way, if the inputs of an evaluation change
back to a previous state, we get a new jobset evaluation in the
database (and thus the latest jobset evaluation correctly represents
the latest state of the jobset).
* Improve performance by removing some unnecessary operations and
adding an index.
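The change in the second item can be sketched as follows. The hashing scheme (SHA-256 over canonical JSON) and the function names are assumptions for illustration; only the comparison against the previous evaluation reflects the description above.

```python
import hashlib
import json

def inputs_hash(inputs):
    """Hash a jobset's inputs in a stable way (hypothetical scheme)."""
    canonical = json.dumps(inputs, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

def needs_new_eval(eval_hashes, inputs):
    """Compare against the previous evaluation only, not the whole history."""
    return not eval_hashes or eval_hashes[-1] != inputs_hash(inputs)

a = {"nixpkgs": "rev-a"}
b = {"nixpkgs": "rev-b"}
eval_hashes = []  # oldest first
for inputs in [a, b, a, a]:
    if needs_new_eval(eval_hashes, inputs):
        eval_hashes.append(inputs_hash(inputs))
```

The sequence a, b, a, a yields three evaluations: the return to a is recorded, and only the immediate repeat is skipped. Checking whether the hash was ever seen would have skipped the third one as well.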
The hydra-update-gc-roots script is taking around 95 minutes on our
Hydra instance (though a lot of that is I/O wait). This patch
significantly reduces the number of database queries. In particular,
the N most recent successful builds for each job in a jobset are now
determined in a single query. Also, it removes the calls to
readlink().
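One way to fetch the N most recent successful builds of every job in a single query is a correlated subquery, sketched here with SQLite through Python. This is not necessarily how the patch writes its query; the table layout and the convention that buildstatus 0 means success are assumptions of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Builds (id INTEGER PRIMARY KEY, job TEXT, buildstatus INTEGER)")
conn.executemany("INSERT INTO Builds VALUES (?, ?, ?)", [
    (1, "hello", 0), (2, "hello", 0), (3, "hello", 1), (4, "hello", 0),
    (5, "patchelf", 0), (6, "patchelf", 0), (7, "patchelf", 0),
])

def recent_successes(conn, n):
    """Return (job, build id) for the n newest successful builds of each job."""
    return conn.execute("""
        SELECT b.job, b.id
        FROM Builds b
        WHERE b.buildstatus = 0
          AND (SELECT count(*) FROM Builds newer
               WHERE newer.job = b.job
                 AND newer.buildstatus = 0
                 AND newer.id > b.id) < ?
        ORDER BY b.job, b.id DESC
    """, (n,)).fetchall()

keep = recent_successes(conn, 2)
```

A build is kept when fewer than n newer successful builds exist for the same job, so one query replaces a separate query per job.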
Prepared statements are sometimes much slower than unprepared
statements, because the planner doesn't have access to the query
parameters. This is the case for the active build steps query (in
/status), where a prepared statement is three orders of magnitude
slower. So disable the use of prepared statements in this case.
(Since the query parameters are constant here, it would be nicer if we
could tell DBIx::Class to prepare a statement with those parameters
fixed. But I don't know an easy way to do so.)
For schema upgrades, hydra-init executes the files
src/sql/upgrade-<N>.sql, each of which upgrades the schema from
version N-1 to N. The upgrades are wrapped in a transaction.
The singleton table SchemaVersion contains the current version
of the Hydra database schema. This can be used to upgrade the
schema on the fly.
Also reran the DBIx::Class schema loader.
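The upgrade scheme (numbered upgrade scripts plus a singleton SchemaVersion table) can be sketched as below. This is an illustration using SQLite through Python, not hydra-init itself: the UPGRADES dictionary stands in for the src/sql/upgrade-<N>.sql files, and its statements are invented.

```python
import sqlite3

# Hypothetical stand-ins for upgrade-<N>.sql: entry N upgrades version N-1 to N.
UPGRADES = {
    2: ["ALTER TABLE Builds ADD COLUMN note TEXT"],
    3: ["CREATE INDEX IndexBuildsOnNote ON Builds(note)"],
}

def current_version(conn):
    return conn.execute("SELECT version FROM SchemaVersion").fetchone()[0]

def upgrade(conn, upgrades):
    """Apply each pending upgrade in its own transaction; return the new version."""
    version = current_version(conn)
    while version + 1 in upgrades:
        target = version + 1
        conn.execute("BEGIN")
        try:
            for stmt in upgrades[target]:
                conn.execute(stmt)
            conn.execute("UPDATE SchemaVersion SET version = ?", (target,))
            conn.execute("COMMIT")
        except Exception:
            conn.execute("ROLLBACK")  # leave the schema at the old version
            raise
        version = target
    return version

# isolation_level=None lets the sketch issue BEGIN/COMMIT explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE Builds (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE SchemaVersion (version INTEGER NOT NULL)")
conn.execute("INSERT INTO SchemaVersion VALUES (1)")
final_version = upgrade(conn, UPGRADES)
```

Because the version bump happens inside the same transaction as the schema changes, a failed upgrade leaves both the schema and SchemaVersion untouched.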
* Store the system type in the BuildSteps table.
* Don't query the queue size when serving static pages. This prevents
two unnecessary database queries per request.
recording the builds that are part of a jobset evaluation. We need
this to be able to answer queries such as "return the latest NixOS
ISO for which the installation test succeeded". This wasn't previously
possible because the database didn't record which builds of (say)
the `isoMinimal' job and the `tests.installer.simple' job came from
the same evaluation of the nixos:trunk jobset.
Keeping a record of evaluations is also useful for logging purposes.
faster, from about 4.5s to 1.0s for the global "latest" channel.
Note that the query is only fast if the "IndexBuildsOnJob" and
"IndexBuildsOnJobAndIsCurrent" indices are dropped - if they exist,
PostgreSQL will use those instead of the more efficient
"IndexBuildsOnJobFinishedId" index. Looks like a bug in the planner
to me...
releases as a dynamic view on the database was misguided, since
doing things like adding a new job to a release set will invalidate
all old releases. So we rename release sets to views, and we'll
reintroduce releases as separate, static entities in the database.
the derivations that the jobset currently contains. This is
necessary to allow the "latest" channel to contain the correct
builds when the sources of a jobset are reverted.