bc4b96d053
Looking at AWS' Performance Insights for a Hydra instance, I found the hydra-queue-runner's query:

    select id, buildStatus, releaseName, closureSize, size
    from Builds b join BuildOutputs o on b.id = o.build
    where finished = ? and (buildStatus = ? or buildStatus = ?)
    and path = $1

was the slowest query by at least 10x.

Running an explain on this showed why:

    hydra=> explain select id, buildStatus, releaseName, closureSize, size
    hydra->   from Builds b join BuildOutputs o on b.id = o.build where
    hydra->   finished = 1 and (buildStatus = 0 or buildStatus = 6) and
    hydra->   path = '/nix/store/s93khs2dncf2cy273mbyr4fb4ns3db20-MIDIVisualizer-5.1';
                                   QUERY PLAN
    ------------------------------------------------------------------------
     Gather  (cost=1000.43..33718.98 rows=2 width=56)
       Workers Planned: 2
       ->  Nested Loop  (cost=0.43..32718.78 rows=1 width=56)
             ->  Parallel Seq Scan on buildoutputs o  (cost=0.00..32710.32 rows=1 width=4)
                   Filter: (path = '/nix/store/s93kh...snip...'::text)
             ->  Index Scan using indexbuildsonjobsetidfinishedid on builds b  (cost=0.43..8.45 rows=1 width=56)
                   Index Cond: ((id = o.build) AND (finished = 1))
                   Filter: ((buildstatus = 0) OR (buildstatus = 6))
    (8 rows)

A parallel sequential scan is definitely better than a sequential scan, but the cost ranging from 0 to 32710 is not great.

Looking at the table, I saw the `path` column is completely unindexed:

    hydra=> \d buildoutputs
                 Table "public.buildoutputs"
     Column |  Type   | Collation | Nullable | Default
    --------+---------+-----------+----------+---------
     build  | integer |           | not null |
     name   | text    |           | not null |
     path   | text    |           | not null |
    Indexes:
        "buildoutputs_pkey" PRIMARY KEY, btree (build, name)
    Foreign-key constraints:
        "buildoutputs_build_fkey" FOREIGN KEY (build) REFERENCES builds(id) ON DELETE CASCADE

Since we always do exact matches on the path and don't care about ordering, and since the path column is very high cardinality, a `hash` index is a good candidate. Note that I did test a btree index and it performed similarly well, but slightly worse.

After creating the index (this took about 10 seconds) on a test database:

    create index IndexBuildOutputsPath on BuildOutputs using hash(path);

We get a *significantly* reduced cost:

    hydra=> explain select id, buildStatus, releaseName, closureSize, size
    hydra->   from Builds b join BuildOutputs o on b.id = o.build where
    hydra->   finished = 1 and (buildStatus = 0 or buildStatus = 6) and
    hydra->   path = '/nix/store/s93khs2dncf2cy273mbyr4fb4ns3db20-MIDIVisualizer-5.1';
                                   QUERY PLAN
    -------------------------------------------------------------------------------------------------------
     Nested Loop  (cost=0.43..41.41 rows=2 width=56)
       ->  Index Scan using buildoutputs_path_hash on buildoutputs o  (cost=0.00..16.05 rows=3 width=4)
             Index Cond: (path = '/nix/store/s93khs2dncf2cy273mbyr4fb4ns3db20-MIDIVisualizer-5.1'::text)
       ->  Index Scan using indexbuildsonjobsetidfinishedid on builds b  (cost=0.43..8.45 rows=1 width=56)
             Index Cond: ((id = o.build) AND (finished = 1))
             Filter: ((buildstatus = 0) OR (buildstatus = 6))
    (6 rows)

For direct comparison, the overall query plan was changed:

    From:
      Gather       (cost=1000.43..33718.98 rows=2 width=56)
    To:
      Nested Loop  (cost=   0.43.....41.41 rows=2 width=56)

and the query plan for buildoutputs changed from a maximum cost of 32,710 down to 16.

In practical terms, the query's planning and execution time was reduced:

    Before (ms) | Try 1   | Try 2   | Try 3
    ------------+---------+---------+--------
    Planning    |   0.898 |   0.416 |   0.383
    Execution   | 138.644 | 172.331 | 375.585

    After (ms)  | Try 1   | Try 2   | Try 3
    ------------+---------+---------+--------
    Planning    |   0.298 |   0.290 |   0.296
    Execution   | 219.625 |   0.035 |   0.034
Hydra
Hydra is a Continuous Integration service for Nix based projects.
Installation And Setup
Note: The instructions provided below are intended to enable new users to get a simple, local installation up and running. They are by no means sufficient for running a production server, let alone a public instance.
Enabling The Service
Running Hydra is currently only supported on NixOS. The hydra module allows for an easy setup. The following configuration can be used for a simple setup that performs all builds on localhost (Please refer to the Options page for all available options):
{
  services.hydra = {
    enable = true;
    hydraURL = "http://localhost:3000";
    notificationSender = "hydra@localhost";
    buildMachinesFiles = [];
    useSubstitutes = true;
  };
}
Creating An Admin User
Once the Hydra service has been configured as above and activated, you should already be able to access the web interface at the specified URL. However, some actions require an admin user, which has to be created first:
$ su - hydra
$ hydra-create-user <USER> --full-name '<NAME>' \
--email-address '<EMAIL>' --password <PASSWORD> --role admin
Afterwards you should be able to log in by clicking on "Sign In" on the top right of the web interface, using the credentials specified by hydra-create-user. Once you are logged in you can click "Admin -> Create Project" to configure your first project.
Creating A Simple Project And Jobset
In order to evaluate and build anything you need to create projects that contain jobsets. Hydra supports imperative and declarative projects and many different configurations. The steps below will guide you through creating a minimal imperative project configuration.
Creating A Project
Log in as administrator, click "Admin" and select "Create project". Fill in the form as follows:
- Identifier: hello
- Display name: hello
- Description: hello project
Click "Create project".
Creating A Jobset
After creating a project you are forwarded to the project page. Click "Actions" and choose "Create jobset". Fill the form with the following values:
- Identifier: hello
- Nix expression: examples/hello.nix in hydra
- Check interval: 60
- Scheduling shares: 1
We have to add two inputs for this jobset: one for nixpkgs and one for hydra (which we are referencing in the Nix expression above):
- Input name: nixpkgs
- Type: Git checkout
- Value: https://github.com/nixos/nixpkgs-channels nixos-20.03

- Input name: hydra
- Type: Git checkout
- Value: https://github.com/nixos/hydra
Make sure State at the top of the page is set to "Enabled" and click on "Create jobset". This concludes the creation of a jobset that evaluates ./examples/hello.nix once a minute. Clicking "Evaluations" should list the first evaluation of the newly created jobset after a brief delay.
Building And Developing
Building Hydra
You can build Hydra via nix-build using the provided default.nix:
$ nix-build
Development Environment
You can use the provided shell.nix to get a working development environment:
$ nix-shell
$ ./bootstrap
$ configurePhase # NOTE: not ./configure
$ make
Executing Hydra During Development
When working on new features or bug fixes you need to be able to run Hydra from your working copy. This can be done using foreman:
$ nix-shell
$ # hack hack
$ make
$ foreman start
Have a look at the Procfile if you want to see how the processes are being started. In order to avoid conflicts with services that might be running on your host, Hydra and PostgreSQL are started on custom ports:
- hydra-server: 63333 with the username "alice" and the password "foobar"
- postgresql: 64444
Note that this is only ever meant as an ad-hoc way of executing Hydra during development. Please make use of the NixOS module for actually running Hydra in production.
JSON API
You can also interface with Hydra through a JSON API. The API is defined in hydra-api.yaml, and you can test and explore it via the Swagger editor.
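As a rough illustration of talking to the JSON API from a script, here is a minimal Python sketch using only the standard library. It is not part of Hydra. The helper names (build_get, build_login, fetch_json) are hypothetical, the default base URL is the development server port mentioned above, and the endpoint paths (/project/hello, /login) and the Origin header are assumptions that should be checked against hydra-api.yaml for your Hydra version.

```python
import json
import urllib.request

# Assumption: the development server started by `foreman start` (port 63333).
# For the NixOS module setup above this would be http://localhost:3000.
DEV_BASE_URL = "http://localhost:63333"


def build_get(path, base_url=DEV_BASE_URL):
    """Build (but do not send) a GET request that asks Hydra for JSON."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(
        url,
        headers={"Accept": "application/json"},
        method="GET",
    )


def build_login(username, password, base_url=DEV_BASE_URL):
    """Build (but do not send) a JSON login request.

    Assumptions: the login endpoint is POST /login with a JSON body holding
    "username" and "password", and the server expects an Origin (or Referer)
    header that matches its own base URL.
    """
    body = json.dumps({"username": username, "password": password})
    return urllib.request.Request(
        base_url.rstrip("/") + "/login",
        data=body.encode("utf-8"),
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "Origin": base_url,
        },
        method="POST",
    )


def fetch_json(request, timeout=10):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a Hydra instance running and the hello project from the walkthrough created, fetch_json(build_get("/project/hello")) should return that project's description as a dictionary, provided the path matches the API definition. Authenticated calls would additionally need to carry the session cookie returned by the login request, which this sketch leaves out.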
Additional Resources
- Hydra User's Guide
- Hydra on the NixOS Wiki
- hydra-cli
- Peter Simons - Hydra: Setting up your own build farm (NixOS)
License
Hydra is licensed under GPL-3.0.
Icons provided free by EmojiOne.