Hydra, for Lix


Graham Christensen bc4b96d053 BuildOutputs: index path with HASH	2021-01-18 11:28:05 -05:00

Looking at AWS' Performance Insights for a Hydra instance, I found the hydra-queue-runner's query:

    select id, buildStatus, releaseName, closureSize, size
    from Builds b join BuildOutputs o on b.id = o.build
    where finished = ? and (buildStatus = ? or buildStatus = ?)
    and path = $1

was the slowest query by at least 10x.

Running an explain on this showed why:

    hydra=> explain select id, buildStatus, releaseName, closureSize, size
    from Builds b join BuildOutputs o on b.id = o.build
    where finished = 1 and (buildStatus = 0 or buildStatus = 6)
    and path = '/nix/store/s93khs2dncf2cy273mbyr4fb4ns3db20-MIDIVisualizer-5.1';

                                   QUERY PLAN
    ------------------------------------------------------------------------
     Gather  (cost=1000.43..33718.98 rows=2 width=56)
       Workers Planned: 2
       ->  Nested Loop  (cost=0.43..32718.78 rows=1 width=56)
             ->  Parallel Seq Scan on buildoutputs o  (cost=0.00..32710.32 rows=1 width=4)
                   Filter: (path = '/nix/store/s93kh...snip...'::text)
             ->  Index Scan using indexbuildsonjobsetidfinishedid on builds b  (cost=0.43..8.45 rows=1 width=56)
                   Index Cond: ((id = o.build) AND (finished = 1))
                   Filter: ((buildstatus = 0) OR (buildstatus = 6))
    (8 rows)

A parallel sequential scan is definitely better than a sequential scan, but the cost ranging from 0 to 32710 is not great.

Looking at the table, I saw the `path` column is completely unindexed:

    hydra=> \d buildoutputs
                Table "public.buildoutputs"
     Column |  Type   | Collation | Nullable | Default
    --------+---------+-----------+----------+---------
     build  | integer |           | not null |
     name   | text    |           | not null |
     path   | text    |           | not null |
    Indexes:
        "buildoutputs_pkey" PRIMARY KEY, btree (build, name)
    Foreign-key constraints:
        "buildoutputs_build_fkey" FOREIGN KEY (build) REFERENCES builds(id) ON DELETE CASCADE

Since we always do exact matches on the path and don't care about ordering, and since the path column is very high cardinality, a `hash` index is a good candidate. Note that I did test a btree index and it performed similarly well, but slightly worse.

After creating the index (this took about 10 seconds) on a test database:

    create index IndexBuildOutputsPath on BuildOutputs using hash(path);

We get a significantly reduced cost:

    hydra=> explain select id, buildStatus, releaseName, closureSize, size
    hydra->   from Builds b join BuildOutputs o on b.id = o.build where
    hydra->   finished = 1 and (buildStatus = 0 or buildStatus = 6) and
    hydra->   path = '/nix/store/s93khs2dncf2cy273mbyr4fb4ns3db20-MIDIVisualizer-5.1';

                                   QUERY PLAN
    ------------------------------------------------------------------------
     Nested Loop  (cost=0.43..41.41 rows=2 width=56)
       ->  Index Scan using buildoutputs_path_hash on buildoutputs o  (cost=0.00..16.05 rows=3 width=4)
             Index Cond: (path = '/nix/store/s93khs2dncf2cy273mbyr4fb4ns3db20-MIDIVisualizer-5.1'::text)
       ->  Index Scan using indexbuildsonjobsetidfinishedid on builds b  (cost=0.43..8.45 rows=1 width=56)
             Index Cond: ((id = o.build) AND (finished = 1))
             Filter: ((buildstatus = 0) OR (buildstatus = 6))
    (6 rows)

For direct comparison, the overall query plan was changed:

    From:  Gather       (cost=1000.43..33718.98 rows=2 width=56)
    To:    Nested Loop  (cost=   0.43.....41.41 rows=2 width=56)

and the query plan for buildoutputs changed from a maximum cost of 32,710 down to 16.

In practical terms, the query's planning and execution time was reduced:

    Before (ms) |  Try 1  |  Try 2  |  Try 3
    ------------+---------+---------+---------
    Planning    |   0.898 |   0.416 |   0.383
    Execution   | 138.644 | 172.331 | 375.585

    After (ms)  |  Try 1  |  Try 2  |  Try 3
    ------------+---------+---------+---------
    Planning    |   0.298 |   0.290 |   0.296
    Execution   | 219.625 |   0.035 |   0.034

.github	Convert validate-openapi to a Hydra job	2021-01-03 18:47:05 +01:00
datadog	add space	2017-07-26 16:56:16 +01:00
doc	manual: Fix XML	2020-09-13 17:52:18 +02:00
examples	Extend Setup Information	2020-05-02 16:04:20 +02:00
foreman	foreman/queue runner: run locally to avoid trust issues	2020-09-02 12:35:18 -04:00
src	BuildOutputs: index path with HASH	2021-01-18 11:28:05 -05:00
tests	Make PathInput plugin cache validity configurable	2020-06-04 12:26:47 +02:00
.gitignore	Ignore 'nix develop' outputs directory	2020-10-28 13:41:34 +01:00
bootstrap	hydra: Simplify `bootstrap'.	2011-01-14 10:52:47 +00:00
configure.ac	Remove outdated email address	2020-03-31 22:18:46 +02:00
COPYING	hydra: revert license change	2010-03-29 14:16:46 +00:00
default.nix	Simplify default.nix and shell.nix	2020-06-17 19:19:55 +02:00
flake.lock	flake.lock: Update	2021-01-03 18:17:05 +01:00
flake.nix	Convert validate-openapi to a Hydra job	2021-01-03 18:47:05 +01:00
hydra-api.yaml	Add endpoint to generate a shields.io badge	2020-12-25 15:05:34 +01:00
hydra-module.nix	Fix issue #614 : restart queue/evaluator on sufficient disk space avai… (#777 )	2020-07-27 15:46:57 -04:00
INSTALL	hydra: use autoconf/-make	2010-09-30 14:29:15 +00:00
Makefile.am	Install hydra-module.nix into $out/share/nix	2013-07-28 11:24:31 -04:00
Procfile	Add hydra-notify to devshell	2020-05-20 15:38:31 -04:00
README.md	readme: note the default user/pass	2020-09-02 12:35:18 -04:00
shell.nix	Simplify default.nix and shell.nix	2020-06-17 19:19:55 +02:00
version	hydra: fix tarball build, add pre suffix to tarballs	2010-09-30 15:02:42 +00:00

README.md

Hydra

Hydra is a Continuous Integration service for Nix-based projects.

Installation And Setup

Note: The instructions provided below are intended to enable new users to get a simple, local installation up and running. They are by no means sufficient for running a production server, let alone a public instance.

Enabling The Service

Running Hydra is currently only supported on NixOS, where the hydra module allows for an easy setup. The following configuration is enough for a simple setup that performs all builds on localhost (see the Options page for all available options):

{
  services.hydra = {
    enable = true;
    hydraURL = "http://localhost:3000";
    notificationSender = "hydra@localhost";
    buildMachinesFiles = [];
    useSubstitutes = true;
  };
}
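
The configuration above only takes effect once it has been activated. The following is a minimal sketch of activating it and checking that the services came up. It assumes the configuration is applied with `nixos-rebuild switch` and that the module's systemd units are named `hydra-server`, `hydra-queue-runner` and `hydra-evaluator`; these unit names are an assumption here, so verify them with `systemctl list-units 'hydra-*'` on your system.

```shell
# Report the state of the Hydra systemd units.
# Assumption: the unit names below match what the hydra module installs.
check_hydra_units() {
    for unit in hydra-server hydra-queue-runner hydra-evaluator; do
        if systemctl is-active --quiet "$unit" 2>/dev/null; then
            echo "$unit: active"
        else
            echo "$unit: NOT active"
        fi
    done
}

# Usage (on the NixOS host, as root):
#   nixos-rebuild switch && check_hydra_units
```

If a unit is reported as not active, `journalctl -u <unit>` is the usual place to look for the reason.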

Creating An Admin User

Once the Hydra service has been configured as above and activated, you should already be able to access the web interface at the specified URL. However, some actions require an admin user, which has to be created first:

$ su - hydra
$ hydra-create-user <USER> --full-name '<NAME>' \
    --email-address '<EMAIL>' --password <PASSWORD> --role admin

Afterwards you should be able to log in by clicking on "Sign In" on the top right of the web interface, using the credentials passed to hydra-create-user. Once you are logged in you can click "Admin -> Create Project" to configure your first project.

Creating A Simple Project And Jobset

In order to evaluate and build anything you need to create projects that contain jobsets. Hydra supports imperative and declarative projects and many different configurations. The steps below guide you through creating a minimal imperative project configuration.

Creating A Project

Click "Admin -> Create Project" in the web interface and fill the form with the following values:

Identifier: hello
Display name: hello
Description: hello project

Click "Create project".

Creating A Jobset

After creating a project you are forwarded to the project page. Click "Actions" and choose "Create jobset". Fill the form with the following values:

Identifier: hello
Nix expression: examples/hello.nix in hydra
Check interval: 60
Scheduling shares: 1

We have to add two inputs for this jobset: one for nixpkgs and one for hydra (which we are referencing in the Nix expression above):

Input name: nixpkgs
Type: Git checkout
Value: https://github.com/nixos/nixpkgs-channels nixos-20.03

Input name: hydra
Type: Git checkout
Value: https://github.com/nixos/hydra

Make sure State at the top of the page is set to "Enabled" and click on "Create jobset". This concludes the creation of a jobset that evaluates ./examples/hello.nix once a minute. Clicking "Evaluations" should list the first evaluation of the newly created jobset after a brief delay.

Building And Developing

Building Hydra

You can build Hydra via nix-build using the provided default.nix:

$ nix-build

Development Environment

You can use the provided shell.nix to get a working development environment:

$ nix-shell
$ ./bootstrap
$ configurePhase # NOTE: not ./configure
$ make

Executing Hydra During Development

When working on new features or bug fixes you need to be able to run Hydra from your working copy. This can be done using foreman:

$ nix-shell
$ # hack hack
$ make
$ foreman start

Have a look at the Procfile if you want to see how the processes are started. To avoid conflicts with services that might be running on your host, hydra-server and postgresql are started on custom ports:

hydra-server: 63333 with the username "alice" and the password "foobar"
postgresql: 64444

Note that this is only ever meant as an ad-hoc way of executing Hydra during development. Please make use of the NixOS module for actually running Hydra in production.

JSON API

You can also interface with Hydra through a JSON API. The API is defined in hydra-api.yaml, and you can test and explore it via the Swagger Editor.
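
As an illustration of talking to the API from the command line, here is a minimal sketch using curl. It assumes that the instance answers at the `hydraURL` from the configuration example, that sending `Accept: application/json` selects the JSON representation, and that a jobset is addressed as `/jobset/<project>/<jobset>`. The helper names `jobset_url` and `hydra_get` are hypothetical, and the paths should be checked against hydra-api.yaml.

```shell
# Base URL of the Hydra instance; defaults to the hydraURL from the
# configuration example. Override it via the HYDRA_URL variable.
HYDRA_URL="${HYDRA_URL:-http://localhost:3000}"

# Build the URL of a jobset resource.
# Assumption: the /jobset/<project>/<jobset> layout matches hydra-api.yaml.
jobset_url() {
    printf '%s/jobset/%s/%s\n' "$HYDRA_URL" "$1" "$2"
}

# Fetch a URL and ask for JSON via the Accept header.
# --fail makes curl return a non-zero status on HTTP errors.
hydra_get() {
    curl --silent --fail --header 'Accept: application/json' "$1"
}

# Usage, for the "hello" project and jobset created earlier:
#   hydra_get "$(jobset_url hello hello)"
```

When Hydra is running from a working copy via foreman, setting `HYDRA_URL=http://localhost:63333` points the same helpers at the development server.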

Additional Resources

License

Hydra is licensed under GPL-3.0.

Icons provided free by EmojiOne.