hydra/doc/architecture.md at a9a2679793a17325c966dec4cbb27d44b0531694

fricklerhandwerk 0803634a41 add architecture notes

meeting notes from @edolstra giving a one-hour tour of the code

2022-04-06 09:01:10 +02:00

5 KiB

Raw Blame History

This is a rough overview from informal discussions and explanations of inner workings of Hydra. You can use it as a guide to navigate the codebase or ask questions.

Architecture

Components

Postgres database
- configuration
- build queue
  - what is already built
  - what is going to build
hydra-server
- Perl, Catalyst
- web frontend
hydra-evaluator
- Perl, C++
- fetches repositories
- evaluates job sets
  - pointers to a repository
- adds builds to the queue
hydra-queue-runner
- C++
- monitors the queue
- executes build steps
- uploads build results
  - copy to a Nix store
Nix store
- contains .drvs
- populated by hydra-evaluator
- read by hydra-queue-runner
destination Nix store
- can be a binary cache
- e.g. [cache.nixos.org](http://cache.nixos.org) or the same store again (for small Hydra instances)
plugin architecture
- extend evaluator for new kinds of repositories
  - e.g. fetch from git

Database Schema

https://github.com/NixOS/hydra/blob/master/src/sql/hydra.sql

Jobsets
- populated by calling Nix evaluator
- every Nix derivation in release.nix is a Job
- flake
  - URL to flake, if job is from a flake
  - single-point of configuration for flake builds
  - flake itself contains pointers to dependencies
  - for other builds we need more configuration data
JobsetInputs
- more configuration for a Job
JobsetInputAlts
- historical, where you could have more than one alternative for each input
- it would have done the cross product of all possibilities
- not used any more, as now every input is unique
- originally that was to have alternative values for the system parameter
  - x86-linux, x86_64-darwin
  - turned out not to be a good idea, as job set names did not uniquely identify output
Builds
- queue: scheduled and finished builds
- instance of a Job
- corresponds to a top-level derivation
  - can have many dependencies that don’t have a corresponding build
  - dependencies represented as BuildSteps
- a Job is all the builds with a particular name, e.g.
  - git.x86_64-linux is a job
  - there maybe be multiple builds for that job
    - build ID: just an auto-increment number
- building one thing can actually cause many (hundreds of) derivations to be built
- for queued builds, the drv has to be present in the store
  - otherwise build will fail, e.g. after garbage collection
BuildSteps
- corresponds to a derivation or substitution
- are reused through the Nix store
- may be duplicated for unique derivations due to how they relate to Jobs
BuildStepOutputs
- corresponds directly to derivation outputs
  - out, dev, ...
BuildProducts
- not a Nix concept
- populated from a special file $out/nix-support/hydra-build-producs
- used to scrape parts of build results out to the web frontend
  - e.g. manuals, ISO images, etc.
BuildMetrics
- scrapes data from magic location, similar to BuildProducts to show fancy graphs
  - e.g. test coverage, build times, CPU utilization for build
- $out/nix-support/hydra-metrics
BuildInputs
- probably obsolute
JobsetEvalMembers
- joins evaluations with jobs
- huge table, 10k’s of entries for one nixpkgs evaluation
- can be imagined as a subset of the eval cache
  - could in principle use the eval cache

`release.nix`

hydra-specific convention to describe the build
should evaluate to an attribute set that contains derivations
hydra considers every attribute in that set a job
every job needs a unique name
- if you want to build for multiple platforms, you need to reflect that in the name
hydra does a deep traversal of the attribute set
- just evaluating the names may take half an hour

FAQ

Can we imagine Hydra to be a persistence layer for the build graph?

partially, it lacks a lot of information
- does not keep edges of the build graph

How does Hydra relate to nix build?

reimplements the top level Nix build loop, scheduling, etc.
Hydra has to persist build results
Hydra has more sophisticated remote build execution and scheduling than Nix

Is it conceptually possible to unify Hydra’s capabilities with regular Nix?

Nix does not have any scheduling, it just traverses the build graph
Hydra has scheduling in terms of job set priorities, tracks how much of a job set it has worked on
- makes sure jobs don’t starve each other
Nix cannot dynamically add build jobs at runtime
- RFC 92 should enable that
- internally it is already possible, but there is no interface to do that
Hydra queue runner is a long running process
- Nix takes a static set of jobs, working it off at once

5 KiB Raw Blame History Unescape Escape

Architecture

Components

Database Schema

release.nix

FAQ

5 KiB

Raw Blame History

`release.nix`