forked from lix-project/lix
2ec9d2fb02
Once upon a time, I wrote my bachelors thesis about functional deployment mechanisms. I had to evaluate several szenarios where package management and deployment were relevant. One szenario was to do distributed builds over several machines. I told myself: Weee, nix can do this! And with nix, this is actually save, as you do not have side effects when building! So I started. I use a cloud to set up four virtual machines where I wanted to do the build. A fifth machine was used as master to distribute the builds. All was good. I created the necessary SSH keys, made sure every machine was reachable by the master and configured the build in my remotes.conf. When I started to try to build weechat from source, the build failed. It failed, telling me error: unable to start any build; either increase ‘--max-jobs’ or enable distributed builds And I started to dig around. I digged long and good. But I wasn't able to find the issue. I double and triple checked my environment variables, my settings, the SSH key and everything. I reached out to fellow Nixers by asking on the nixos IRC channel. And I got help. But we weren't able to find the issue, either. So I became frustrated. I re-did all the environment variables. And suddenly,... it worked! What did I change? Well... I made the environment variables which contained pathes contain absolute pathes rather than relatives. And because I like to share my knowledge, this should be put into the documentation, so others do not bang their heads against the wall because something is not documented somewhere.
117 lines
5.5 KiB
XML
117 lines
5.5 KiB
XML
<chapter xmlns="http://docbook.org/ns/docbook"
|
||
xmlns:xlink="http://www.w3.org/1999/xlink"
|
||
xmlns:xi="http://www.w3.org/2001/XInclude"
|
||
version="5.0"
|
||
xml:id='chap-distributed-builds'>
|
||
|
||
<title>Distributed Builds</title>
|
||
|
||
<para>Nix supports distributed builds, where a local Nix installation can
|
||
forward Nix builds to other machines over the network. This allows
|
||
multiple builds to be performed in parallel (thus improving
|
||
performance) and allows Nix to perform multi-platform builds in a
|
||
semi-transparent way. For instance, if you perform a build for a
|
||
<literal>powerpc-darwin</literal> on an <literal>i686-linux</literal>
|
||
machine, Nix can automatically forward the build to a
|
||
<literal>powerpc-darwin</literal> machine, if available.</para>
|
||
|
||
<para>You can enable distributed builds by setting the environment
|
||
variable <envar>NIX_BUILD_HOOK</envar> to point to a program that Nix
|
||
will call whenever it wants to build a derivation. The build hook
|
||
(typically a shell or Perl script) can decline the build, in which Nix
|
||
will perform it in the usual way if possible, or it can accept it, in
|
||
which case it is responsible for somehow getting the inputs of the
|
||
build to another machine, doing the build there, and getting the
|
||
results back. The details of the build hook protocol are described in
|
||
the documentation of the <link
|
||
linkend="envar-build-hook"><envar>NIX_BUILD_HOOK</envar>
|
||
variable</link>.</para>
|
||
|
||
<example xml:id='ex-remote-systems'><title>Remote machine configuration:
|
||
<filename>remote-systems.conf</filename></title>
|
||
<programlisting>
|
||
nix@mcflurry.labs.cs.uu.nl powerpc-darwin /home/nix/.ssh/id_quarterpounder_auto 2
|
||
nix@scratchy.labs.cs.uu.nl i686-linux /home/nix/.ssh/id_scratchy_auto 8 1 kvm
|
||
nix@itchy.labs.cs.uu.nl i686-linux /home/nix/.ssh/id_scratchy_auto 8 2
|
||
nix@poochie.labs.cs.uu.nl i686-linux /home/nix/.ssh/id_scratchy_auto 8 2 kvm perf
|
||
</programlisting>
|
||
</example>
|
||
|
||
<para>Nix ships with a build hook that should be suitable for most
|
||
purposes. It uses <command>ssh</command> and
|
||
<command>nix-copy-closure</command> to copy the build inputs and
|
||
outputs and perform the remote build. To use it, you should set
|
||
<envar>NIX_BUILD_HOOK</envar> to
|
||
<filename><replaceable>prefix</replaceable>/libexec/nix/build-remote.pl</filename>.
|
||
You should also define a list of available build machines and point
|
||
the environment variable <envar>NIX_REMOTE_SYSTEMS</envar> to it (the
|
||
path has to be absolute, otherwise nix will fail to distribute the build). An
|
||
example configuration is shown in <xref linkend='ex-remote-systems'
|
||
/>. Each line in the file specifies a machine, with the following
|
||
bits of information:
|
||
|
||
<orderedlist>
|
||
|
||
<listitem><para>The name of the remote machine, with optionally the
|
||
user under which the remote build should be performed. This is
|
||
actually passed as an argument to <command>ssh</command>, so it can
|
||
be an alias defined in your
|
||
<filename>~/.ssh/config</filename>.</para></listitem>
|
||
|
||
<listitem><para>A comma-separated list of Nix platform type
|
||
identifiers, such as <literal>powerpc-darwin</literal>. It is
|
||
possible for a machine to support multiple platform types, e.g.,
|
||
<literal>i686-linux,x86_64-linux</literal>.</para></listitem>
|
||
|
||
<listitem><para>The SSH private key to be used to log in to the
|
||
remote machine. Since builds should be non-interactive, this key
|
||
should not have a passphrase!</para></listitem>
|
||
|
||
<listitem><para>The maximum number of builds that
|
||
<filename>build-remote.pl</filename> will execute in parallel on the
|
||
machine. Typically this should be equal to the number of CPU cores.
|
||
For instance, the machine <literal>itchy</literal> in the example
|
||
will execute up to 8 builds in parallel.</para></listitem>
|
||
|
||
<listitem><para>The “speed factor”, indicating the relative speed of
|
||
the machine. If there are multiple machines of the right type, Nix
|
||
will prefer the fastest, taking load into account.</para></listitem>
|
||
|
||
<listitem><para>A comma-separated list of <emphasis>supported
|
||
features</emphasis>. If a derivation has the
|
||
<varname>requiredSystemFeatures</varname> attribute, then
|
||
<filename>build-remote.pl</filename> will only perform the
|
||
derivation on a machine that has the specified features. For
|
||
instance, the attribute
|
||
|
||
<programlisting>
|
||
requiredSystemFeatures = [ "kvm" ];
|
||
</programlisting>
|
||
|
||
will cause the build to be performed on a machine that has the
|
||
<literal>kvm</literal> feature (i.e., <literal>scratchy</literal> in
|
||
the example above).</para></listitem>
|
||
|
||
<listitem><para>A comma-separated list of <emphasis>mandatory
|
||
features</emphasis>. A machine will only be used to build a
|
||
derivation if all of the machine’s mandatory features appear in the
|
||
derivation’s <varname>requiredSystemFeatures</varname> attribute.
|
||
Thus, in the example, the machine <literal>poochie</literal> will
|
||
only do derivations that have
|
||
<varname>requiredSystemFeatures</varname> set to <literal>["kvm"
|
||
"perf"]</literal> or <literal>["perf"]</literal>.</para></listitem>
|
||
|
||
</orderedlist>
|
||
|
||
You should also set up the environment variable
|
||
<envar>NIX_CURRENT_LOAD</envar> to point at a directory (e.g.,
|
||
<filename>/var/run/nix/current-load</filename>) that
|
||
<filename>build-remote.pl</filename> uses to remember how many builds
|
||
it is currently executing remotely. It doesn't look at the actual
|
||
load on the remote machine, so if you have multiple instances of Nix
|
||
running, they should use the same <envar>NIX_CURRENT_LOAD</envar>
|
||
file. Maybe in the future <filename>build-remote.pl</filename> will
|
||
look at the actual remote load.</para>
|
||
|
||
</chapter>
|