This reverts commit 69b8f9980f.
The timeout should be enforced remotely. Otherwise, if the garbage
collector is running either locally or remotely, if will block the
build or closure copying for some time. If the garbage collector
takes too long, the build may time out, which is not what we want.
Also, on heavily loaded systems, copying large paths to and from the
remote machine can take a long time, also potentially resulting in a
timeout.
The "$UID != 0" makes no sense: if the local side has write access to
the Nix store (which is always the case) then it doesn't matter if
we're root - we can import unsigned paths either way.
Otherwise it will set the parent's stdin to non-blocking mode, causing
the subsequent read of the set of inputs/outputs to fail randomly.
That's insane.
Before selecting a machine, build-remote.pl will try to run the
command "nix-builds-inhibited" on the machine. If this command exists
and returns a 0 exit code, then the machine won't be used. It's up to
the user to provide this command, but it would typically be a script
that checks whether there is enough disk space and whether the load is
not too high.
Don't pass --timeout / --max-silent-time to the remote builder.
Instead, let the local Nix process terminate the build if it exceeds a
timeout. The remote builder will be killed as a side-effect. This
gives better error reporting (since the timeout message from the
remote side wasn't properly propagated) and handles non-Nix problems
like SSH hangs.
Note that this will only work if the client has a very recent Nix
version (post 15e1b2c223), otherwise the
--option flag will just be ignored.
Fixes#50.
This handles the chroot and build hook cases, which are easy.
Supporting the non-chroot-build case will require more work (hash
rewriting!).
Issue #21.
Mandatory features are features that MUST be present in a derivation's
requiredSystemFeatures attribute. One application is performance
testing, where we have a dedicated machine to run performance tests
(and nothing else). Then we would add the label "perf" to the
machine's mandatory features and to the performance testing
derivations.
closure to a given machine at the same time. This prevents the case
where multiple instances try to copy the same missing store path to
the target machine, which is very wasteful.
‘nix-store --export’.
* Add a Perl module that provides the functionality of
‘nix-copy-closure --to’. This is used by build-remote.pl so it no
longer needs to start a separate nix-copy-closure process. Also, it
uses the Perl API to do the export, so it doesn't need to start a
separate nix-store process either. As a result, nix-copy-closure
and build-remote.pl should no longer fail on very large closures due
to an "Argument list too long" error. (Note that having very many
dependencies in a single derivation can still fail because the
environment can become too large. Can't be helped though.)
hook script proper, and the stdout/stderr of the builder. Only the
latter should be saved in /nix/var/log/nix/drvs.
* Allow the verbosity to be set through an option.
* Added a flag --quiet to lower the verbosity level.
it requires a certain feature on the build machine, e.g.
requiredSystemFeatures = [ "kvm" ];
We need this in Hydra to make sure that builds that require KVM
support are forwarded to machines that have KVM support. Probably
this should also be enforced for local builds.
the hook every time we want to ask whether we can run a remote build
(which can be very often), we now reuse a hook process for answering
those queries until it accepts a build. So if there are N
derivations to be built, at most N hooks will be started.
load on the Hydra build farm (where it's unnecessary anyway because
it has a fast connection to the build machines). In any case,
compression can be enabled by using the `-C' option to ssh.
giving jobs to the first machine until it hits its job limit, then
the second machine and so on. This should improve utilisation of
the Hydra build farm a lot. Also take an optional speed factor
into account to cause fast machines to be preferred over slower
machines with a similar load.
(that is, call the build hook with a certain interval until it
accepts the build).
* build-remote.pl was totally broken: for all system types other than
the local system type, it would send all builds to the *first*
machine of the appropriate type.