Make sure that the derivation we send to the remote builder is exactly
the one that we want to build locally so that the output ids are exactly
the same
Fix#4845
This is technically a breaking change, since attempting to set plugin
files after the first non-flag argument will now throw an error. This
is acceptable given the relative lack of stability in a plugin
interface and the need to tie the knot somewhere once plugins can
actually define new subcommands.
- Pass it the name of the outputs rather than their output paths (as
these don't exist for ca derivations)
- Get the built output paths from the remote builder
- Register the new received realisations
Changes:
* The divider lines are gone. These were in practice a bit confusing,
in particular with --show-trace or --keep-going, since then there
were multiple lines, suggesting a start/end which wasn't the case.
* Instead, multi-line error messages are now indented to align with
the prefix (e.g. "error: ").
* The 'description' field is gone since we weren't really using it.
* 'hint' is renamed to 'msg' since it really wasn't a hint.
* The error is now printed *before* the location info.
* The 'name' field is no longer printed since most of the time it
wasn't very useful since it was just the name of the exception (like
EvalError). Ideally in the future this would be a unique, easily
googleable error ID (like rustc).
* "trace:" is now just "…". This assumes error contexts start with
something like "while doing X".
Example before:
error: --- AssertionError ---------------------------------------------------------------------------------------- nix
at: (7:7) in file: /home/eelco/Dev/nixpkgs/pkgs/applications/misc/hello/default.nix
6|
7| x = assert false; 1;
| ^
8|
assertion 'false' failed
----------------------------------------------------- show-trace -----------------------------------------------------
trace: while evaluating the attribute 'x' of the derivation 'hello-2.10'
at: (192:11) in file: /home/eelco/Dev/nixpkgs/pkgs/stdenv/generic/make-derivation.nix
191| // (lib.optionalAttrs (!(attrs ? name) && attrs ? pname && attrs ? version)) {
192| name = "${attrs.pname}-${attrs.version}";
| ^
193| } // (lib.optionalAttrs (stdenv.hostPlatform != stdenv.buildPlatform && !dontAddHostSuffix && (attrs ? name || (attrs ? pname && attrs ? version)))) {
Example after:
error: assertion 'false' failed
at: (7:7) in file: /home/eelco/Dev/nixpkgs/pkgs/applications/misc/hello/default.nix
6|
7| x = assert false; 1;
| ^
8|
… while evaluating the attribute 'x' of the derivation 'hello-2.10'
at: (192:11) in file: /home/eelco/Dev/nixpkgs/pkgs/stdenv/generic/make-derivation.nix
191| // (lib.optionalAttrs (!(attrs ? name) && attrs ? pname && attrs ? version)) {
192| name = "${attrs.pname}-${attrs.version}";
| ^
193| } // (lib.optionalAttrs (stdenv.hostPlatform != stdenv.buildPlatform && !dontAddHostSuffix && (attrs ? name || (attrs ? pname && attrs ? version)))) {
This seems more correct. It also means one can specify the features a
store should support with --store and remote-store=..., which is useful.
I use this to clean up the build remotes test.
Most functions now take a StorePath argument rather than a Path (which
is just an alias for std::string). The StorePath constructor ensures
that the path is syntactically correct (i.e. it looks like
<store-dir>/<base32-hash>-<name>). Similarly, functions like
buildPaths() now take a StorePathWithOutputs, rather than abusing Path
by adding a '!<outputs>' suffix.
Note that the StorePath type is implemented in Rust. This involves
some hackery to allow Rust values to be used directly in C++, via a
helper type whose destructor calls the Rust type's drop()
function. The main issue is the dynamic nature of C++ move semantics:
after we have moved a Rust value, we should not call the drop function
on the original value. So when we move a value, we set the original
value to bitwise zero, and the destructor only calls drop() if the
value is not bitwise zero. This should be sufficient for most types.
Also lots of minor cleanups to the C++ API to make it more modern
(e.g. using std::optional and std::string_view in some places).
It could happen that the local builder match the system but lacks some features.
Now it results a failure.
The fix gracefully excludes the local builder from the set of available builders for derivation which requires the feature, so the derivation is built on remote builders only (as though it has incompatible system, like ```aarch64-linux``` when local is x86)
The existing ordering linked `libutil` before `libstore`, which causes
link failures when building statically. This is due to `libstore` using
functions from `libutil`, and the fact that symbol resolution works
"forward" - i.e. if you pass `-lfoo -lbar -lbaz`, any symbols that
`libbar` uses from `libbaz` will be resolved, but symbols from `libfoo`
will not since it comes first in the command line.
All this to say: this commit reorders the libraries which fixes the link
errors.
E.g.
cannot build on 'ssh://mac1': cannot connect to 'mac1': bash: nix-store: command not found
cannot build on 'ssh://mac2': cannot connect to 'mac2': Host key verification failed.
cannot build on 'ssh://mac3': cannot connect to 'mac3': Received disconnect from 213... port 6001:2: Too many authentication failures
Authentication failed.
The storeUri variable in the build-remote hook is declared very much to
the start of the main function and a bunch of lines later, the same
variable gets checked via hasPrefix() but it gets assigned *after* that
check when the most suitable machine for the build was choosen.
So I guess this was just a typo in d16fd24973
and what we really want is to either checkd the prefix *after* assigning
storeUri or use bestMachine->storeUri directly.
I choose the latter, because the former could introduce even more
regressions if the try block where the variable gets assigned terminates
early.
Nevertheless, the reason why the log output didn't work is because
hasPrefix() checked for "ssh://" in front of storeUri, but if the
storeUri isn't set correctly (or at all), we don't get the log file
descriptor set up properly, leading to no log output.
I've adjusted the remote-builds test to include a regression test for
this, so that we can make sure we get a build output when using remote
builds.
In addition to that I've tested this with two of my build farms and the
build logs are emitted correctly again.
Signed-off-by: aszlig <aszlig@nix.build>