Periodically, I have seen tests fail because of out of order queue runner behavior:
checking the queue for builds > 0...
loading build 1 (tests:basic:empty_dir)
aborting unsupported build step '...-empty-dir.drv' (type 'x86_64-linux')
marking build 1 as failed
adding new machine ‘localhost’
This patch should prevent the dispatcher from running before any machines are
made available.
This in-progress feature will run a dynamically generated set of
buildFinished hooks, which must be nested under the `runCommandHook.*`
attribute set. This implementation is not very good, with some to-dos:
1. Only run if the build succeeded
2. Verify the output is named $out and that it is an executable file
(or a symlink to a file)
3. Require the jobset itself have a flag enabling the feature, since
this feature can be a bit dangerous if various people of different
trust levels can create the jobs.
This shouldn't be possible normally, but it is possible to:
$db->resultset('RunCommandLogs')->new({ uuid => "../etc/passwd" });
if you have access to the `$db`.
Also split it out to a new div -- there are now 3 lines per
RunCommandLog -- the first saying when it started, the second saying how
long it ran for (or has been running), and the third with the buttons
for the pretty, raw, and tail versions of the log.
This also adds the `runcommandlog` object to the stash so that we can
access its uuid as well as command run in order to display more useful
and specific information on the webpage.
Using a sha1 of the command combined with the build ID is not a
particularly good or unique identifier:
* A build could fail, be restarted, and then succeed -- assuming no
configuration changes, the sha1 hash of the command as well as the build
ID will be the same. This would lead to an overwritten log file.
* Allowing user input to influence filenames is not the best of ideas.
Since breaking the filename construction out to a helper function,
Hydra::Model::DB is no longer used. Importing Hydra::Helper::Nix,
however, has the potential to break tests, so just use the functions we
need without importing the entire module.
run3 just seems to do better handling for what we want to do, and
requires less deep-reaching changes to this plugin to get it to play
nice, as IPC::Run::run would.
This uses the somewhat restrictive umask of 0027 so that people outside
the user or group cannot read the files. This also helps to inhibit
TOCTOU where someone else has a handle to our file before we chmod it
and after we close it.
In a Hydra instance I saw:
possibly transient failure building ‘/nix/store/X.drv’ on ‘localhost’:
dependency '/nix/store/Y' of '/nix/store/Y.drv' does not exist,
and substitution is disabled
This is confusing because the Hydra in question does have substitution enabled.
This instance uses:
keep-outputs = true
keep-derivations = true
and an S3 binary cache which is not configured as a substituter in the nix.conf.
It appears this instance encountered a situation where store path Y was built
and present in the binary cache, and Y.drv was GC rooted on the instance,
however Y was not on the host.
When Hydra would try to build this path locally, it would look in the binary
cache to see if it was cached:
(nix)
439 bool valid = isValidPathUncached(storePath);
440
441 if (diskCache && !valid)
442 // FIXME: handle valid = true case.
443 diskCache->upsertNarInfo(getUri(), hashPart, 0);
444
445 return valid;
Since it was cached, the store path was considered Valid.
The queue monitor would then not put this input in for substitution, because
the path is valid:
(hydra)
470 if (!destStore->isValidPath(*i.second.path(*localStore, step->drv->name, i.first))) {
471 valid = false;
472 missing.insert_or_assign(i.first, i.second);
473 }
Hydra appears to correctly handle the case of missing paths that need
to be substituted from the binary cache already, but since most
Hydra instances use `keep-outputs` *and* all paths in the binary cache
originate from that machine, it is not common for a path to be cached
and not GC rooted locally.
I'll run Hydra with this patch for a while and see if we run in to the
problem again.
A big thanks to John Ericson who helped debug this particular issue.
I'm not sure this is a good implementation as-is. It does work,
but the password gets echo'd to the screen. I tried to use IO::Prompt
but IO::Prompt really seems to want to read the password from ARGV.
Deleting jobsets first would fail because buildmetrics has an FK
to the jobset. However, the jobset / project relationship is not
marked as CASCADE.
Deleting all the builds automatically cascades to delete
buildmetrics, so deleting the relevant builds first, then deleting
the jobset solves it.
When having a builder like this in `/etc/nix/machines`
ssh://mfbuild?remote-store=/home/bosch/store
Hydra cannot build there since it tries to pass the entire value to
`ssh(1)` which doesn't work. Also, an alternate store-location is e.g.
used if the user isn't a trusted user on the remote system and thus
cannot use `/nix/store`.
If such a URI is given, Hydra will now add a `--store /home/bosch/store`
to the `ssh`-command to select the appropriate location remotely.
Using it causes database information to get fixated early, before tests can set a
new database. We only used it in one case, and that is an absolute reference anyway. The
tests for channel generation are passing, and that uses
[requireLocalStore, so this should be fine.
I'm honestly too lazy to create two commits for fixing these one-line
issues so here's one.
The first hunk fixes the name of the projectName input. This is relevant
now because it gets logged and the log message looks stupid when there
is an input without a name.
The second hunk fixes a warning when using declarative non-flake
jobsets. The implementation may look weird but it's just the same as the
logical implication operator of nix.
The indentation in the hydra.conf makes it possible to include multi-line
strings without it being likely that the contents of the tracker
is mis-parsed or interrupts tho config parser.
It isn't impossible / foolproof probably, but it shouldn't be likely.
Currently we only track how long individual plugins take.
With #1083 we stop executing a lot of plugins, but we
don't have a way to measure its practical impact on the
execution time of handling events.
This happens with flake jobsets for obvious reasons (namely, that nixexprinput
and nixexprpath may be undefined for a flake jobset).
12:38:59 hydra-evaluator.1 | Use of uninitialized value $args[0] in join or string at /home/vin/workspace/vcs/hydra/src/script/hydra-eval-jobset line 648.
12:38:59 hydra-evaluator.1 | Use of uninitialized value $args[1] in join or string at /home/vin/workspace/vcs/hydra/src/script/hydra-eval-jobset line 648.
11:38:20 hydra-server.1 | DEPRECATION WARNING: The Regex dispatch type is deprecated.
11:38:20 hydra-server.1 | It is recommended that you convert Regex and LocalRegex
11:38:20 hydra-server.1 | methods to Chained methods. at /nix/store/aa6gw57fnahd4824pbhmvcs0jlypmynq-hydra-perl-deps/lib/perl5/site_perl/5.32.1/Catalyst/DispatchType/Regex.pm line 210.
12:34:12 hydra-server.1 | Use of uninitialized value $s in substitution (s///) at /home/vin/workspace/vcs/hydra/src/script/../lib/Hydra/Helper/CatalystUtils.pm line 283, <$fh> line 1.
11:28:20 hydra-server.1 | [warn] Unicode::Encoding plugin is auto-applied, please remove this from your appclass and make sure to define "encoding" config
12:10:15 hydra-notify.1 | %channels_to_events{...} in scalar context better written as $channels_to_events{...} at /home/vin/workspace/vcs/hydra/src/lib/Hydra/Event.pm line 20.
At the moment, aggregate jobs can easily break and cause the entire
evaluation to fail, which is not ideal. For Nixpkgs, we do have some
important aggregate jobs (like `tested`), but for debugging and building
purposes it's still useful to get a partial result even if the channel
won't actually advance.
This commit changes the behaviour of hydra-eval-jobs such that it
aggregates any errors found during the construction of an aggregate, and
will instead annotate the job with the evaluation failure such that it
shows up in a "cleaner" way.
There are really two types of failure that we care about: one is where
the attribute just ends up missing altogether in the final output, and
also where the attribute is in the output but fails to evaluate. Both
are handled here.
Note that this does mean that the same error message may be output
multiple times, but this aids debuggability because it'll be much
clearer what's blocking the job from being created.
Fix parsing breakage from #1003: assigning the lines to $lines broke chomp and the filters.
This test validates the parsing works as expected, and also fixes
a minor bug where '-' in features isn't pruned, like in the C++
repo.