In a Hydra instance I saw:
possibly transient failure building ‘/nix/store/X.drv’ on ‘localhost’:
dependency '/nix/store/Y' of '/nix/store/Y.drv' does not exist,
and substitution is disabled
This is confusing because the Hydra in question does have substitution enabled.
This instance uses:
keep-outputs = true
keep-derivations = true
and an S3 binary cache which is not configured as a substituter in the nix.conf.
It appears this instance encountered a situation where store path Y was built
and present in the binary cache, and Y.drv was GC rooted on the instance,
however Y was not on the host.
When Hydra would try to build this path locally, it would look in the binary
cache to see if it was cached:
(nix)
439 bool valid = isValidPathUncached(storePath);
440
441 if (diskCache && !valid)
442 // FIXME: handle valid = true case.
443 diskCache->upsertNarInfo(getUri(), hashPart, 0);
444
445 return valid;
Since it was cached, the store path was considered Valid.
The queue monitor would then not put this input in for substitution, because
the path is valid:
(hydra)
470 if (!destStore->isValidPath(*i.second.path(*localStore, step->drv->name, i.first))) {
471 valid = false;
472 missing.insert_or_assign(i.first, i.second);
473 }
Hydra appears to correctly handle the case of missing paths that need
to be substituted from the binary cache already, but since most
Hydra instances use `keep-outputs` *and* all paths in the binary cache
originate from that machine, it is not common for a path to be cached
and not GC rooted locally.
I'll run Hydra with this patch for a while and see if we run in to the
problem again.
A big thanks to John Ericson who helped debug this particular issue.
I'm not sure this is a good implementation as-is. It does work,
but the password gets echo'd to the screen. I tried to use IO::Prompt
but IO::Prompt really seems to want to read the password from ARGV.
Deleting jobsets first would fail because buildmetrics has an FK
to the jobset. However, the jobset / project relationship is not
marked as CASCADE.
Deleting all the builds automatically cascades to delete
buildmetrics, so deleting the relevant builds first, then deleting
the jobset solves it.
When having a builder like this in `/etc/nix/machines`
ssh://mfbuild?remote-store=/home/bosch/store
Hydra cannot build there since it tries to pass the entire value to
`ssh(1)` which doesn't work. Also, an alternate store-location is e.g.
used if the user isn't a trusted user on the remote system and thus
cannot use `/nix/store`.
If such a URI is given, Hydra will now add a `--store /home/bosch/store`
to the `ssh`-command to select the appropriate location remotely.
Using it causes database information to get fixated early, before tests can set a
new database. We only used it in one case, and that is an absolute reference anyway. The
tests for channel generation are passing, and that uses
[requireLocalStore, so this should be fine.
I'm honestly too lazy to create two commits for fixing these one-line
issues so here's one.
The first hunk fixes the name of the projectName input. This is relevant
now because it gets logged and the log message looks stupid when there
is an input without a name.
The second hunk fixes a warning when using declarative non-flake
jobsets. The implementation may look weird but it's just the same as the
logical implication operator of nix.
The indentation in the hydra.conf makes it possible to include multi-line
strings without it being likely that the contents of the tracker
is mis-parsed or interrupts tho config parser.
It isn't impossible / foolproof probably, but it shouldn't be likely.
Currently we only track how long individual plugins take.
With #1083 we stop executing a lot of plugins, but we
don't have a way to measure its practical impact on the
execution time of handling events.
This happens with flake jobsets for obvious reasons (namely, that nixexprinput
and nixexprpath may be undefined for a flake jobset).
12:38:59 hydra-evaluator.1 | Use of uninitialized value $args[0] in join or string at /home/vin/workspace/vcs/hydra/src/script/hydra-eval-jobset line 648.
12:38:59 hydra-evaluator.1 | Use of uninitialized value $args[1] in join or string at /home/vin/workspace/vcs/hydra/src/script/hydra-eval-jobset line 648.
11:38:20 hydra-server.1 | DEPRECATION WARNING: The Regex dispatch type is deprecated.
11:38:20 hydra-server.1 | It is recommended that you convert Regex and LocalRegex
11:38:20 hydra-server.1 | methods to Chained methods. at /nix/store/aa6gw57fnahd4824pbhmvcs0jlypmynq-hydra-perl-deps/lib/perl5/site_perl/5.32.1/Catalyst/DispatchType/Regex.pm line 210.
12:34:12 hydra-server.1 | Use of uninitialized value $s in substitution (s///) at /home/vin/workspace/vcs/hydra/src/script/../lib/Hydra/Helper/CatalystUtils.pm line 283, <$fh> line 1.
11:28:20 hydra-server.1 | [warn] Unicode::Encoding plugin is auto-applied, please remove this from your appclass and make sure to define "encoding" config
12:10:15 hydra-notify.1 | %channels_to_events{...} in scalar context better written as $channels_to_events{...} at /home/vin/workspace/vcs/hydra/src/lib/Hydra/Event.pm line 20.
At the moment, aggregate jobs can easily break and cause the entire
evaluation to fail, which is not ideal. For Nixpkgs, we do have some
important aggregate jobs (like `tested`), but for debugging and building
purposes it's still useful to get a partial result even if the channel
won't actually advance.
This commit changes the behaviour of hydra-eval-jobs such that it
aggregates any errors found during the construction of an aggregate, and
will instead annotate the job with the evaluation failure such that it
shows up in a "cleaner" way.
There are really two types of failure that we care about: one is where
the attribute just ends up missing altogether in the final output, and
also where the attribute is in the output but fails to evaluate. Both
are handled here.
Note that this does mean that the same error message may be output
multiple times, but this aids debuggability because it'll be much
clearer what's blocking the job from being created.
Fix parsing breakage from #1003: assigning the lines to $lines broke chomp and the filters.
This test validates the parsing works as expected, and also fixes
a minor bug where '-' in features isn't pruned, like in the C++
repo.
Fixes
mv: cannot move './static/bootstrap-4.3.1-dist' to './static/bootstrap/bootstrap-4.3.1-dist': Directory not empty
when 'make' is called more than once.
When I take a look at *all* failing builds (by clicking at `[...] more
jobs omitted`) and I try to compare the failures to another jobset, I'd
like to still view *all* failing builds in the compare-view.
This wasn't the case before since the `full=`-param was ignored by the
compare-buttons.
Yet again, manual testing is proving to be insufficient. I'm pretty
sure I wrote this code but lost it in a rebase, or perhaps the switch
to result classes.
At any rate, this implements the actual "fetch a retry row and run it"
for the hydra-notify daemon.
Tested by hand.
Get the number of seconds before the next retriable task is ready.
This number is specifically intended to be used as a timeout, where
`undef` means never time out.
This gives us a place to put helper functions that act on entire
tables, not just individual records.
This should be a backwards compatible change, except in places we're
manually using result class names.
Declarative jobsets were sort of tucked in to the event hanlder
itself. It turned out that it could have been implemented as a
plugin without much trouble.
Without this commit, two jobsets using the same repository as input,
but different `deepClone` options, end up incorrectly sharing the same
"checkout" for a given (`uri`, `branch`, `revision`) tuple. The
presence or absence of `.git` is determined by the jobset execution
order.
This patch adds the missing `isDeepClone` boolean to the cache key.
The database upgrade script empties the `CachedGitInputs` table, as we
don't know if existing checkouts are deep clones. Unfortunately, this
generally forces rebuilds even for correct `deepClone` checkouts, as
the binary contents of `.git` are not deterministic.
Fixes#510
I broke this when I added `me.` in f1e75c8bff
I added me. to disambiguate `id`, but:
* eval.id works on the per-build page
* me.id works on the other pages
* Just id works everywhere if I drop:
, prefetch => { evaluationerror => [ ] },
but this causes a query per row to collect the evaluationerror
records later, this becomes significantly slow on non-trivial
datasets.
Using evals->current_source_alias will use the correct alias
whether it is me or eval or something else.
Exposes metrics:
* http_request_duration_seconds_bucket
* http_request_size_bytes_bucket
* http_response_size_bytes_bucket
* http_requests_total
with labels of action and controller to help identify popular
endpoints and their performance characteristics.
If the project isn't declarative, who cares about it in the response? After
setting the `declfile` to an empty string, everything related to declarative-
ness is wiped out, anyways.
It appears the Jobs table was removed in
8adb433e3b, but the Jobsets schema was never
updated to reflect this. This relationship was added in
efa1f1d4fb, roughly 3 months prior.
Previously, one would see a message similar to the following logged when
deleting a jobset:
17:38:23 hydra-server.1 | DBIx::Class::Relationship::CascadeActions::delete(): Skipping cascade delete on relationship 'jobs' - related resultsource 'Hydra::Schema::Jobs' is not registered with this schema at /home/vin/workspace/vcs/hydra/src/script/../lib/Hydra/Controller/Jobset.pm line 106
Something in the upgrade of Bootstrap and JQuery broke lazy tab loading.
I don't understand what is providing the tab behavior, how it should
work, or what the correct fix is.
I can tell you that this patch fixes the issue: when loading a tab
with a URL fragment deep-linking to a lazily loaded tab... it now
loads.
Close#959
This appears to have been broken in ac3e8a4a59,
which removed the `jobsetevals` column from the Projects schema, but didn't
update the Controller accordingly.
Fixes the test added in the previous commit.
To further align with the API, we return custom JSON in order to display a
`visible` field rather than `hidden` -- a `PUT` request expects `visible`, while
a `GET` request returns `hidden`.
This also allows us to rename the `jobsetinputs` field to `inputs` for the same
reason: `PUT` expects `inputs`, while `GET` returns `jobsetinputs`.
`PUT /jobsets/{project-id}/{jobset-id}` expects a JSON object `inputs` which
maps a name to a name, a type, a value, and a boolean that enables emailing
responsible parties. However, `GET /jobsets/{project-id}/{jobset-id}` responds
with an object that doesn't contain a value, but does contain a jobsetinputalts
(which is old and should be unused).
This commit aligns the two by removing the old and unused `jobsetinputalts` from
the response and replaces it with `value`.