forked from lix-project/hydra
build-remote: copy missing paths from the binary cache to localhost
In a Hydra instance I saw: possibly transient failure building ‘/nix/store/X.drv’ on ‘localhost’: dependency '/nix/store/Y' of '/nix/store/Y.drv' does not exist, and substitution is disabled This is confusing because the Hydra in question does have substitution enabled. This instance uses: keep-outputs = true keep-derivations = true and an S3 binary cache which is not configured as a substituter in the nix.conf. It appears this instance encountered a situation where store path Y was built and present in the binary cache, and Y.drv was GC rooted on the instance, however Y was not on the host. When Hydra would try to build this path locally, it would look in the binary cache to see if it was cached: (nix) 439 bool valid = isValidPathUncached(storePath); 440 441 if (diskCache && !valid) 442 // FIXME: handle valid = true case. 443 diskCache->upsertNarInfo(getUri(), hashPart, 0); 444 445 return valid; Since it was cached, the store path was considered Valid. The queue monitor would then not put this input in for substitution, because the path is valid: (hydra) 470 if (!destStore->isValidPath(*i.second.path(*localStore, step->drv->name, i.first))) { 471 valid = false; 472 missing.insert_or_assign(i.first, i.second); 473 } Hydra appears to correctly handle the case of missing paths that need to be substituted from the binary cache already, but since most Hydra instances use `keep-outputs` *and* all paths in the binary cache originate from that machine, it is not common for a path to be cached and not GC rooted locally. I'll run Hydra with this patch for a while and see if we run in to the problem again. A big thanks to John Ericson who helped debug this particular issue.
This commit is contained in:
parent
952f629b7c
commit
7e9e82398d
|
@ -271,7 +271,11 @@ void State::buildRemote(ref<Store> destStore,
|
||||||
}
|
}
|
||||||
|
|
||||||
/* Copy the input closure. */
|
/* Copy the input closure. */
|
||||||
if (!machine->isLocalhost()) {
|
if (machine->isLocalhost()) {
|
||||||
|
StorePathSet closure;
|
||||||
|
destStore->computeFSClosure(inputs, closure);
|
||||||
|
copyPaths(*destStore, *localStore, closure, NoRepair, NoCheckSigs, NoSubstitute);
|
||||||
|
} else {
|
||||||
auto mc1 = std::make_shared<MaintainCount<counter>>(nrStepsWaiting);
|
auto mc1 = std::make_shared<MaintainCount<counter>>(nrStepsWaiting);
|
||||||
mc1.reset();
|
mc1.reset();
|
||||||
MaintainCount<counter> mc2(nrStepsCopyingTo);
|
MaintainCount<counter> mc2(nrStepsCopyingTo);
|
||||||
|
|
Loading…
Reference in a new issue