SSH connection dropping when copying outputs should not abort the entire build tree #922

Closed
opened 2025-07-19 07:16:01 +00:00 by k900 · 9 comments
Member

When copying outputs from a remote builder on a bad connection, and the connection drops midway, the entire build tree is aborted with "truncated NAR encountered", followed by "some outputs are unexpectedly invalid".

The actual error is fine - the connection did drop, and we can't recover from that; however, killing all the concurrently running builds is bad.
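
To illustrate the behaviour being asked for, here is a toy sketch. It is not Lix's scheduler or its remote-build code; `TruncatedNar`, `copy_output`, and `run_goals` are made-up names, and the dropped connection is simulated. The point is only that a failed copy is recorded against its own goal while the other concurrent goals run to completion:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


class TruncatedNar(Exception):
    """Stand-in for the "truncated NAR encountered" error (hypothetical type)."""


def copy_output(name, fails):
    # Hypothetical copy step: raises if the simulated connection drops.
    if name in fails:
        raise TruncatedNar(f"truncated NAR encountered while copying {name}")
    return name


def run_goals(goals, fails):
    """Run every goal; record failures per goal instead of aborting them all."""
    succeeded, failed = [], {}
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {pool.submit(copy_output, g, fails): g for g in goals}
        for fut in as_completed(futures):
            goal = futures[fut]
            try:
                succeeded.append(fut.result())
            except TruncatedNar as err:
                # Only this goal is marked failed; the others keep running.
                failed[goal] = str(err)
    return sorted(succeeded), failed
```

With goals `a`, `b`, `c` and a simulated drop on `b`, `run_goals` reports `b` as failed and still returns `a` and `c` as succeeded.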

Owner

duplicate of #878?

Author
Member

Nope, this kills everything even with --keep-going.

Owner

well thank fuck this codebase has historically been a perfect fount of consistent behavior

Owner

re: `E/reproducible`, I don't think any of us has succeeded in reproducing this or in coming up with a clear reproducer. Until we have one, this is going to be hard to act on.

Owner

this may just be #928

Owner

@k900 tried to reproduce this today and could not. I don't know whether they were using the revision with the #928 fix.

Member

Seen on `nix (Lix, like Nix) 2.94.0-devpre20250723_020751c`.

The output on the remote builder wasn't written, but the coordinator was convinced it could pull it.

Had to `--repair` store paths on a number of machines; now waiting to see whether it reappears.

Member

Not seen again since.

raito added this to the 2.96 milestone 2026-03-01 11:25:49 +00:00
Owner

Given the absence of a reproducer, I'm closing this. Please feel free to reopen if it recurs.

raito closed this issue 2026-03-03 02:29:38 +00:00
Reference
lix-project/lix#922