Resource temporarily unavailable and stack trace on running nix copy to S3 bucket #1168
Reference
lix-project/lix#1168
Describe the bug
While trying to copy data to my S3 bucket (nc.benary.org), Lix fails as described below.
I'm specifically running this here (to be consistent with what my Hydra most likely uses):
This happened many times before, with different derivations being copied and at different points in the copy process, and it is usually enough to run it in a loop like this: until timeout -k 16 64 $command; do sleep 4; done
I have recently messed up my S3 bucket while testing the new Ceph release, so there could be some malformed data on the other side, or there could be unexpected errors. However, neither of those should lead to Lix giving me a full-on stack trace.
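For anyone wanting to use the same workaround, the retry loop described above can be sketched as a small bash helper. This is only an illustration, not part of the original report: the function name retry_until_ok and the attempt limit are assumptions added here (the original loop retries indefinitely), and it assumes GNU coreutils timeout is available.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original report): rerun a command until it
# succeeds, bounding each attempt with GNU coreutils `timeout`.
# Usage: retry_until_ok MAX_ATTEMPTS PAUSE_SECS RUN_SECS KILL_AFTER_SECS COMMAND [ARGS...]
retry_until_ok() {
  local max_attempts=$1 pause=$2 run_for=$3 kill_after=$4
  shift 4
  local attempt=1 status
  # timeout sends SIGTERM after run_for seconds and SIGKILL kill_after seconds later.
  until timeout -k "$kill_after" "$run_for" "$@"; do
    status=$?
    echo "attempt $attempt failed with status $status" >&2
    # The attempt limit is an addition; the loop in the report has no upper bound.
    if [ "$attempt" -ge "$max_attempts" ]; then
      return "$status"
    fi
    attempt=$((attempt + 1))
    sleep "$pause"
  done
  return 0
}
```

With the values from the report, a call would look like retry_until_ok 10 4 64 16 followed by the copy command and its arguments, where 10 is an arbitrary attempt limit. Passing the command as separate arguments instead of an unquoted $command avoids word-splitting surprises.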
I am still trying to reproduce the issue while running with -vvvvv, but no luck so far.
Opening this issue ahead of time for discoverability in case someone else is having similar issues, but also on the off chance that someone might already figure out something from the stack trace.
full exception/stack trace
It fails basically in the middle of it, with the progress being stuck on the start of the line and the rest being the first line of this text:
Steps To Reproduce
Unable to reproduce reliably.
Basically every once in a while my Hydra fails due to some NAR files missing from the bucket (which is what I'm trying to fix using the copy).
When I then run the copy, it tends (not always) to error out with the stack trace. Upon rerunning, it continues to copy some files before failing again, until it eventually completes successfully.
Expected behavior
Lix should either give me a tangible error which could help me fix whatever issue it is encountering, or succeed to begin with.
It most certainly should not print a stack trace (although to be fair that's still better than failing silently).
nix --version output
lix-2.96.0-dev-35b7765 (35b7765 on top of NixOS 25.11; I can switch to any git commit or add patches for testing)
Additional context
The closest issues I found were a really old CppNix issue with different semantics and no stack trace, and the CppNix issue about the fork limit not being high enough. Since my limit is set to 1048576, the latter should not be related.
If you want/need a bucket for testing I can provide an HDD backed bucket with virtually no space limit running on the exact same Ceph cluster/radosgw instance.
I've just encountered a similar error which may be related, but I'm not entirely sure, since that one required a Ctrl+C specifically (thus triggering #1023), which I am fairly (but not 100%) certain that this issue did not trigger.
Anyway, at some point during a different copy, the progress just hung seemingly forever, with the -vvvvv output showing a semblance of the following in a loop:
However, I can imagine something else (I dunno, SIGWINCH or whatever) triggering the same thing in my earlier issue.
Still waiting for the stars to align to allow me to reproduce the issue without Ctrl+C with -vvvvv.