Test failure of Functional2 in CI (and potentially elsewhere) #1017
Reference
lix-project/lix#1017
Describe the bug
Darwin builders sometimes fail the timeout tests: they emit an unexpected error message, which causes the test to fail.
See Exhibit A and Exhibit B.
It is currently unknown whether this behavior was already present in F1 and was only copied over, or whether it was introduced by the migration to F2.
Steps To Reproduce
Expected behavior
CI should not fail. Either the underlying issue is fixed or, if this is expected behavior, the tests accept the message that currently causes the failure.
Additional context
See the functional2 room on Matrix.
We would guess this is kill-order dependent. During sandbox teardown we kill the process group, but we think it's possible that in a process tree
nix-daemon < sandbox parent < sandbox child
the child gets killed first and the parent propagates the exit code to the daemon before it too gets killed. If that's the case (which should be easy to verify for folks running Darwin), there's nothing we can do about this and it is not an error. (We can't rely on a signal exit being 128 + n, because that's a shell-specific convention.)
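To make the hypothesis above concrete, here is a minimal Python sketch of the two statuses an observer can see. It is not Lix code: `CHILD`, `direct_status` and `relayed_status` are hypothetical names, and the relaying parent only stands in for the sandbox parent. The `128 + n` mapping is something the relay chooses to do, which is why it can't be assumed in general.

```python
import signal
import subprocess
import sys

# Stand-in for the sandbox child: a process that dies from SIGKILL.
CHILD = [sys.executable, "-c",
         "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]


def direct_status() -> int:
    """Observe the child directly: Python reports death by signal n as -n."""
    return subprocess.run(CHILD).returncode


def relayed_status() -> int:
    """Observe the child through a parent that exits normally.

    The hypothetical parent waits for the child and then exits with an
    ordinary exit code. Here it mimics the shell convention 128 + n, but
    any other mapping would be just as legal.
    """
    parent_src = (
        "import subprocess, sys\n"
        "rc = subprocess.run(sys.argv[1:]).returncode\n"
        "sys.exit(128 - rc if rc < 0 else rc)\n"
    )
    return subprocess.run([sys.executable, "-c", parent_src, *CHILD]).returncode


if __name__ == "__main__":
    print("direct:", direct_status())
    print("relayed:", relayed_status())
```

If the kill order varies, the daemon would see either kind of status for the same teardown, so a test that accepts only one message would fail intermittently.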