Build hangs on macos/darwin #966
3 participants
Reference: lix-project/lix#966
Describe the bug
Just doing a regular
nix flake update
followed by a darwin-rebuild switch --flake ~/.config/nix-darwin --max-jobs 10
The build hangs on the last install check.
It's been like that for a few hours now. Cancelling and rebuilding doesn't seem to fix it.
Here is the diff of my flake.lock if it helps:
Steps To Reproduce
1f47ecef4ef5f67b34653b9c61d6d99b2720eb44
Expected behavior
Build finishes as normal and switches to the new version
nix --version
output
could you share your /etc/nix/nix.conf? cc @emilazy
yeah here you go
and here is my nix-darwin configuration file from the flake
Alright, no sandboxing so… our CI should catch your hang… Urgh.
@imran-iq could you perhaps provide a build with -L where we can see everything? thanks!
Sure thing, I am attaching it as a file as the output is long, but basically at that last test it seems to just wait forever
Did a bisect to find where the build starts hanging for me and it's this:
start:
c3bfb6fe17
end:
1f47ecef4e
Urgh, I will send a revert then I think and ping the developer for it.
This issue was mentioned on Gerrit on the following CLs:
makeTemp{,Sibling}Path callers
makeTemp{,Sibling}Path more
makeTempPath in createTempSubdir
createTempDir interface
makeTempSiblingPath helper
makeTempSiblingPath in replaceValidPath
Hash
What's funny though is I can build past it, e.g.:
001c70d2ba
so maybe a non-issue, just don't try to upgrade to the previously mentioned sha
@imran-iq wrote in #966 (comment):
It clearly doesn't spark joy, I wonder if there's a fix somewhere after that commit SHA1
Ok so the plot thickens.
With
001c70d2ba
I am unable to run the following (followed by a darwin-rebuild build, which is how I was bisecting).
However, if I roll back to an earlier version I can.
so something is up this week with macos and tmp dirs
so:
1f47ecef4e
hangs on the final test
d8b1fb7799
(next commit) does not hang on the test, I have no idea why
but anything newer than
1f47ecef4e
causes stuff like
nix flake update
to fail with
Cannot extract through symlink
looks like
/tmp
is a symlink
which reminds me that there was some create volume thing needed to make /nix work on macos, maybe something changed in macos 15.5 (not an apple person, this is a work computer).
so maybe we might need that revert after all? (i.e. can't extract to /tmp because of symlink issues)
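For anyone wanting to check the /tmp observation on their own machine, here is a minimal sketch. It assumes nothing about Lix itself; describe_path is a hypothetical helper name made up for illustration. On macOS, /tmp is normally a symlink to private/tmp, which is consistent with the error above, though this alone does not show which code path trips on it.

```shell
# Hypothetical helper (not part of Lix or nix-darwin): report whether a path
# is a symlink and, if so, where it points.
describe_path() {
  if [ -L "$1" ]; then
    printf '%s is a symlink to %s\n' "$1" "$(readlink "$1")"
  else
    printf '%s is not a symlink\n' "$1"
  fi
}

describe_path /tmp
```

The same helper can be pointed at any temporary directory a failing command reports.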
Thanks for the debugging, I will try to take a look tomorrow and see how we can push the needle. If you can try to revert the
1f47ecef4
and run with it, I believe you will have the same "cannot extract through symlink" though which might be related to another series pertaining to macOS fixes.
cc @emilazy
@imran-iq A revert was applied, can you tell me if you still encounter weird issues?
I can confirm that I am not getting the
Cannot extract through symlink
error with the revert (i.e. everything seems to be working afaict)
Closing for the time being then, thank you! Please reopen if you notice anything wrong.
As requested I am re-opening.
Even with newer builds the test will sometimes hang (seems to be a race condition). I ran
ps -ef
as the test hangs and that leads me here:
which then led me to this test:
chmod 0755 "$BUILD_DIR"
FIFO="$BUILD_DIR/fifo"
mkfifo "$FIFO"
(
echo > "$FIFO"
trap 'echo > "$FIFO"' EXIT
mode=$(stat -c %a $BUILD_DIR/b/*)
[ "$mode" = "700" -o "$mode" = "710" ]
) &
nix build --build-dir "$BUILD_DIR/b" -E '
with import ./config.nix; mkDerivation {
name = "test";
buildCommand = "cat '"$FIFO"'; cat '"$FIFO"' > $out";
}' \
--extra-sandbox-paths "$FIFO" \
--impure \
--no-link
wait
I am not too familiar with fifos/bash, but judging from the output of ps, the test seems to be hanging on the second cat.
My theory (assuming cat consumes everything on the fifo) is that the subprocess finishes before the nix build gets a chance to run, so the first cat ends up consuming everything, leaving nothing for the second cat, and hence the build hangs.
well the fifo is also gone from the build directory:
so (I think) the only way for me to unstick the build is to kill the cat process and fail the build
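The theory about the first cat swallowing both writes can be tried outside of nix with a small sketch. This is not the Lix test itself: it assumes the race amounts to both writers opening the FIFO while the first reader still has it open, and it forces that ordering artificially by holding a spare write descriptor (fd 3) open. All file and variable names here are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: forces the ordering that the suspected race would produce by timing.
set -u

dir=$(mktemp -d)
fifo="$dir/fifo"
mkfifo "$fifo"

# Stand-in for the first `cat` in buildCommand: a single reader on the FIFO.
cat "$fifo" > "$dir/first" &
first=$!

# Hold a write end open so the reader cannot see EOF between the two writes.
# This is the artificial part; in the real test the window would be a matter of timing.
exec 3> "$fifo"

echo > "$fifo"   # stand-in for the subshell's initial `echo > "$FIFO"`
echo > "$fifo"   # stand-in for the EXIT trap's `echo > "$FIFO"`

exec 3>&-        # last writer gone: the reader now gets EOF and exits
wait "$first"
first_bytes=$(wc -c < "$dir/first" | tr -d '[:space:]')
echo "first reader consumed $first_bytes bytes"

# Stand-in for the second `cat`: no writer will ever open the FIFO again.
cat "$fifo" > "$dir/second" &
second=$!
sleep 1
if kill -0 "$second" 2>/dev/null; then
  second_blocked=yes
  kill "$second"
else
  second_blocked=no
fi
wait "$second" 2>/dev/null
echo "second reader blocked: $second_blocked"

rm -rf "$dir"
```

If the first reader reports both bytes and the second reader is still alive after the sleep, that matches the hang seen in ps. It does not prove the real test hits this ordering, only that this ordering would produce the same symptom.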