[Nix#9640] segfault during substitution on x86-64_darwin #81

Closed
opened 2024-03-16 06:44:48 +00:00 by lix-bot · 1 comment
Member

Upstream-Issue: NixOS/nix#9640

I don't know if this is tractable from the Nix side, but I figure it deserves a report since some users are encountering it when they invoke nix commands.

Describe the bug

Since the stdenv bump to LLVM 16 in Nixpkgs, at least some Intel Mac users have started seeing segfaults when Nix tries to print the size of missing store paths before substituting.

Note: If you're running into this, you can likely work around the crash by adding --option print-missing false.

Depending on the shell used, these can manifest like:

$ nix-shell -p cowsay
...
Segmentation fault: 11

or:

$ nix-shell -p cowsay
...
[1]    43294 segmentation fault  nix-shell -p cowsay

When Nix is wrapped by something else like darwin-rebuild, these may also just be indicated by exit status (status 139 in one known case).
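For context on the 139 status: shells conventionally report a process killed by signal N as exit status 128 + N, and 139 - 128 = 11 matches the "Segmentation fault: 11" message above. A minimal Python sketch of that decoding follows; the helper name describe_exit_status is made up for illustration and is not part of Nix or darwin-rebuild.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret a shell-style exit status, where 128 + N means 'killed by signal N'."""
    if status > 128:
        signum = status - 128
        try:
            # Map the signal number to its symbolic name where the platform knows it.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with status {status}"


print(describe_exit_status(139))
```

Note that a wrapper which launches Nix directly through Python's subprocess module would see a return code of -11 rather than 139; the 128 + N form is what a shell reports.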

Reports so far indicate that this affects x86_64-darwin up to at least macOS 10.15.7. A similar segfault has been reported against Nixpkgs. I'm not 100% sure they share a root cause, but that report suggests this may affect up to macOS 11.x but not 12.x:

  - https://github.com/NixOS/nixpkgs/issues/269548
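As a rough way to tell whether a given machine falls in the range these reports describe, here is a hedged Python sketch. It assumes the affected set is "Intel macOS with a major version below 12", which folds in the less certain 11.x claim from the Nixpkgs report. The function name reportedly_affected is hypothetical, and a match only means the host resembles the reports, not that it will crash.

```python
import platform


def reportedly_affected(machine: str, mac_version: str) -> bool:
    """Return True if the host matches the reports: Intel macOS older than 12."""
    # An empty version string means we are not on macOS at all.
    if machine != "x86_64" or not mac_version:
        return False
    major = int(mac_version.split(".")[0])
    # Assumption: 10.x and 11.x are treated as affected, 12.x and later are not.
    return major < 12


if __name__ == "__main__":
    # platform.mac_ver() returns ('', ...) on non-macOS systems.
    print(reportedly_affected(platform.machine(), platform.mac_ver()[0]))
```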

Steps To Reproduce

  1. Run an invocation that needs to realize something that could be substituted, such as nix-shell -p cowsay --dry-run

Additional context

Here's the most relevant part of the crash dump (full copy: https://github.com/NixOS/nix/files/13707188/nix_2023-12-17-113156_b8793364.txt):

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libc++.1.0.dylib              	0x0000000108f1401e std::__1::istreambuf_iterator<char, std::__1::char_traits<char> > std::__1::num_get<char, std::__1::istreambuf_iterator<char, std::__1::char_traits<char> > >::__do_get_unsigned<unsigned long>(std::__1::istreambuf_iterator<char, std::__1::char_traits<char> >, std::__1::istreambuf_iterator<char, std::__1::char_traits<char> >, std::__1::ios_base&, unsigned int&, unsigned long&) const + 46
1   libc++.1.dylib                	0x00007fff66f45f4b std::__1::basic_ostream<char, std::__1::char_traits<char> >::operator<<(float) + 247
2   libnixmain.dylib              	0x00000001086e832d void boost::io::detail::put<char, std::__1::char_traits<char>, std::__1::allocator<char>, boost::io::detail::put_holder<char, std::__1::char_traits<char> > const&>(boost::io::detail::put_holder<char, std::__1::char_traits<char> > const&, boost::io::detail::format_item<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, boost::basic_format<char, std::__1::char_traits<char>, std::__1::allocator<char> >::string_type&, boost::basic_format<char, std::__1::char_traits<char>, std::__1::allocator<char> >::internal_streambuf_t&, std::__1::locale*) + 781
3   libnixmain.dylib              	0x00000001086e7f5e void boost::io::detail::distribute<char, std::__1::char_traits<char>, std::__1::allocator<char>, boost::io::detail::put_holder<char, std::__1::char_traits<char> > const&>(boost::basic_format<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, boost::io::detail::put_holder<char, std::__1::char_traits<char> > const&) + 190
4   libnixmain.dylib              	0x00000001086fcb37 void nix::formatHelper<boost::basic_format<char, std::__1::char_traits<char>, std::__1::allocator<char> >, float, float>(boost::basic_format<char, std::__1::char_traits<char>, std::__1::allocator<char> >&, float const&, float const&) + 87
5   libnixmain.dylib              	0x00000001086f3626 nix::printMissing(nix::ref<nix::Store>, std::__1::set<nix::StorePath, std::__1::less<nix::StorePath>, std::__1::allocator<nix::StorePath> > const&, std::__1::set<nix::StorePath, std::__1::less<nix::StorePath>, std::__1::allocator<nix::StorePath> > const&, std::__1::set<nix::StorePath, std::__1::less<nix::StorePath>, std::__1::allocator<nix::StorePath> > const&, unsigned long long, unsigned long long, nix::Verbosity) + 1350

The segfaults happen when printMissing tries to printMsg the float download/NAR sizes here:

https://github.com/NixOS/nix/blob/7f5ed330e40d0aa2a2f907b2d4157329ff953cd2/src/libmain/shared.cc#L67-L79
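To make the crashing step concrete: going by the nix::formatHelper<..., float, float> frame in the trace, the message being built formats two float values. The Python sketch below only illustrates that kind of printf-style float formatting. The function name, the message wording, and the byte-to-MiB conversion are assumptions for illustration, not a copy of the code in shared.cc.

```python
MIB = 1024 * 1024


def format_missing_sizes(download_bytes: int, nar_bytes: int) -> str:
    """Format two byte counts as floats with two decimals (illustrative only)."""
    # Assumption: sizes are converted to MiB before being formatted as floats.
    return "%.2f MiB download, %.2f MiB unpacked" % (
        download_bytes / MIB,
        nar_bytes / MIB,
    )


print(format_missing_sizes(3 * MIB, 10 * MIB + MIB // 2))
```

In the crash above it is the equivalent float formatting step, reached through boost::format and operator<<(float), that ends up in libc++ and faults.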

We can work around the crash with --option print-missing false because Nix only ends up on this code path when print-missing is true (which is the default):

https://github.com/NixOS/nix/blob/7f5ed330e40d0aa2a2f907b2d4157329ff953cd2/src/nix-store/nix-store.cc#L155-L156
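For wrappers or CI jobs that cannot easily change every invocation, one option is to retry with the workaround only after a segfault. The Python sketch below is a hedged illustration: the helper names are hypothetical, it assumes the flags are accepted directly after the program name, and it blindly re-runs the command once, which may not be appropriate for commands with side effects.

```python
import signal
import subprocess
from typing import Callable, List, Sequence

# Flags taken from the workaround described in this report.
WORKAROUND = ["--option", "print-missing", "false"]


def crashed_with_segv(returncode: int) -> bool:
    """True if a direct child died from SIGSEGV (-11) or a shell reported 139."""
    segv = int(signal.SIGSEGV)
    return returncode in (-segv, 128 + segv)


def _run(cmd: Sequence[str]) -> int:
    # Run the command and hand back its return code without raising.
    return subprocess.run(list(cmd)).returncode


def run_with_workaround(
    cmd: Sequence[str], run: Callable[[List[str]], int] = _run
) -> int:
    """Run cmd; if it segfaults, retry once with print-missing disabled."""
    returncode = run(list(cmd))
    if crashed_with_segv(returncode):
        # Assumption: placing the flags right after the program name is accepted.
        returncode = run([cmd[0], *WORKAROUND, *cmd[1:]])
    return returncode
```

A call such as run_with_workaround(["nix-shell", "-p", "cowsay", "--dry-run"]) would then fall back to the print-missing workaround only on hosts that actually hit the crash. The run parameter exists so the retry logic can be exercised without Nix installed.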

Note: I don't recall exactly how the Nix bundled with the installer gets assembled, but I suspect we'd already have reports against this repo if this were affecting new Nix installs. Since the flake is targeting inputs.nixpkgs.url = "github:NixOS/nixpkgs/staging-23.05";, I imagine the binaries installed by the installer still use LLVM 11. If that's right, I imagine this is mostly just biting people who used something like nix-darwin, home-manager, or nix update with a fairly recent nixpkgs.

If so, this issue could affect more users and CI systems once the nixpkgs input is updated to a rev that includes the stdenv bump, unless we have a solution here, or perhaps a patch/bump in nixpkgs, before then.

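Related posts/reports:

  - https://discourse.nixos.org/t/segmentation-fault-when-running-any-nix-program-sigsegv-exit-code-139/36659/9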

Priorities

Add 👍 to issues you find important.

lix-bot added the bug and imported labels 2024-03-16 06:44:48 +00:00
Owner

appears to have been an ABI violation that has since been fixed in nixpkgs?

jade closed this issue 2024-05-30 18:26:15 +00:00
Reference: lix-project/lix#81