Investigate why LTO causes test failures on x86_64-darwin (under Rosetta 2) #568
Describe the bug
Nixpkgs enables LTO for its Lix package definition (when not building statically). This causes test failures when building for x86_64-darwin (under Rosetta 2, unsure if that matters). The Lix flake does not enable LTO, which is why we haven't seen the problem ourselves.
For comparison, the Nixpkgs Nix package enables LTO when building with GCC (and not statically). No explanation was given, though I found a comment suggesting that it was simply known that LTO didn't work on darwin. @emilazy suggested that this is due to UB, and that GCC is simply exposing less UB than Clang. We don't know for a fact that this is UB, though.
If this is UB, it would be good to fix. If it is not UB, it would still be good to understand what's going on, because enabling LTO is a potentially useful thing to do.
Steps To Reproduce
I reproduced the problem in Nixpkgs with
nix build .#legacyPackages.x86_64-darwin.lix
(on an aarch64-darwin machine with Rosetta 2), and confirmed that disabling LTO fixed the test failures. When doing this, the failures I got were bizarre and looked like
(there are other test failures but they're all similar issues, where the expected and actual exceptions have the same names and yet the test considered them to be distinct)
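To narrow down failures where the expected and actual exceptions have the same name yet compare as distinct, a small diagnostic along these lines might help. This is only a sketch, not something taken from the Lix test suite: DemoError, TypeComparison, compareTypes and compareWithThrown are hypothetical names made up for illustration. It reports three things separately: whether the two std::type_info objects are the same object, what operator== says, and whether the mangled names match. Whether operator== compares addresses or names is implementation-dependent, so a result of "same name, not equal" would point at duplicated type information rather than prove any particular cause.

```cpp
#include <cassert>
#include <cstring>
#include <exception>
#include <stdexcept>
#include <typeinfo>

// Hypothetical stand-in for an exception type from the code under test.
struct DemoError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// How two std::type_info objects relate to each other.
struct TypeComparison {
    bool sameObject;      // both references denote one type_info object
    bool equalByOperator; // result of std::type_info::operator==
    bool sameName;        // implementation-defined names are byte-identical
};

TypeComparison compareTypes(const std::type_info & expected, const std::type_info & actual)
{
    TypeComparison result{
        &expected == &actual,
        expected == actual,
        std::strcmp(expected.name(), actual.name()) == 0,
    };
    // One object can never compare unequal to itself.
    assert(!result.sameObject || result.equalByOperator);
    return result;
}

// Compare the static type Expected with the dynamic type of a caught exception.
// std::exception is polymorphic, so typeid(thrown) yields the dynamic type.
template <typename Expected>
TypeComparison compareWithThrown(const std::exception & thrown)
{
    return compareTypes(typeid(Expected), typeid(thrown));
}
```

Calling compareWithThrown<DemoError>(e) inside a catch (const std::exception & e) block and logging the three fields would show whether a failing test is in the "sameName but not equalByOperator" situation. In a single translation unit without LTO, equalByOperator and sameName should both be true.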
Can we get our hands on core dumps when such exceptions are thrown, so we can identify the miscompilations, UB, etc.?
functional-test-libstoreconsumer fails with LTO on aarch64-darwin #832
Is this still a thing, or has it been fixed by chance (or by the migration)?
Support for Rosetta 2 is planned to be discontinued in the next few years, and macOS for Intel is also planned to be discontinued.
With our current resources, it's implausible that we can support x86_64-darwin issues caused by compiler problems.
Unless there's a strong rationale for why we should tackle this, I will close it, as the problem will disappear by itself.