Investigate why LTO causes test failures on x86_64-darwin (under Rosetta 2) #568

Open
opened 2024-11-04 03:58:24 +00:00 by lilyball · 1 comment
Member

Describe the bug

Nixpkgs enables LTO for its Lix package definition (when not building statically). This causes test failures when building for x86_64-darwin (under Rosetta 2, unsure if that matters). The Lix flake does not enable LTO, which is why we haven't seen the problem ourselves.

For comparison, the Nixpkgs Nix package enables LTO when building with GCC (and not statically). No explanation was given, though I found a comment (https://github.com/NixOS/nixpkgs/pull/181180#issuecomment-1184882662) suggesting that LTO was simply known not to work on darwin. @emilazy suggested that this is due to UB, and that GCC is simply exposing less of it than Clang. We don't know for a fact that this is UB, though.

If this is UB, it would be good to fix. If it is not UB, it would still be good to understand what's going on, because enabling LTO is potentially useful.

Steps To Reproduce

I reproduced the problem in Nixpkgs with nix build .#legacyPackages.x86_64-darwin.lix (on an aarch64-darwin machine with Rosetta 2), and confirmed that disabling LTO fixed the test failures. The failures I got were bizarre and looked like this:

../tests/unit/libexpr/value/context.cc:16: Failure
Expected: NixStringContextElem::parse("") throws an exception of type BadNixStringContextElem.
  Actual: it throws nix::BadNixStringContextElem with description "error: Bad String Context element: String context element should never be an empty string: ".

[  FAILED  ] NixStringContextElemTest.empty_invalid (0 ms)

(There are other test failures, but they are all similar: the expected and actual exceptions have the same name, and yet the test considered them to be distinct.)
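To narrow down failures of this shape, a small probe can separate "the typed handler caught it" from "an exception with the same type name slipped past the typed handler". The following is only a sketch: `BadElem`, `parseEmpty`, `Match`, `classifyThrow`, and `selfCheck` are hypothetical names made up for illustration, not Lix code, and the sketch does not assume or establish what the root cause is. The output format above looks like a GoogleTest throw assertion, which behaves roughly like the first `catch` clause here.

```cpp
#include <cassert>
#include <cstring>
#include <stdexcept>
#include <typeinfo>

// Hypothetical stand-in for nix::BadNixStringContextElem (not the real class).
struct BadElem : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical stand-in for NixStringContextElem::parse("").
inline void parseEmpty() {
    throw BadElem("String context element should never be an empty string");
}

enum class Match { ExactType, SameNameDifferentTypeInfo, OtherException, NoThrow };

// Runs f and reports how the thrown exception relates to the expected type E.
template <typename E, typename F>
Match classifyThrow(F&& f) {
    try {
        f();
    } catch (const E&) {
        // The typed handler matched: this is what a passing test sees.
        return Match::ExactType;
    } catch (const std::exception& e) {
        // The typed handler did not match. std::exception is polymorphic, so
        // typeid(e) names the dynamic type; compare it to E by mangled name.
        if (std::strcmp(typeid(e).name(), typeid(E).name()) == 0)
            return Match::SameNameDifferentTypeInfo;
        return Match::OtherException;
    } catch (...) {
        return Match::OtherException;
    }
    return Match::NoThrow;
}

// Sanity check for a single binary, where the typed handler should match.
inline void selfCheck() {
    assert(classifyThrow<BadElem>(parseEmpty) == Match::ExactType);
}
```

If a probe like this reported `SameNameDifferentTypeInfo` in an LTO build, that would point at type identity differing between the throw site and the catch site rather than at the parser logic itself. That is a hypothesis to test, not a confirmed diagnosis.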

lilyball added the bug label 2024-11-04 03:58:24 +00:00
Owner

Can we get our hands on core dumps from when these exceptions are thrown, so we can identify the miscompilation, UB, or whatever else is going on?

Reference: lix-project/lix#568