Overflow in Nix language arithmetic does not have defined semantics #423

Closed
opened 2024-06-26 04:48:33 +00:00 by jade · 8 comments
Owner

... and literally invokes C++ undefined behaviour by signed overflow, which we accidentally found out about because we trap on signed overflow in production.

$ nix eval --expr '500000000*50000000000'
fish: Job 1, 'nix eval --expr '500000000*5000…' terminated by signal SIGTRAP (Trace or breakpoint trap)

thanks to V for reporting this!

We should define what the semantics are for this. It could be specified as wrapping, but I am not sure if that is actually a good idea: maybe it should trap at the language level? Certainly the manual does not specify the semantics of this.

Pinging @roberth as another implementation maintainer. As far as I know this bug has first been reported today, which means that we have successfully evaluated a very large amount of production Nix code over the past three months with the semantics semi-inadvertently made to be to trap on overflow.

Personally my inclination is to define the language level semantics to be to to throw an evaluation error on overflow, since the domain of the language is so high level that wrapping on overflow would conceal bugs.

... and literally invokes C++ undefined behaviour by signed overflow, which we accidentally found out about because we trap on signed overflow in production. ``` $ nix eval --expr '500000000*50000000000' fish: Job 1, 'nix eval --expr '500000000*5000…' terminated by signal SIGTRAP (Trace or breakpoint trap) ``` thanks to V for reporting this! We should define what the semantics are for this. It *could* be specified as wrapping, but I am not sure if that is actually a good idea: maybe it should trap at the language level? Certainly the manual does not specify the semantics of this. Pinging @roberth as another implementation maintainer. As far as I know this bug has first been reported today, which means that we have successfully evaluated a very large amount of production Nix code over the past three months with the semantics semi-inadvertently made to be to trap on overflow. *Personally* my inclination is to define the language level semantics to be to to throw an evaluation error on overflow, since the domain of the language is so high level that wrapping on overflow would conceal bugs.
jade added the
bug
Area/evaluator
labels 2024-06-26 04:48:33 +00:00

I can't exclude the possibility that some users rely on this behavior, as we've allowed the signed integer to overflow for probably around two decades (didn't check when * was implemented).
While its behavior may be undefined, it is a predictable behavior on popular architectures, and it wouldn't be wrong to standardize on overflowing in a way that isn't undefined C++ behavior.

That said, I agree that considering the domain the expression language is for, it'd be better to throw.
If it turns out that someone is doing some fancy (cryptographic?) computation that relies on this, we may consider to allow overflow in a controlled way, or provide an alternative set of builtins for the purpose of overflowing arithmetic.

In the latter case, at risk of getting ahead of ourselves, I'd be curious what you think of Nix's undocumented "DSL capabilities":

let 
  overflowArithmetic.__mul = f;
  # or really overflowArithmetic = builtins.overflowArithmetic;

  inherit ({ let inherit (overflowArithmetic) __mul;
in 
  assert 500000000 * 50000000000 == 6553255926290448384;
  "wdyt?"
I can't exclude the possibility that some users rely on this behavior, as we've allowed the signed integer to overflow for probably around two decades (didn't check when `*` was implemented). While its behavior may be undefined, it is a predictable behavior on popular architectures, and it wouldn't be wrong to standardize on overflowing in a way that isn't undefined C++ behavior. That said, I agree that considering the domain the expression language is for, it'd be better to throw. If it turns out that someone is doing some fancy (cryptographic?) computation that relies on this, we may consider to allow overflow in a controlled way, or provide an alternative set of builtins for the purpose of overflowing arithmetic. In the latter case, at risk of getting ahead of ourselves, I'd be curious what you think of Nix's undocumented "DSL capabilities": ```nix let overflowArithmetic.__mul = f; # or really overflowArithmetic = builtins.overflowArithmetic; inherit ({ let inherit (overflowArithmetic) __mul; in assert 500000000 * 50000000000 == 6553255926290448384; "wdyt?" ```
Owner

I'd be curious what you think of Nix's undocumented "DSL capabilities":

bad. evil. bug.

> I'd be curious what you think of Nix's undocumented "DSL capabilities": bad. evil. bug.
Author
Owner

My inclination is to define it to throw anyway since it is turning rare dubious code into an unambiguous error, and potentially even linking to a tracking issue for the change for feedback. I could be persuaded to implement it with wrapping semantics and change it with a language version also, if we think we need to be more conservative. @pennae: thoughts?

If someone complains of actually needing it, we can add builtins.wrapping.{add,sub,mul,div} or so. But I don't, myself, necessarily want to add them immediately because I don't really like speculatively adding language features.

Another thought is to name those something like "builtins.math.wrappingAdd", which leaves room for also putting wrappingLeftShift and other things people probably want for doing evil things in Nix language. Maybe there's a strong use case for bitwise operators in building things to put in /proc somewhere, but there's many ways to implement that that don't require more language features.

But I think the proposal to make bitwise operations should be considered separately from fixing overflow semantics.

As for the desugaring of those operators, I know about it, it's in the Nix iceberg and it should be removed in a future language version.

My inclination is to define it to throw anyway since it is turning rare dubious code into an unambiguous error, and potentially even linking to a tracking issue for the change for feedback. I could be persuaded to implement it with wrapping semantics and change it with a language version also, if we think we need to be more conservative. @pennae: thoughts? If someone complains of actually needing it, we can add builtins.wrapping.{add,sub,mul,div} or so. But I don't, myself, necessarily want to add them immediately because I don't really like speculatively adding language features. Another thought is to name those something like "builtins.math.wrappingAdd", which leaves room for also putting `wrappingLeftShift` and other things people probably want for doing evil things in Nix language. Maybe there's a strong use case for bitwise operators in building things to put in /proc somewhere, but there's many ways to implement that that don't require more language features. But I think the proposal to make bitwise operations should be considered separately from fixing overflow semantics. As for the desugaring of those operators, I know about it, it's in the Nix iceberg and it should be removed in a future language version.
Owner

My inclination is to define it to throw anyway since it is turning rare dubious code into an unambiguous error, and potentially even linking to a tracking issue for the change for feedback. I could be persuaded to implement it with wrapping semantics and change it with a language version also, if we think we need to be more conservative.

we'd prefer to define it as signed twos-complement overflow in legacy nixlang and define it to throw in a future language version, just to be sure we don't bite anyone. it is however truly cursed to rely on this, and we can't imagine anyone actually doing so, so it's probably fine to just have it throw—and then, if folks do turn out to rely on this, reconsider or add wrapping builtins? (though it's cursed enough that we may as well provide wrapping arithmetic as a contrib plugin instead, if we wanted to go that route.)

> My inclination is to define it to throw anyway since it is turning rare dubious code into an unambiguous error, and potentially even linking to a tracking issue for the change for feedback. I could be persuaded to implement it with wrapping semantics and change it with a language version also, if we think we need to be more conservative. we'd prefer to define it as signed twos-complement overflow in legacy nixlang and define it to throw in a future language version, just to be sure we don't bite anyone. it is however truly cursed to rely on this, and we can't imagine anyone actually *doing* so, so it's probably fine to just have it throw—and *then*, if folks do turn out to rely on this, reconsider or add wrapping builtins? (though it's cursed enough that we may as well provide wrapping arithmetic as a contrib plugin instead, if we wanted to go that route.)
Author
Owner

In the interest of not assuming use cases exist that aren't real, I'll write a patch in the next few days to make it throw and document it as such, and we can take responsibility for it if we break anyone's cursed code. I can also port the patch to CppNix. Especially given that every public version of Lix to date has had these semantics but as a trap, I'm expecting this will be a safe change.

This work for everyone?

In the interest of not assuming use cases exist that aren't real, I'll write a patch in the next few days to make it throw and document it as such, and we can take responsibility for it if we break anyone's cursed code. I can also port the patch to CppNix. Especially given that every public version of Lix to date has had these semantics but as a trap, I'm expecting this will be a safe change. This work for everyone?

This work for everyone

👍
I've brought this up with the Nix team and we agree with the strategy to start throwing on overflow, and learn if we need a new behavior. @jade your offer to patch CppNix is much appreciated.

> This work for everyone 👍 I've brought this up with the Nix team and we agree with the strategy to start throwing on overflow, and learn if we need a new behavior. @jade your offer to patch CppNix is much appreciated.

Not a strongly informed opinion, but: they key piece of missing information here is that we don't know if anyone in the wild relies on it. I'd put that at 10% perhaps? This is truly cursed, but, if it indeed was there for decades, there might be pretty cursed things.

So I'd suggest for the first lix release to keep it officially undefined, and stick to the current semantics, just to minimize the chance that the first release suffers incompatibility issues.

In the next release, just change it to throwing, but be ready to pedal back and proceed with a more careful version-gated rollout, if it turns out that some important code does depend on this semantics.

That is, first lix release seems quiet special, and it might makes sens to hold off optional behavioral changes just out of abundance of caution.

Not a strongly informed opinion, but: they key piece of missing information here is that we don't know if anyone in the wild relies on it. I'd put that at 10% perhaps? This is truly cursed, but, if it indeed was there for decades, there might be pretty cursed things. So I'd suggest for the first lix release to keep it officially undefined, and stick to the current semantics, just to minimize the chance that the first release suffers incompatibility issues. In the _next_ release, just change it to throwing, but be ready to pedal back and proceed with a more careful version-gated rollout, if it turns out that some important code does depend on this semantics. That is, first lix release seems quiet special, and it might makes sens to hold off optional behavioral changes just out of abundance of caution.
Member

This issue was mentioned on Gerrit on the following CLs:

  • commit message in cl/1596 ("language: cleanly ban integer overflows")
<!-- GERRIT_LINKBOT: {"cls": [{"backlink": "https://gerrit.lix.systems/c/lix/+/1596", "number": 1596, "kind": "commit message"}], "cl_meta": {"1596": {"change_title": "language: cleanly ban integer overflows"}}} --> This issue was mentioned on Gerrit on the following CLs: * commit message in [cl/1596](https://gerrit.lix.systems/c/lix/+/1596) ("language: cleanly ban integer overflows")
Sign in to join this conversation.
No milestone
No project
No assignees
5 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: lix-project/lix#423
No description provided.