lix-project/lix

Fork 31

Attrset merges are spine-strict when they don't need to be #518

New issue

Open

opened 2024-09-12 21:52:22 +00:00 by rbt · 6 comments

rbt commented

2024-09-12 21:52:22 +00:00

Owner

When evaluating (x // { puppy = true; }) ? puppy, we should not need to evaluate the spine (attrset keys) of x. Evaluating the spine can have a significant memory and runtime cost in the case of overlays or large package sets like Nixpkgs.

Perhaps this should not throw:

nix-repl> ({ ${builtins.throw "uh oh!"} = true; } // { puppy = true; }) ? puppy
error:
       … in the left operand of the update (//) operator
         at «string»:1:41:
            1| ({ ${builtins.throw "uh oh!"} = true; } // { puppy = true; }) ? puppy
             |                                         ^

       … caused by explicit throw
         at «string»:1:6:
            1| ({ ${builtins.throw "uh oh!"} = true; } // { puppy = true; }) ? puppy
             |      ^

       error: uh oh!

When evaluating `(x // { puppy = true; }) ? puppy`, we should not need to evaluate the spine (attrset keys) of `x`. Evaluating the spine can have a significant memory and runtime cost in the case of overlays or large package sets like Nixpkgs. Perhaps this should not throw: ``` nix-repl> ({ ${builtins.throw "uh oh!"} = true; } // { puppy = true; }) ? puppy error: … in the left operand of the update (//) operator at «string»:1:41: 1| ({ ${builtins.throw "uh oh!"} = true; } // { puppy = true; }) ? puppy | ^ … caused by explicit throw at «string»:1:6: 1| ({ ${builtins.throw "uh oh!"} = true; } // { puppy = true; }) ? puppy | ^ error: uh oh! ```

rbt added the

performance

Area/evaluator

labels 2024-09-12 21:52:22 +00:00

pennae commented

2024-09-12 22:01:23 +00:00

Owner

the performance problem doesn't stem from spine-strictness but from attrset representation in memory being pretty basic. if we could represent attrs not as sorted arrays but as merge trees this problem would go away (and dynamic attrs are evil anyway and should be deleted from the language)

we do have to evaluate attrnames in both operands in all cases because key duplication is an error. delaying this error by making keys lazy would make ux very bad. there have been experiments with this in cppnix (lazy attrs i think it was called), but we're not at all convinced this is sound (let alone a good idea) to not do eagerly. the performance problem doesn't stem from spine-strictness but from attrset representation in memory being pretty basic. if we could represent attrs not as sorted arrays but as merge trees this problem would go away (and dynamic attrs are evil anyway and should be deleted from the language)

rbt commented

2024-09-13 00:27:29 +00:00

Author

Owner

if we could represent attrs not as sorted arrays but as merge trees this problem would go away

Possibly? It could also cause problems by adding a lot of indirection, depending on how many merges there are in a value and if there's a heuristic for repacking them. I'd love to run an experiment on this. Are we any closer to having performance testing infrastructure?

> if we could represent attrs not as sorted arrays but as merge trees this problem would go away Possibly? It could also cause problems by adding a lot of indirection, depending on how many merges there are in a value and if there's a heuristic for repacking them. I'd love to run an experiment on this. Are we any closer to having performance testing infrastructure?

raito commented

2024-09-14 18:25:29 +00:00

Owner

@rbt I'm sorry I was not able to get performance testing infrastructure yet, if someone could help on this area, that'd be great.

jade commented

2024-09-18 23:35:09 +00:00

Owner

we have a kind of shitty benchmark in-tree but it exists.

lunnova commented

2024-10-14 15:30:35 +00:00

dynamic attrs are evil anyway and should be deleted from the language

Is there a list of things which rely on this and would break if it were dropped?
In my projects I only tend to end up needing them when flattening something to match flake output schema requiring a single attrset instead of nested.

> dynamic attrs are evil anyway and should be deleted from the language Is there a list of things which rely on this and would break if it were dropped? In my projects I only tend to end up needing them when flattening something to match flake output schema requiring a single attrset instead of nested.

pennae commented

2024-10-14 21:50:12 +00:00

Owner

dynamic attrs are evil anyway and should be deleted from the language

Is there a list of things which rely on this and would break if it were dropped?
In my projects I only tend to end up needing them when flattening something to match flake output schema requiring a single attrset instead of nested.

we can immediately think of nixos options docs and numtide/flake-utils, so there's a sizeable base to consider. but a new language version will break those anyway, so it's not a huge hurdle to overcome (especially since all of these uses can be replaced with something more principled, like a new builtin str -> any -> set that creates a set from a key and a value)

> > dynamic attrs are evil anyway and should be deleted from the language > > Is there a list of things which rely on this and would break if it were dropped? > In my projects I only tend to end up needing them when flattening something to match flake output schema requiring a single attrset instead of nested. we can immediately think of nixos options docs and numtide/flake-utils, so there's a sizeable base to consider. but a new language version will break those *anyway*, so it's not a huge hurdle to overcome (especially since all of these uses can be replaced with something more principled, like a new builtin `str -> any -> set` that creates a set from a key and a value)

No milestone

No project

No assignees

5 participants

Notifications

Due date

The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: lix-project/lix#518

No description provided.