Perf regressions around CL 1626: Expr returning Value & using new Value constructors #792
Reference: lix-project/lix#792
So I took a stab at narrowing down the perf regressions in CL 1626, and the tl;dr is that while the hit to eval seems related to the topic of the CL (returning Value rather than using out-parameters), the hit to nix search seems to be more related to the move to the newer Value constructors (somehow? I'm confused too).
Since the latter is kinda orthogonal to the CL topic and relates to constructors already merged and in the codebase, I thought it made sense to create an issue for this rather than continue the conversation in Gerrit, also since it seems more convenient for longer-form comments.
Benchmark runs
I rebased CL 1626 against the latest lix HEAD (@ 2ef4b69), did some preliminary benchmark runs, and had a look at it. After a bit I realised we're kinda doing two things in the CL: moving to returning a Value, and at the same time also using some of the new constructors instead of the old mkFoo functions. Since we're trying to track down confusing perf behaviour, I thought it might not hurt to split the latter out into its own commit.
I attached patches for both of these (the CL rebased and split into only the eval-returns-value part, and then the new constructors as a separate step atop that).
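To make the two separate changes concrete, here is a minimal sketch of what each one looks like in isolation. It uses a toy Value type; the names Type, AsInt, AsBool, evalOut and evalRet are made up for illustration and are not the actual Lix definitions or signatures.

```cpp
#include <cstdint>

// Toy stand-in for the evaluator's Value. NOT the real Lix type.
enum class Type { Null, Int, Bool };

// Hypothetical tag types that select which kind of value a constructor builds.
struct AsInt {};
struct AsBool {};

struct Value {
    Type type;
    union {
        std::int64_t integer;
        bool boolean;
    };

    // Default constructor, as needed for the `Value v; eval(..., v);` pattern.
    Value() : type(Type::Null), integer(0) {}

    // "New constructor" style: the value is built in a single step.
    Value(AsInt, std::int64_t n) : type(Type::Int), integer(n) {}
    Value(AsBool, bool b) : type(Type::Bool), boolean(b) {}

    // "Old mkFoo" style: mutate an already existing Value in place.
    void mkInt(std::int64_t n) { type = Type::Int; integer = n; }
    void mkBool(bool b) { type = Type::Bool; boolean = b; }
};

// Change 1, before: the caller owns the storage and passes it as an out-parameter.
inline void evalOut(std::int64_t a, std::int64_t b, Value & out)
{
    out.mkInt(a + b);
}

// Change 1, after: the result is returned by value.
// Change 2 (independent of change 1): use a constructor instead of mkInt.
inline Value evalRet(std::int64_t a, std::int64_t b)
{
    return Value(AsInt{}, a + b);
}
```

A caller of the old style writes `Value v; evalOut(2, 3, v);`, while the new style is `Value w = evalRet(2, 3);`. Both produce the same logical value, which is why splitting the two changes into separate commits only affects where and how the bytes of the Value get written.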
Here's the benchmark results (walltime + icount) for all three versions (baseline, return value, return value + new ctors), trying to run them on an as-idle-as-possible system.
...which looks a lot like the new-constructors change is responsible for (most of) the search perf hit?
the other notable thing to me is how the large-heap rebuild seems mostly unaffected, which I guess would imply that most of the hit to the rebuild eval stems from additional GC allocations? but I'm not really sure why this would be the case from the change to returning Values on the stack.
At this point I also built the different test subjects with temporary files kept (clang save-temps) and tried to diff the generated assembly between returns-value and returns-value-new-ctors, figuring such a large diff has to stand out. the only thing I noticed is that the diffs for ExprList and ExprConcatStrings were quite involved; the other functions all seemed to have mostly-trivial differences. building a version with those two functions reverted still shows the same performance as returns-value-new-ctors though, so no dice there either.
At this point I dunno what to try next, but I thought the split CL and the perf difference between the two parts was curious enough, and maybe it prompts some insights from someone with more C++/lix codebase familiarity?
-Wno-deprecated-declarations #744
This issue was mentioned on Gerrit on the following CLs:
that this only affects search and rebuild means it's a gc collection symptom, not extra allocations. which also means that it's caused by differences in memory contents caused by the code changes. so we went through the new constructors one by one and checked how they affect memory contents, which turned up that the modern boolean constructor does not fully clear its first data word (it only writes the low byte, the others remain indeterminate). this in turn means that whatever was previously in the (mostly stack) memory that we're not fully overwriting remains a valid pointer in many cases, which causes the gc to collect less, which causes the gc to run more, which wrecks performance.
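The partial-write effect described above can be modelled in a few lines. This is only a sketch under stated assumptions: it does not use Lix's Value layout or its GC, all names (Word, fakeHeap, storeBoolPartial, storeBoolFull, looksLikeHeapPointer) are hypothetical, and it assumes a little-endian machine where sizeof(bool) is 1. A word of memory that previously held a heap pointer is "overwritten" by a bool, once by storing only the bool's byte and once by clearing the whole word first, and a conservative-scan style range check is applied to the result.

```cpp
#include <cstdint>
#include <cstring>

using Word = std::uintptr_t;

// Hypothetical stand-in for the GC-managed heap. The 256-byte alignment
// guarantees that changing only the low byte of a pointer into it cannot
// move the pointer out of the array.
alignas(256) static unsigned char fakeHeap[4096];

// A leftover pointer into the heap, as might remain in reused stack memory.
inline Word staleHeapPointer()
{
    return reinterpret_cast<Word>(fakeHeap + 512);
}

// Models a constructor that stores only the bool: sizeof(bool) bytes of the
// word change, the remaining bytes keep whatever was there before.
inline Word storeBoolPartial(Word stale, bool b)
{
    Word w = stale;
    std::memcpy(&w, &b, sizeof b);
    return w;
}

// Models a constructor that clears the whole word before storing the bool.
inline Word storeBoolFull(Word /* stale contents are discarded */, bool b)
{
    Word w = 0;
    std::memcpy(&w, &b, sizeof b);
    return w;
}

// Conservative-scan style check: anything that falls inside the heap range
// is treated as a live reference.
inline bool looksLikeHeapPointer(Word w)
{
    const Word lo = reinterpret_cast<Word>(fakeHeap);
    return w >= lo && w < lo + sizeof fakeHeap;
}
```

Under those assumptions, `looksLikeHeapPointer(storeBoolPartial(staleHeapPointer(), true))` still holds, so a conservative collector would keep the old object alive, whereas the result of `storeBoolFull` is just 0 or 1 and is not mistaken for a reference.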
something similar happens when returning values instead of using out references, but that's not something we can mitigate as easily (nor should we, possibly, since the new gc we've been thinking about would work better with the current out-ref pattern). we can probably make the constructor usage change and drop all the mk${type} methods on Value, but the return type changes should be shelved for now.
oh! that makes a lot of sense now that I'm reading your explanation of the problem.. but I'm slightly annoyed at myself for not noticing that when looking at the assembly diffs yesterday (also sorry for the somewhat-more-verbose-than-I'd-intended report, heh)
will you submit a CL with that diff?
thanks for the heads-up re the return-type changes--in that case I guess we'd keep the default constructors for now in Value v; eval(..., &v); style cases
probably not, feel free to take it and make a ctor-only cl (and maybe even remove the mkX methods, if possible)
oki, I'll see what I can cook up then, and continue on some half-baked changes I stashed