Investigate QoS classes on Darwin #717
Reference: lix-project/lix#717
emilazy pointed out to me that we might be making the same mistake Bazel used to make on macOS: running with the wrong QoS class from process startup. See https://jmmv.dev/2019/03/macos-threads-qos-and-bazel.html
I have not looked into this yet, but I can guarantee that we have code today that does not take QoS classes into account.
Note that fixing this would probably have to go along with a project to remove the fork()s from the Lix daemon, which, thankfully, we want anyway for other reasons.
Link for further debugging: https://developer.apple.com/library/archive/documentation/Performance/Conceptual/power_efficiency_guidelines_osx/PrioritizeWorkAtTheTaskLevel.html#//apple_ref/doc/uid/TP40013929-CH35-SW10
Update after looking at this: the Lix daemon is running at the Utility QoS class (I think), as are its child processes. It's not obvious to me what the correct QoS class is for "rendering a video" or other batch jobs that a user kicks off and may or may not be actively watching.
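To take the "(I think)" out of the above, a process can ask Darwin which QoS class its own thread is running at. This is a minimal sketch, not anything that exists in Lix: `currentQosClassName` is a made-up helper name, `qos_class_self()` comes from `<pthread/qos.h>`, and it reports only the calling thread's class. Whether that fully reflects whatever launchd applies to the process as a whole is an assumption I have not verified.

```cpp
#include <optional>
#include <string>

#if defined(__APPLE__)
#include <pthread/qos.h>
#endif

// Hypothetical helper (not part of Lix): returns the name of the calling
// thread's QoS class, or std::nullopt on platforms without Darwin's QoS API.
std::optional<std::string> currentQosClassName()
{
#if defined(__APPLE__)
    const char * name = "unknown";
    switch (qos_class_self()) {
    case QOS_CLASS_USER_INTERACTIVE: name = "user-interactive"; break;
    case QOS_CLASS_USER_INITIATED:   name = "user-initiated"; break;
    case QOS_CLASS_DEFAULT:          name = "default"; break;
    case QOS_CLASS_UTILITY:          name = "utility"; break;
    case QOS_CLASS_BACKGROUND:       name = "background"; break;
    case QOS_CLASS_UNSPECIFIED:      name = "unspecified"; break;
    }
    return std::string(name);
#else
    // No QoS classes to report outside Darwin.
    return std::nullopt;
#endif
}
```

Logging this once at daemon startup, and once in a child, would make it easy to compare the launchd case against the "ran it in a terminal" case.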
Maybe we could set the Lix daemon to a higher class than its children (e.g. User-Initiated) and that might make stuff go faster. But I am not sure what is being starved/slow and by whom.
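For reference, raising the class from inside the process could look roughly like the sketch below. `requestUserInitiatedQos` is a hypothetical name, not existing Lix code; `pthread_set_qos_class_self_np` is the Darwin call from `<pthread/qos.h>`, and it only changes the calling thread. Whether it can override the class launchd assigned, and what children started afterwards end up with, are exactly the open questions here, so treat this as an experiment rather than a fix.

```cpp
#include <cerrno>

#if defined(__APPLE__)
#include <pthread/qos.h>
#endif

// Hypothetical helper (not part of Lix): asks Darwin to run the calling
// thread at the User-Initiated QoS class with no relative priority offset.
// Returns 0 on success, an errno value on failure, and ENOTSUP on
// platforms that have no QoS classes.
int requestUserInitiatedQos()
{
#if defined(__APPLE__)
    return pthread_set_qos_class_self_np(QOS_CLASS_USER_INITIATED, 0);
#else
    return ENOTSUP;
#endif
}
```

Calling this early in the daemon's main thread and re-running the same build would show whether the daemon's own class matters, separately from the children's.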
QoS class utility (launchd, default):
QoS class default (jade "ran it in her terminal lol"):
That's a Lix derivation build finishing 1 minute faster just from setting a higher QoS class. Note that this seems to affect the test suite the most, which is probably because of delays getting piles of child processes onto run queues or something? I dunno.
Either way there's hella smoke here and someone should go looking for the fire (can we just set the daemon to Default and not children? Is it a logging problem (#935)?).