Better assignment of jobs to builders than required/supported features #604
Reference: lix-project/lix#604
Is your feature request related to a problem? Please describe.
I have multiple remote builders configured.
Routing jobs to a specific builder is currently very difficult.
In theory it's possible with supported and required system features, but in practice these are unusable: changing the features a package requires changes its hash and makes it unbuildable by default for anyone who doesn't have your custom routing features set up.
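To make the limitation concrete, here is a minimal Python sketch of feature-based matching. It is a simplified model, not the actual scheduler code: the `Builder` class, `can_build`, `eligible`, and the `ram-64g` feature name are all hypothetical. It assumes a builder is eligible when the system matches, every feature the derivation requires is one the builder offers, and every feature the builder marks as mandatory is required by the derivation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Builder:
    """Hypothetical model of one remote builder entry."""
    name: str
    system: str
    supported: frozenset = frozenset()
    mandatory: frozenset = frozenset()


def can_build(builder, system, required):
    """Simplified eligibility rule (an assumption, see the text above)."""
    required = frozenset(required)
    offered = builder.supported | builder.mandatory
    return (
        builder.system == system
        and required <= offered            # builder offers everything required
        and builder.mandatory <= required  # derivation opts in to mandatory features
    )


def eligible(builders, system, required):
    """Names of all builders that may take the job, in configuration order."""
    return [b.name for b in builders if can_build(b, system, required)]


builders = [
    Builder("big", "x86_64-linux", frozenset({"kvm", "big-parallel"})),
    Builder("small", "x86_64-linux", frozenset({"kvm"})),
]

# A standard feature routes as expected.
print(eligible(builders, "x86_64-linux", {"big-parallel"}))
# A made-up routing feature matches nobody who has not configured it.
print(eligible(builders, "x86_64-linux", {"ram-64g"}))
```

The matching is purely static set membership: nothing in it can express "has 200GB free right now", and the only way to steer a job is to change what the derivation requires.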
Describe the solution you'd like
I'd like to be able to route jobs based on more dynamic things like available memory, disk space and load.
I'd like to be able to change job routing without impacting the package hash, with appropriate guarantees that the routing information doesn't leak into the derivation (for example as env vars visible to the build). I don't know how this would work.
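As a rough illustration of what I mean, here is a Python sketch of routing on dynamic state. Everything in it is hypothetical (`BuilderState`, `RoutingHint`, `pick_builder`, and the numbers); no such interface exists today. The assumption is that resource hints live outside the derivation, so they never affect the hash, and that the scheduler can somehow query each builder's current free memory, free disk and load.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BuilderState:
    """Hypothetical snapshot of a builder's current resources."""
    name: str
    free_mem_gib: float
    free_disk_gib: float
    load_per_core: float


@dataclass(frozen=True)
class RoutingHint:
    """Hypothetical out-of-band hint; not part of the derivation or its hash."""
    min_mem_gib: float = 0.0
    min_disk_gib: float = 0.0


def pick_builder(builders, hint) -> Optional[BuilderState]:
    """Drop builders that cannot satisfy the hint, then prefer the least loaded."""
    fits = [
        b for b in builders
        if b.free_mem_gib >= hint.min_mem_gib
        and b.free_disk_gib >= hint.min_disk_gib
    ]
    return min(fits, key=lambda b: b.load_per_core, default=None)


states = [
    BuilderState("idle-but-small", free_mem_gib=16, free_disk_gib=50, load_per_core=0.1),
    BuilderState("busy-but-large", free_mem_gib=256, free_disk_gib=900, load_per_core=0.7),
]

# A build that writes 200GB into /build must skip the idle machine.
chosen = pick_builder(states, RoutingHint(min_mem_gib=64, min_disk_gib=200))
print(chosen.name if chosen else "no builder fits")
```

Returning `None` when nothing fits lets the caller queue or fail early, rather than starting a build that is guaranteed to die an hour in.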
Context
I'm working on packaging ROCm 6.3. Multiple packages in it have multi-hour builds, require over 8GB per core given to the build, or have conditions like needing to spew 200GB of assembly into /build, which makes it easy for a package to land on a builder where it is guaranteed to fail an hour in.
Before working on this project I occasionally ran into issues with build routing, but it was usually just slightly frustrating.
There are two problems here in the same question.
(1) A better scheduler for Nix remote builds: we are fully with you on this and want the same. It requires architectural changes, which pennae is developing so that we can someday reach this goal.
(2) Adding the `*SystemFeatures` fields to the hash derivation modulo list. This raises the question: "can system features cause major differences in the output?" The answer is yes in practice, because CPU microarchitectural differences introduce impurities into badly written derivations. It can get worse: a derivation could produce many different binary outputs because it gets routed to Zen 1, Zen 2 and Zen 3 machines while you are passing -march=native. Nix cannot know what you are doing with your compiler flags IMHO.
It feels to me like (2) would be a hack and (1) would be the real solution to the problem.
This already happens without adding routing features: something uses -march=native, was never tested except with local builds, and then someone with builders configured uses it.
I wonder if there's a not too inelegant way to add an extension point that would allow hacky routing solutions in the short term without causing maintenance burden.
`sandbox` setting is a horrifying field of landmines of implementation complexity #936