Garbage collection of "too many" paths is unsound #524
Reference: lix-project/lix#524
Describe the bug
TL;DR: Garbage collection of large Nix stores can leave the Nix store in an inconsistent state, e.g. referrers without ValidPath rows.
Motivated by https://gerrit.lix.systems/c/lix/+/1916. I have found a curious case of corruption, inconsistencies, idk?
Steps To Reproduce
Way 1: Consult the test in the CL: emit a DELETE on a referenced valid path that bypasses the foreign key constraints (can be done with PRAGMA foreign_keys = OFF; in the session).
Way 2: Yet to be discovered.
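Way 1 can be sketched outside of Lix with a few lines of Python. This is only an illustration: the two-table schema is a simplified assumption modelled on the ValidPaths and Refs tables mentioned in this issue, not the real store schema, and the paths and the dangling_refs helper are hypothetical.

```python
import sqlite3

# Simplified, assumed schema: a path table plus a (referrer, reference) table.
SCHEMA = """
CREATE TABLE ValidPaths (
    id   INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
    path TEXT UNIQUE NOT NULL
);
CREATE TABLE Refs (
    referrer  INTEGER NOT NULL,
    reference INTEGER NOT NULL,
    PRIMARY KEY (referrer, reference),
    FOREIGN KEY (referrer)  REFERENCES ValidPaths(id) ON DELETE CASCADE,
    FOREIGN KEY (reference) REFERENCES ValidPaths(id) ON DELETE RESTRICT
);
"""


def dangling_refs(db):
    """Return Refs rows whose referrer or reference has no ValidPaths row."""
    return db.execute(
        """
        SELECT r.referrer, r.reference
        FROM Refs r
        LEFT JOIN ValidPaths a ON a.id = r.referrer
        LEFT JOIN ValidPaths b ON b.id = r.reference
        WHERE a.id IS NULL OR b.id IS NULL
        """
    ).fetchall()


# isolation_level=None keeps the connection in autocommit mode, so the PRAGMA
# below is never issued inside an implicit transaction (where it is a no-op).
db = sqlite3.connect(":memory:", isolation_level=None)
db.executescript(SCHEMA)
db.execute(
    "INSERT INTO ValidPaths (id, path) VALUES (1, '/store/app'), (2, '/store/lib')"
)
db.execute("INSERT INTO Refs VALUES (1, 2)")  # app refers to lib

# Bypass the constraints for this session, then delete a still-referenced path.
db.execute("PRAGMA foreign_keys = OFF")
db.execute("DELETE FROM ValidPaths WHERE id = 2")

print(dangling_refs(db))  # [(1, 2)]
```

The surviving (1, 2) row is the kind of inconsistency described in the TL;DR: a Refs entry that points at a path with no ValidPaths row.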
Expected behavior
(1) GC should always succeed.
(2) GC/deletion/etc. should never leave the DB in the wrong state.
nix --version output
Happened to me for every version of Nix ≥ 2.18, including Lix ≥ 2.90.
Additional context
The diagnosis may be wrong. #505 may also be part of the observed behaviour.
Actually, my root cause analysis seems wrong. FWIW, I expanded on why in the Gerrit CL commit message; I will update the initial description.
Title changed from "Garbage collection of long-chained derivations is unsound" to "Garbage collection of "too many" paths is unsound".
I believe that option 1 ought to be impossible, since some trivial auditing reveals that we only call sqlite3_* functions (besides setting some pragmas or pure reads) inside of sqlite.cc, inside of some kind of wrapper class that should have set the appropriate foreign keys pragma.
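For context on why the wrapper matters: in SQLite, foreign key enforcement is a per-connection setting, so it only helps if every connection that writes goes through code that enables it. A minimal Python sketch of that idea follows; open_store_db and fk_enabled are hypothetical names for illustration and do not mirror the actual wrapper class in sqlite.cc.

```python
import os
import sqlite3
import tempfile


def open_store_db(path):
    """Hypothetical wrapper: every connection it returns enforces foreign keys."""
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    # Read the setting back instead of trusting the PRAGMA: it is silently
    # ignored inside a transaction, and the query returns no row at all if
    # SQLite was built without foreign key support.
    row = conn.execute("PRAGMA foreign_keys").fetchone()
    if row is None or row[0] != 1:
        raise RuntimeError("foreign key enforcement is not active")
    return conn


def fk_enabled(conn):
    """Report whether this particular connection enforces foreign keys."""
    row = conn.execute("PRAGMA foreign_keys").fetchone()
    return row is not None and row[0] == 1


path = os.path.join(tempfile.mkdtemp(), "db.sqlite")
wrapped = open_store_db(path)

# A second connection to the same file that bypasses the wrapper. The setting
# is made explicit here in case the local SQLite build defaults to ON.
raw = sqlite3.connect(path, isolation_level=None)
raw.execute("PRAGMA foreign_keys = OFF")

print(fk_enabled(wrapped), fk_enabled(raw))  # True False
```

Both connections point at the same database file, yet only the wrapped one would reject the DELETE from Way 1.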
This leads me to wonder: races, somehow? It shouldn't be so, though.
@jade Yeah, this is what I wrote in the commit message of the CL as well. SQLite is transactional, though, so racing transactions are serialized and, in principle, each one is checked against the consistency constraints. You cannot add back a (referrer, reference) pair if the referrer does not exist. You cannot delete a valid path's referrer if there are still (referrer, reference) pairs in the Refs table, etc.
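The constraint behaviour described above can be checked with a small Python sketch. It assumes a simplified two-table schema (cascade on the referrer column, restrict on the reference column) rather than the real store schema, and the rejected helper and paths are hypothetical. Under this assumed schema it is the deletion of a still-referenced path that gets rejected, while deleting the referrer cascades to its Refs rows.

```python
import sqlite3

# Autocommit mode so that the PRAGMA takes effect outside any transaction.
db = sqlite3.connect(":memory:", isolation_level=None)
db.execute("PRAGMA foreign_keys = ON")
db.executescript(
    """
    CREATE TABLE ValidPaths (
        id   INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
        path TEXT UNIQUE NOT NULL
    );
    CREATE TABLE Refs (
        referrer  INTEGER NOT NULL,
        reference INTEGER NOT NULL,
        PRIMARY KEY (referrer, reference),
        FOREIGN KEY (referrer)  REFERENCES ValidPaths(id) ON DELETE CASCADE,
        FOREIGN KEY (reference) REFERENCES ValidPaths(id) ON DELETE RESTRICT
    );
    """
)
db.execute(
    "INSERT INTO ValidPaths (id, path) VALUES (1, '/store/app'), (2, '/store/lib')"
)
db.execute("INSERT INTO Refs VALUES (1, 2)")  # app refers to lib


def rejected(sql):
    """Run one statement; return True if SQLite refuses it on integrity grounds."""
    try:
        db.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# A pair whose referrer does not exist cannot be added.
print(rejected("INSERT INTO Refs VALUES (99, 2)"))      # True
# A path that is still referenced cannot be deleted.
print(rejected("DELETE FROM ValidPaths WHERE id = 2"))  # True
# Deleting the referrer is allowed and removes its Refs rows with it.
print(rejected("DELETE FROM ValidPaths WHERE id = 1"))  # False
print(db.execute("SELECT COUNT(*) FROM Refs").fetchone()[0])  # 0
```

With enforcement on, none of these statements can leave a Refs row pointing at a missing ValidPaths row, which is why the observed corruption is surprising.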