Lix is hitting fetcher-cache-v1.sqlite too hard under mass concurrency #1122

Open
opened 2026-02-06 17:01:08 +00:00 by raito · 1 comment
Owner

Describe the bug

When Lix is fetching hundreds of copies of the same Flake input, the SQLite cache (fetcher-cache-v1.sqlite) is put under massive pressure and can throw errors in various places of the fetching code. This leads to fatal failures in situations where the operation could instead be retried or paused gracefully.

Steps To Reproduce

  1. Go to the Lix repo.
  2. Empty or block out your ~/.cache/nix/fetcher-cache-v1.sqlite (I cleared mine)
  3. nix-shell -p lixPackageSets.latest.nix-eval-jobs
  4. nix-eval-jobs --gc-roots-dir /tmp/somewhere/gcroots --force-recurse --max-memory-size 4096 --workers 96 --flake "git+file://$(pwd)?rev=56988d860593a5fd8153d02a0ca5469508378626#hydraJobs"

The exact number of workers is not rocket science: you need enough concurrency, but just below the number that causes the daemon to reject your connections. 96 on my AMD Ryzen 9 7900X 12-Core Processor causes it to occur.

Expected behavior

Retries or self-pacing.

nix --version output

Reported to occur on 2.94.0 by @lheckemann.
Reproduced using nix-eval-jobs from 2.94.0; the code that runs the Flake fetching is independent from the daemon (I believe?), so 2.94.0.

Additional context

I believe that the error occurs exactly here:

        if (!shallow)
            infoAttrs.insert_or_assign(
                "revCount",
                std::stoull(TRY_AWAIT(runProgram(
                    "git",
                    true,
                    {"-C",
                     repoDir,
                     "--git-dir",
                     gitDir,
                     "rev-list",
                     "--count",
                     input.getRev()->gitRev()}
                )))
            );

        if (!_input.getRev())
            getCache()->add(
                store,
                unlockedAttrs,
                infoAttrs,
                storePath,
                false);

// here |
//           v
        getCache()->add(
            store,
            getLockedAttrs(),
            infoAttrs,
            storePath,
            true);
Member

This issue was mentioned on Gerrit on the following CLs:

  • commit message in cl/5081 ("libfetchers/cache: retry inserting entries into the fetcher cache")
Reference: lix-project/lix#1122