Concurrent store operations should be rate-limited #1022

Closed
opened 2025-10-31 10:53:20 +00:00 by arianvp · 10 comments
Member

Describe the bug

Copying to a local path store doesn't work

Steps To Reproduce

$ nix copy --to ./drvs --derivation nixpkgs#lixStatic
error: creating file '/home/arian/Projects/attestable-nixpkgs/nix-build/drvs/nix/store/p4m9shlpsqrvir464livcdni0y32ah3f-systemwide-man-db-conf.patch': Too many open files
$ nix copy --to file:///$PWD/drvs --derivation nixpkgs#lixStatic
error: opening file '/nix/store/7mbwxq1x96m6jlqy2z3a6nq1v6whzm4m-types_psutil-7.0.0.20250801.tar.gz.drv': Too many open files

Expected behavior

nix copy just works

nix --version output

$ nix --version
nix (Lix, like Nix) 2.93.3
System type: aarch64-linux
Additional system types: 
Features: gc, signed-caches
System configuration file: /etc/nix/nix.conf
User configuration files: /home/arian/.config/nix/nix.conf:/etc/xdg/nix/nix.conf:/home/arian/.nix-profile/etc/xdg/nix/nix.conf:/nix/profile/etc/xdg/nix/nix.conf:/home/arian/.local/state/nix/profile/etc/xdg/nix/nix.conf:/etc/profiles/per-user/arian/etc/xdg/nix/nix.conf:/nix/var/nix/profiles/default/etc/xdg/nix/nix.conf:/run/current-system/sw/etc/xdg/nix/nix.conf
Store directory: /nix/store
State directory: /nix/var/nix


arianvp changed title from nix copy --to ./path or nix copy --to file:///path doesn't work - "Too many open files" to nix copy --derivation --to ./path or nix copy --derivation --to file:///path doesn't work - "Too many open files" 2025-10-31 10:54:46 +00:00
Author
Member

It does work after raising the open file limits, but it feels like Lix should be raising those limits itself.
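
As a rough illustration of what "managing the limits itself" could look like on a POSIX system, the sketch below raises the soft `RLIMIT_NOFILE` limit to the hard limit at startup. `raiseOpenFileLimit` is a hypothetical helper written for this example, not a function taken from Lix; only `getrlimit` and `setrlimit` are real POSIX calls.

```cpp
#include <sys/resource.h>

// Hypothetical helper (not Lix code): raise the soft limit on open file
// descriptors to the hard limit. Returns the soft limit in effect
// afterwards, or 0 if the limit could not be read.
rlim_t raiseOpenFileLimit()
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return 0;

    // An unprivileged process may raise its soft limit up to, but not
    // beyond, its hard limit.
    if (lim.rlim_max > lim.rlim_cur) {
        struct rlimit raised = lim;
        raised.rlim_cur = lim.rlim_max;
        // The kernel may still refuse (for example when the hard limit is
        // "infinite"); in that case keep running with the old soft limit.
        if (setrlimit(RLIMIT_NOFILE, &raised) == 0)
            lim = raised;
    }
    return lim.rlim_cur;
}
```

Raising the limit only moves the ceiling: a large enough copy can still exhaust the new limit unless the number of concurrent operations is bounded as well.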

Owner

Yeah, we have a backpressure problem here. We are sending way too many requests.

Owner
    auto [fdTemp, fnTemp] = createTempFile();

    AutoDelete autoDelete(fnTemp);

    auto now1 = std::chrono::steady_clock::now();

    /* Read the NAR simultaneously into a CompressionSink+FileSink (to
       write the compressed NAR to disk), into a HashSink (to get the
       NAR hash), and into a NarAccessor (to get the NAR listing). */
    HashSink fileHashSink { HashType::SHA256 };
    nar_index::Entry narIndex;
    HashSink narHashSink { HashType::SHA256 };
    {
        FdSink fileSink(fdTemp.get());
        TeeSink teeSinkCompressed { fileSink, fileHashSink };
        auto compressionSink = makeCompressionSink(
            config().compression,
            teeSinkCompressed,
            config().parallelCompression,
            config().compressionLevel
        );
        TeeSink teeSinkUncompressed { *compressionSink, narHashSink };
        AsyncTeeInputStream teeSource { narSource, teeSinkUncompressed };
        narIndex = TRY_AWAIT(nar_index::create(teeSource));
        compressionSink->finish();
        fileSink.flush();
    }

This is the root cause: we call this function too many times concurrently. Our upload process can keep up, but the bookkeeping it relies on (the NAR hash, the compressed file, and so on) cannot.
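
To make the arithmetic concrete: every in-flight call of the function quoted above holds at least one descriptor for its temporary file, so the number of calls that can safely overlap is bounded by the descriptor limit. The sketch below computes such a bound. `maxConcurrentOps` and all of its parameters are hypothetical names for illustration, and the per-operation descriptor count is an assumption that would have to be measured.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper (not Lix code): estimate how many store operations
// may run at once when each holds up to `fdsPerOp` descriptors and
// `reserved` descriptors are set aside for everything else in the process
// (stdio, the SQLite database, sockets, ...).
std::size_t maxConcurrentOps(std::size_t softLimit,
                             std::size_t fdsPerOp,
                             std::size_t reserved)
{
    if (fdsPerOp == 0)
        fdsPerOp = 1; // avoid dividing by zero
    if (softLimit <= reserved)
        return 1;     // always allow at least one operation to make progress
    return std::max<std::size_t>(1, (softLimit - reserved) / fdsPerOp);
}
```

For example, with a soft limit of 1024, 64 reserved descriptors, and an assumed 2 descriptors per operation, the bound is (1024 - 64) / 2 = 480 concurrent operations.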

raito changed title from nix copy --derivation --to ./path or nix copy --derivation --to file:///path doesn't work - "Too many open files" to BinaryCache::addToStoreCommon should stream the compressed file contents 2025-10-31 16:01:02 +00:00
Author
Member

I think the renaming of this issue is inaccurate

I'm running into the issue with both file:// (which is just the binary cache store in a trench coat) and ./ (which is the local file store and doesn't involve NARs, AFAIK).

Owner

@arianvp I don't see how the local file store can cause this. Can you provide me with your reproducer, please?
But that code path is the only place that creates temporary files.

I think the fix ultimately is to rate limit concurrent store operations.

raito changed title from BinaryCache::addToStoreCommon should stream the compressed file contents to Concurrent store operations should be rate-limited 2025-10-31 21:45:29 +00:00
Owner

sigh @pennae is omniscient

        auto path2 = binaryCacheDir + "/" + path;
        static std::atomic<int> counter{0};
        Path tmp = fmt("%s.tmp.%d.%d", path2, getpid(), ++counter);
        AutoDelete del(tmp, false);
        StreamToSourceAdapter source(istream);
        writeFile(tmp, source);
        renameFile(tmp, path2);
        del.cancel();

This is in the local binary cache store.

Anyway, rate limiting ANY store operation should do the trick.
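
A minimal sketch of what "rate limiting any store operation" can mean in practice is a gate that admits at most a fixed number of operations at a time and makes the rest wait. The version below uses plain threads with a mutex and a condition variable. This is an assumption made for the sake of a self-contained example: `OperationGate` and `peakInFlight` are hypothetical names, and the actual fix in Lix may be structured quite differently.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical gate (not Lix code): at most `maxInFlight` operations run
// at once; further callers block until a slot is released.
class OperationGate
{
    std::mutex mutex;
    std::condition_variable slotFreed;
    std::size_t available;

    void release()
    {
        {
            std::lock_guard<std::mutex> lock(mutex);
            ++available;
        }
        slotFreed.notify_one();
    }

public:
    explicit OperationGate(std::size_t maxInFlight) : available(maxInFlight) {}

    template<typename Fn>
    void run(Fn && op)
    {
        {
            std::unique_lock<std::mutex> lock(mutex);
            slotFreed.wait(lock, [&] { return available > 0; });
            --available;
        }
        // Give the slot back even if op() throws.
        struct Releaser
        {
            OperationGate & gate;
            ~Releaser() { gate.release(); }
        } releaser{*this};
        op();
    }
};

// Demo: push `jobs` threads through the gate and report the highest number
// of operations that were ever in flight at the same moment.
int peakInFlight(int jobs, std::size_t maxInFlight)
{
    OperationGate gate(maxInFlight);
    std::atomic<int> current{0};
    std::atomic<int> peak{0};
    std::vector<std::thread> threads;
    for (int i = 0; i < jobs; ++i) {
        threads.emplace_back([&] {
            gate.run([&] {
                int now = ++current;
                int seen = peak.load();
                while (now > seen && !peak.compare_exchange_weak(seen, now)) {}
                // Stand-in for the real work (opening and writing files).
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
                --current;
            });
        });
    }
    for (auto & t : threads)
        t.join();
    return peak.load();
}
```

With a gate like this, descriptor usage is bounded by the gate size times the descriptors each operation holds, regardless of how many paths are queued for copying.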

Author
Member

The reproducer is in the original issue:

nix copy --to ./drvs --derivation nixpkgs#lixStatic

This creates a proper store (with a SQLite DB etc.), but I still get file limit errors.

Owner

@arianvp We have a fix. I will submit it soon, once we have tested it for performance regressions.

Member

This issue was mentioned on Gerrit on the following CLs:

  • commit message in cl/4517 (https://gerrit.lix.systems/c/lix/+/4517) ("libstore/store-api: rate-limit concurrent copies based on system limits")
Owner

@arianvp Please test again and reopen if the bug is still present.

raito closed this issue 2025-11-02 01:37:34 +00:00
Reference
lix-project/lix#1022