remote builder ssh multiplexing is busted #304

Open
opened 2024-05-11 23:18:44 +00:00 by pennae · 0 comments
Owner

Describe the bug

remote builders using ssh-ng:// deadlock during connect when multiplexing is turned on. ssh:// has an explicit override to disable multiplexing in the remote builder core so it isn't affected. there also seems to be no way outside of patching the sources to enable multiplexing, so we can conclude that ssh.cc multiplexing is completely broken.

Steps To Reproduce

  1. apply:
    diff --git a/src/libstore/remote-store.hh b/src/libstore/remote-store.hh
    index b1b7f93e9..5cad05a1a 100644
    --- a/src/libstore/remote-store.hh
    +++ b/src/libstore/remote-store.hh
    @@ -22,7 +22,7 @@ struct RemoteStoreConfig : virtual StoreConfig
     {
         using StoreConfig::StoreConfig;
    
    -    const Setting<int> maxConnections{this, 1, "max-connections",
    +    const Setting<int> maxConnections{this, 10, "max-connections",
             "Maximum number of concurrent connections to the Nix daemon."};
    
         const Setting<unsigned int> maxConnectionAge{this,
    
  2. configure remote builder using ssh-ng protocol
  3. use it

Additional context

not using ssh multiplexing means that each remote build that gets triggered redoes the ssh setup phase, which takes somewhere between 200 and 500ms in addition to network delays. since these connection attempts are done serially from a single thread rather than in parallel this may account for the entirety of remote build scheduling being an absolute slog
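
to put rough numbers on the paragraph above, here is a minimal back-of-the-envelope sketch. the helper names `serialSetupCost` and `multiplexedSetupCost` and the figure of 20 builds are hypothetical and purely illustrative, none of this is Lix code. it also assumes that a working multiplexed setup would pay the ssh setup cost only once, for the master connection.

```cpp
#include <chrono>
#include <cstddef>

using std::chrono::milliseconds;

// Hypothetical helper (not part of Lix): total ssh setup time when `builds`
// connections are opened one after another from a single thread.
constexpr milliseconds serialSetupCost(std::size_t builds, milliseconds perConnection)
{
    return perConnection * static_cast<milliseconds::rep>(builds);
}

// Hypothetical helper: with multiplexing, assume only the first (master)
// connection pays the setup cost and later builds reuse it.
constexpr milliseconds multiplexedSetupCost(std::size_t builds, milliseconds perConnection)
{
    return builds == 0 ? milliseconds{0} : perConnection;
}

// 20 builds is an arbitrary illustrative number; 200ms and 500ms are the
// bounds quoted in the issue text (network delays not included).
static_assert(serialSetupCost(20, milliseconds{200}) == milliseconds{4000}, "lower bound");
static_assert(serialSetupCost(20, milliseconds{500}) == milliseconds{10000}, "upper bound");
static_assert(multiplexedSetupCost(20, milliseconds{500}) == milliseconds{500}, "one handshake");
```

under these assumptions, scheduling 20 remote builds would spend somewhere between 4 and 10 seconds purely on serial ssh setup, versus at most half a second if the connection were shared.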

pennae added the performance, ux, bug, E/help wanted labels 2024-05-11 23:18:44 +00:00
Reference: lix-project/lix#304