Connect attempts should progress in an exponential backoff manner #932

Closed
opened 2025-07-26 19:32:07 +00:00 by raito · 1 comment
Owner

Since 7359c39076 our connect timeouts are 5s.

Lix has exponential backoff for retrying requests, but not for the timeout values that we pass on to curl.

This is the root cause behind #920.

To solve this, @pennae and I suggest:

(1) Introduce an initialConnectTimeout setting with a low default, such as 1s or 5s.
(2) Deprecate the connectTimeout setting, rename it to maxConnectTimeout, and raise its default to the previous value or another reasonably high value.
(3) Introduce backoff logic flowing from TransferStream to TransferItem: TransferStream computes the actual connect timeout from attempts, tries, initialConnectTimeout, maxConnectTimeout and some jitter parameters (which should be fixed for now), then passes the result on to TransferItem, which sets it via curl_easy_setopt.
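As a rough illustration of step (3), the timeout computation could look something like the sketch below. This is not the code from the actual change: the function name `computeConnectTimeout`, the 1-based `attempt` parameter, the doubling factor and the symmetric `jitterFraction` are all assumptions made for the example. Only `initialConnectTimeout` and `maxConnectTimeout` come from the proposal above.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>
#include <random>

using std::chrono::milliseconds;

// Hypothetical helper (not Lix's real API): returns the connect timeout to use
// for the given attempt. `attempt` is 1-based, so attempt 1 is the first try.
milliseconds computeConnectTimeout(
    unsigned attempt,
    milliseconds initialConnectTimeout,
    milliseconds maxConnectTimeout,
    double jitterFraction, // assumed fixed for now, e.g. 0.25 for +/-25%
    std::mt19937 & rng)
{
    assert(jitterFraction >= 0.0 && jitterFraction < 1.0);

    const double base = static_cast<double>(initialConnectTimeout.count());
    const double cap = static_cast<double>(maxConnectTimeout.count());

    // Exponential growth: initial * 2^(attempt - 1). The exponent is bounded
    // so the intermediate value stays finite for very large attempt counts.
    const unsigned exponent = attempt > 0 ? attempt - 1 : 0;
    const double growth = std::pow(2.0, static_cast<double>(std::min(exponent, 30u)));
    const double raw = std::min(cap, base * growth);

    // Multiplicative jitter in [1 - jitterFraction, 1 + jitterFraction),
    // skipped entirely when jitterFraction is zero.
    double factor = 1.0;
    if (jitterFraction > 0.0) {
        std::uniform_real_distribution<double> dist(1.0 - jitterFraction, 1.0 + jitterFraction);
        factor = dist(rng);
    }

    // Never exceed the configured maximum, even after jitter.
    const double jittered = std::min(cap, raw * factor);
    return milliseconds{static_cast<milliseconds::rep>(jittered)};
}
```

With an initial timeout of 1s, a maximum of 30s and no jitter, this yields 1s, 2s, 4s, 8s, 16s and then 30s for every later attempt. In this sketch the caller would hand the millisecond count to curl, for instance through the `CURLOPT_CONNECTTIMEOUT_MS` option of `curl_easy_setopt`; whether the real change uses that option or the seconds-based one is not stated in this issue.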

Extra care is needed with the NixOS tests: they should pass --offline wherever possible so that they do not become very slow. Otherwise, we should use very low timeout values in our non-networked NixOS tests.

ma27 was assigned by raito 2025-07-26 19:32:07 +00:00
Member

This issue was mentioned on Gerrit on the following CLs:

  • commit message in cl/3856 ("libstore: exponential backoff for downloads")
  • comment in cl/3856 ("libstore: exponential backoff for downloads")
Reference: lix-project/lix#932