Connect attempts should progress in an exponential backoff manner #932
	
Since 7359c39076, our connect timeouts are 5s. Lix has exponential backoff for retrying requests, but not for the timeout values that we pass on to curl.

This is the root cause behind #920.

To solve this, with @pennae, we suggest:

(1) Introduce `initialConnectTimeout` as a setting and set it to a low value, 1s or 5s.

(2) Deprecate `connectTimeout`, rename it to `maxConnectTimeout` as a setting, and bump it to the previous value or a reasonably high value.

(3) Introduce backoff logic flowing from `TransferStream` to `TransferItem`, i.e. `TransferStream` computes the actual connect timeout value using `attempts`, `tries`, `initialConnectTimeout` and `maxConnectTimeout`, plus some jitter parameters that should be fixed for now, and passes it on to `TransferItem`, which sets it via `curl_easy_setopt`.

Extra caution should be paid to the NixOS tests: either ensure they pass `--offline` as much as possible so that they do not become super slow, or use very low timeout values in our non-networked NixOS tests.

This issue was mentioned on Gerrit on the following CLs:
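For illustration, the timeout computation proposed in step (3) could be sketched roughly as below. This is only a sketch under assumptions, not the actual Lix code: the function name `computeConnectTimeout`, its signature, the doubling factor and the symmetric jitter are all hypothetical. Only `initialConnectTimeout` and `maxConnectTimeout` come from the proposal, and `attempt` stands in for the `attempts`/`tries` counters.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <random>

using std::chrono::milliseconds;

// Hypothetical helper (not part of Lix): computes the connect timeout for a
// given 0-based attempt, doubling from initialConnectTimeout and capping at
// maxConnectTimeout. jitterFraction = 0.25 means "up to +/-25%".
milliseconds computeConnectTimeout(
    unsigned attempt,
    milliseconds initialConnectTimeout,
    milliseconds maxConnectTimeout,
    double jitterFraction,
    std::mt19937 & rng)
{
    const std::int64_t max = maxConnectTimeout.count();
    std::int64_t base = initialConnectTimeout.count();

    // Double once per previous attempt; stop early once the cap is reached
    // so the value cannot overflow for large attempt counts.
    for (unsigned i = 0; i < attempt && base < max; ++i)
        base *= 2;
    base = std::min(base, max);

    // Spread retries out so that many clients do not reconnect in lockstep.
    if (jitterFraction > 0.0) {
        std::uniform_real_distribution<double> dist(
            1.0 - jitterFraction, 1.0 + jitterFraction);
        base = static_cast<std::int64_t>(static_cast<double>(base) * dist(rng));
    }

    // Never go below 1ms or above the configured maximum.
    return milliseconds(std::clamp<std::int64_t>(base, 1, max));
}
```

With a 1s initial value and a 30s maximum, this yields roughly 1s, 2s, 4s, 8s, 16s and then 30s for every later attempt. The resulting value would then be handed to curl, for example through `curl_easy_setopt` with `CURLOPT_CONNECTTIMEOUT_MS`, which takes the timeout as a `long` in milliseconds.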