Description
Summary
Add internal support for configuring a maxLifeTime (maximum connection lifetime) with per-connection jitter for HTTP connections used by the Azure Cosmos DB SDK (azure-cosmos) via the Netty HTTP client. This will be an internal implementation detail, not exposed as a public API.
Problem
Currently, there is no way to set a maximum lifetime for HTTP connections in the Cosmos DB client. Long-lived connections can cause issues in scenarios such as:
- Load balancing: Connections that persist indefinitely may not redistribute across backend instances after scaling events or rebalancing, leading to uneven load distribution.
- DNS changes: When backend IP addresses change (e.g., due to failover, migration, or infrastructure updates), existing connections stay bound to the previously resolved addresses, potentially routing traffic to unavailable or incorrect endpoints.
Why not use Reactor Netty's built-in maxLifeTime?
Reactor Netty's ConnectionProvider.Builder.maxLifeTime() applies a uniform max lifetime to all connections. This can cause a "thundering herd" problem where many connections created around the same time all expire simultaneously, leading to a spike in new connection creation and potential service disruption.
Proposed Solution
Implement a custom maxLifeTime mechanism that applies a per-connection lifetime limit with jitter. Each connection will have a slightly randomized max lifetime (e.g., base lifetime ± some random offset), ensuring that connections expire gradually over time rather than all at once.
This will not be exposed as a public configuration option at this time.
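A minimal sketch of the jitter calculation described above, using only the JDK. The class name `JitteredLifetime`, its methods, and the uniform distribution are assumptions for illustration; they are not part of azure-cosmos or Reactor Netty, and wiring the check into the connection pool is left out.

```java
import java.time.Duration;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper for illustration only; not an azure-cosmos or Reactor Netty API.
final class JitteredLifetime {
    private JitteredLifetime() {}

    /**
     * Picks a per-connection max lifetime uniformly from
     * [base - maxJitter, base + maxJitter], at millisecond granularity.
     */
    static Duration pick(Duration base, Duration maxJitter) {
        if (base.isNegative() || base.isZero()) {
            throw new IllegalArgumentException("base must be positive");
        }
        if (maxJitter.isNegative() || maxJitter.compareTo(base) >= 0) {
            throw new IllegalArgumentException("maxJitter must be in [0, base)");
        }
        long jitterMs = maxJitter.toMillis();
        // nextLong's upper bound is exclusive, so add 1 to include +jitterMs.
        long offsetMs = ThreadLocalRandom.current().nextLong(-jitterMs, jitterMs + 1);
        return base.plusMillis(offsetMs);
    }

    /**
     * Returns true once a connection created at createdAtNanos has reached its
     * assigned lifetime. Both timestamps are expected to come from System.nanoTime().
     */
    static boolean isExpired(long createdAtNanos, long nowNanos, Duration lifetime) {
        return nowNanos - createdAtNanos >= lifetime.toNanos();
    }
}
```

Under these assumptions, each connection would call `pick` once at creation time and store the result, so that connections opened in the same burst reach their limits at different moments. For example, `JitteredLifetime.pick(Duration.ofMinutes(5), Duration.ofSeconds(30))` yields a lifetime between 4m30s and 5m30s.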
Affected Packages
azure-cosmos