Wire connectionSharingAcrossClientsEnabled + Netty HTTP metrics in benchmark harness #17
Code changes:
- Add connectionSharingAcrossClientsEnabled field/getter/setter to TenantWorkloadConfig
- Add switch case in applyField() so the tenants.json value is properly applied
- Add -connectionSharingAcrossClientsEnabled CLI parameter to Configuration (JCommander)
- Apply connectionSharingAcrossClientsEnabled on CosmosClientBuilder in AsyncBenchmark
- Wire through fromConfiguration() for the legacy CLI path
- Add to toString() for debug visibility

Test plan updates:
- Expand S8 from 9 to 30 scenarios (3 protocols x 5 workloads x 2 sharing modes)
- Add ReadLatency and WriteLatency workloads
- Add isolated vs. shared connection pool dimension
- Update operations per tenant to 1,000,000
- Add metrics catalog with availability status (available vs. needs SDK change)
- Update execution runbook B13-B42 for the 30-scenario matrix
- Update run-baseline-matrix.sh script for 30 scenarios
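The applyField() merge described above can be sketched as follows. This is a simplified stand-in for TenantWorkloadConfig, not the actual class; only the field, getter/setter, and applyField() names come from the commit message:

```java
// Sketch of the applyField() merge for the new flag; the real
// TenantWorkloadConfig has many more fields and cases.
public class TenantWorkloadConfigSketch {
    private boolean connectionSharingAcrossClientsEnabled;

    public boolean isConnectionSharingAcrossClientsEnabled() {
        return connectionSharingAcrossClientsEnabled;
    }

    public void setConnectionSharingAcrossClientsEnabled(boolean enabled) {
        this.connectionSharingAcrossClientsEnabled = enabled;
    }

    // Applies one key/value pair from tenants.json (globalDefaults or a
    // per-tenant override). Without a switch case for the new field, the
    // value was silently dropped -- the "dead config" noted in F5.
    public void applyField(String name, String value) {
        switch (name) {
            case "connectionSharingAcrossClientsEnabled":
                setConnectionSharingAcrossClientsEnabled(Boolean.parseBoolean(value));
                break;
            default:
                // other fields omitted in this sketch
                break;
        }
    }
}
```

The value would then be forwarded to CosmosClientBuilder.connectionSharingAcrossClientsEnabled(...) when each tenant's client is built.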
- F2: Align with analysis doc - IMDS client is now ephemeral; A25/A27 resolved
- F4: Add detailed before/after table for every A1/A2 resource claim
- F5: New finding - connectionSharingAcrossClientsEnabled was dead config (now fixed)
- F6: New finding - Reactor Netty pool metrics and H2 stream metrics are gaps
SDK change:
- Add fixedConnectionProviderBuilder.metrics(true) in HttpClient.createFixed()
Emits reactor.netty.connection.provider.{total,active,idle,pending}.connections
gauges tagged by remote.address (hostname:port) to Micrometer globalRegistry
Benchmark change:
- Add SimpleMeterRegistry to Metrics.globalRegistry so pool metrics are queryable
- Add logPoolMetrics() helper that logs all pool metrics at POST_CREATE and POST_WORKLOAD
Shows remote.address tag to verify pooling is by hostname (not resolved IP)
- Isolated mode: pool name = 'cosmos-pool-<endpoint-host>'
- Shared mode: pool name = 'cosmos-shared-pool'
- Enables distinguishing pools by name in Reactor Netty metrics tags
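The naming convention above can be sketched with a tiny helper; the method name is hypothetical, but the two pool-name formats come from the commit message:

```java
// Sketch of the pool-name convention: isolated mode derives a
// per-endpoint name from the account hostname, shared mode uses one
// fixed name for all clients. Helper name is illustrative only.
public class PoolNameSketch {
    static String poolName(String endpointHost, boolean sharedAcrossClients) {
        return sharedAcrossClients
            ? "cosmos-shared-pool"
            : "cosmos-pool-" + endpointHost;
    }
}
```

Because the name lands in the Reactor Netty metric tags, a metrics consumer can attribute pool gauges to a specific endpoint (isolated) or to the single shared pool.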
…metrics export
- Build cosmosMicrometerRegistry once in run(), add to Metrics.globalRegistry
- Pass to prepareTenants() as a parameter (no duplicate creation)
- Reactor Netty pool metrics now export to both SimpleMeterRegistry (local) and App Insights/Graphite (if configured) via globalRegistry
New class:
- NettyHttpMetricsReporter: periodically samples Reactor Netty connection pool metrics from the Micrometer registry and writes to netty-pool-metrics.csv
- Columns: timestamp, metric, pool_id, pool_name, remote_address, value
- Started/stopped alongside the Dropwizard CsvReporter in BenchmarkOrchestrator

Cleanup:
- Remove logPoolMetrics() ad-hoc method and all its calls
- Remove SimpleMeterRegistry (cosmosMicrometerRegistry on globalRegistry is sufficient)
- Remove POOL_METRICS_TAGS debug dump
- Remove unused Gauge/Meter imports
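The CSV layout the reporter emits can be sketched like this. The column list comes from the commit message; the helper class and method are hypothetical stand-ins for the formatting inside NettyHttpMetricsReporter:

```java
// Sketch of one netty-pool-metrics.csv row. The real reporter samples
// gauge values from the Micrometer registry on a schedule; this only
// shows the row format: timestamp,metric,pool_id,pool_name,remote_address,value.
public class PoolMetricsCsvSketch {
    static final String HEADER = "timestamp,metric,pool_id,pool_name,remote_address,value";

    static String row(long timestampMs, String metric, String poolId,
                      String poolName, String remoteAddress, double value) {
        return String.join(",",
            Long.toString(timestampMs), metric, poolId,
            poolName, remoteAddress, Double.toString(value));
    }
}
```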
SDK:
- Add COSMOS.NETTY_HTTP_CLIENT_METRICS_ENABLED system property (default false)
- Add COSMOS_NETTY_HTTP_CLIENT_METRICS_ENABLED env var fallback
- ConnectionProvider.metrics(true) only called when the property is enabled
- Generic name allows enabling future Netty HTTP metrics beyond pool gauges

Benchmark:
- Add -enableNettyHttpMetrics CLI flag to Configuration
- Wire through BenchmarkConfig to BenchmarkOrchestrator
- Orchestrator sets the system property before client creation
- NettyHttpMetricsReporter only starts when the flag is enabled
- run-baseline-matrix.sh passes -enableNettyHttpMetrics
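The gating logic above (system property first, env var fallback, default false) can be sketched as a small helper. The property and env-var names come from the commit message; the class itself is a simplified stand-in for the real Configs.java accessor:

```java
// Sketch of the metrics gate: system property wins, env var is the
// fallback, and anything unset or non-"true" resolves to false.
public class NettyMetricsGateSketch {
    static final String PROPERTY = "COSMOS.NETTY_HTTP_CLIENT_METRICS_ENABLED";
    static final String ENV_VAR = "COSMOS_NETTY_HTTP_CLIENT_METRICS_ENABLED";

    static boolean nettyHttpMetricsEnabled() {
        String value = System.getProperty(PROPERTY);
        if (value == null) {
            value = System.getenv(ENV_VAR);
        }
        // Boolean.parseBoolean(null) returns false, giving the default.
        return Boolean.parseBoolean(value);
    }
}
```

Reading the gate once before client creation matches the orchestrator flow: the flag must be set before ConnectionProvider is built, since metrics(true) is applied at provider construction.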
…pplyField) Completes the http2Enabled wiring in TenantWorkloadConfig so it can be set via tenants.json globalDefaults or per-tenant overrides. HTTP/2 can also be enabled via -DCOSMOS.HTTP2_ENABLED=true system property (existing path).
The reporter was defined but never instantiated in the run() method. run() now creates and starts it before PRE_CREATE and stops it in cleanup.
…gauges Without a backing registry in Metrics.globalRegistry, the CompositeMeterRegistry registers gauges but Gauge.value() returns 0. Adding a SimpleMeterRegistry ensures the gauges have actual storage for their values, making netty-pool-metrics.csv report real connection counts.
…coded 100 Flux.merge concurrency during document pre-population was hardcoded to 100, causing ~100 TCP connections to be opened per tenant regardless of the configured concurrency (typically 20). Now uses min(cfg.getConcurrency(), 100) so the number of pre-warmed connections matches the actual workload concurrency.
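The clamp can be sketched as below. The Math.min bound and the later clamp-to-1 guard both come from commits in this PR; the helper name is illustrative:

```java
// Sketch of the pre-population concurrency fix: previously a hardcoded
// 100 (opening ~100 TCP connections per tenant), now bounded by the
// configured workload concurrency and clamped to at least 1.
public class PrePopConcurrencySketch {
    static int prePopConcurrency(int configuredConcurrency) {
        return Math.max(1, Math.min(configuredConcurrency, 100));
    }
}
```

With the typical configured concurrency of 20, Flux.merge now runs 20 concurrent inner publishers instead of 100, so the pre-warmed connection count matches what the workload will actually use.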
Add .gitignore entries for: - .github/agents/ - .github/skills/ - sdk/cosmos/azure-cosmos-benchmark/docs/ - sdk/cosmos/azure-cosmos-benchmark/scripts/ These are local-only files that should not be tracked in the repository. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…/xinlian12/azure-sdk-for-java into wireConnectionSharingInBenchmark
All 50 tenants were sharing a single Timer with synchronized HdrHistogramResetOnSnapshotReservoir,
causing ~7x throughput reduction for *Latency operations (9,400 vs 68,600 ops/s).
Fix: prefix meter names with tenant ID (e.g., 'tenant-0.Latency') so each tenant
gets its own Timer instance. Throughput/failure Meters also prefixed for consistency.
Root cause: MetricRegistry.register('Latency', timer) registered ONE timer in the shared
registry. All 50 tenants' LatencySubscriber.hookOnComplete() called context.stop() which
serialized through the same synchronized reservoir.update() method.
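The prefixing fix can be illustrated with a plain Map standing in for the Codahale MetricRegistry (the real code registers Timer/Meter instances; the stand-in only demonstrates the name-keyed identity):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the fix. Before: every tenant registered under the bare
// name "Latency" and got back the same Timer, so all histogram updates
// serialized on one synchronized reservoir. After: prefixing the name
// with the tenant id (e.g. "tenant-0.Latency") gives each tenant its
// own instance.
public class TenantMeterNamingSketch {
    static final Map<String, Object> registry = new HashMap<>();

    // Stand-in for registry.timer(name): one object per distinct name.
    static Object timerFor(String tenantId, String meterName) {
        return registry.computeIfAbsent(tenantId + "." + meterName, k -> new Object());
    }
}
```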
…contention" This reverts commit 922835a.
- Move NETTY_HTTP_CLIENT_METRICS_ENABLED system property into setGlobalSystemProperties
- Wrap run() lifecycle in try/finally to ensure cleanup on exceptions
- Stop NettyHttpMetricsReporter and remove SimpleMeterRegistry in cleanup
- Guard against zero prePopConcurrency by clamping to 1 and skipping an empty list
- Log IOException with full stack trace in NettyHttpMetricsReporter
- Reword IMDS metadata debug message to avoid a definitive claim

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This is an orchestrator-level JVM-global system property, not a per-tenant config. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
Wire connectionSharingAcrossClientsEnabled through the benchmark harness and add Reactor Netty HTTP client connection pool metrics for multi-tenancy testing.

Code Changes
Benchmark Harness (azure-cosmos-benchmark)
New classes:
- BenchmarkOrchestrator - Lifecycle orchestrator (create -> run -> close -> settle x N cycles)
- BenchmarkConfig - Orchestrator-level config (cycles, settle time, reporting)
- TenantWorkloadConfig - Per-tenant config from tenants.json with applyField() merging
- NettyHttpMetricsReporter - Samples Reactor Netty connection pool metrics to CSV
- ThreadPrefixGaugeSet - Groups threads by name prefix for JVM stats
- Operation enum - Case-insensitive lookup + JCommander converter

Wired new config fields:
- connectionSharingAcrossClientsEnabled - CLI flag + tenants.json + CosmosClientBuilder
- http2Enabled - tenants.json field
- enableNettyHttpMetrics - CLI flag to enable Reactor Netty pool metrics
- enableJvmStats - CLI flag for GC/thread/memory JVM gauges

Fixed:
- Pre-population concurrency clamped to Math.min(cfg.getConcurrency(), 100). FDs: 5,100 -> 1,100 (78% reduction).

SDK Changes (azure-cosmos)
- Configs.java: Add COSMOS.NETTY_HTTP_CLIENT_METRICS_ENABLED system property (default false)
- HttpClient.java: Conditionally call ConnectionProvider.metrics(true) when enabled
- RxDocumentClientImpl.java: Per-endpoint pool names (cosmos-pool-<host> / cosmos-shared-pool)

Baseline Test Results
Test environment: Azure VM D16s_v5 (16 cores, 64 GB) in West US 2, same region as Cosmos DB accounts
Common parameters:
-Xmx8g -Xms8g, G1GC

ReadThroughput: HTTP/1.1 vs HTTP/2 x Isolated vs Shared
Measures aggregate ops/sec via Codahale Meter.

Throughput:
```mermaid
xychart-beta
    title "ReadThroughput: HTTP/1.1 vs HTTP/2 (ops/s, 1-min rate)"
    x-axis "Minutes" [1,3,5,7,9,11,13,15,17,19,21,23,25,27,29]
    y-axis "ops/sec" 40000 --> 75000
    line "H1.1" [55467,68904,70545,69989,69072,69141,69231,69159,68968,66870,65884,67137,67847,67957,68467]
    line "H2" [49116,56673,57755,57508,57254,57217,57164,57170,57183,57171,57214,57178,57244,57237,56784]
```

Resource utilization over time (ReadThroughput):
```mermaid
xychart-beta
    title "CPU Usage: HTTP/1.1 vs HTTP/2 (% of 16 cores)"
    x-axis "Minutes" [1,3,5,7,9,11,13,15,17,19,21,23,25,27,29]
    y-axis "CPU %" 0 --> 100
    line "H1.1" [0,77,83,86,89,93,93,94,95,94,95,95,95,97,97]
    line "H2" [5,62,72,77,81,84,87,89,91,93,93,94,94,94,95]
```

Resource Consumption:
[Resource-consumption table residue: values sampled from /proc and via ThreadMXBean]

Connection Pool:
Connection utilization over time (ReadThroughput, regional endpoints):
```mermaid
xychart-beta
    title "HTTP/1.1: Active vs Idle Connections (total=1000)"
    x-axis "Minutes" [1,3,5,7,9,11,13,15,17,19,21,23,25,27,29]
    y-axis "Connections" 0 --> 1100
    line "Active" [681,670,711,688,693,698,678,632,725,692,655,654,664,759,676]
    line "Idle" [321,323,267,340,295,302,344,377,295,289,338,356,297,244,339]
```

```mermaid
xychart-beta
    title "HTTP/2: Active Connections vs Active Streams (total TCP=800)"
    x-axis "Minutes" [1,3,5,7,9,11,13,15,17,19,21,23,25,27,29]
    y-axis "Count" 0 --> 850
    line "TCP conns" [720,800,800,800,800,800,800,800,800,800,800,800,800,800,800]
    line "H2 active conns" [0,218,216,210,228,204,236,255,165,183,208,227,201,198,220]
    line "Active streams" [19,765,734,642,672,706,736,761,753,719,792,728,742,720,690]
```

[Metric-name table residue: total.connections, active.connections, idle.connections; H2 pool: active.connections (carrying streams), idle.connections (no active streams), active.streams]

All accounts show the same pattern: 16 TCP connections (= minConnectionPoolSize), 5-8 actively streaming, 13-19 active streams (~2-3 streams/connection). The base pool reports all 16 as "active" (open TCP), while the H2 pool shows real stream-level usage. Multiplexing is happening, but at a low ratio; higher concurrency or fewer connections would increase stream density.

Thread Breakdown (Java threads only, mid-run snapshot via ThreadMXBean):
- transport-response-bounded-elastic
- tenant-worker
- partition-availability-staleness-check
- cosmos-daemon-cosmos-global-endpoint-mgr
- reactor-http-epoll
- parallel
- cosmos-parallel
- boundedElastic-evictor

ReadLatency: HTTP/1.1 vs HTTP/2
Measures per-operation latency via Codahale Timer with HDR Histogram.

Latency Percentiles (ms):
| P99 | 4.60 | 4.56 | 6.18 | 2.99 |
Throughput:
Latency over time (ms):
```mermaid
xychart-beta
    title "ReadLatency P50: HTTP/1.1 vs HTTP/2 (ms)"
    x-axis "Minutes" [1,3,5,7,9,11,13,15,17,19,21,23,25,27,29]
    y-axis "P50 ms" 1.5 --> 2.5
    line "H1.1" [2.02,1.99,2.00,1.97,1.98,1.98,1.98,1.98,1.99,1.98,1.98,1.97,1.98,1.98,1.97]
    line "H2" [2.31,2.11,2.13,2.13,2.13,2.13,2.13,2.13,2.13,2.13,2.13,2.13,2.13,2.13,2.13]
```

```mermaid
xychart-beta
    title "ReadLatency P99: HTTP/1.1 vs HTTP/2 (ms)"
    x-axis "Minutes" [1,3,5,7,9,11,13,15,17,19,21,23,25,27,29]
    y-axis "P99 ms" 3 --> 8
    line "H1.1" [4.85,4.55,4.62,4.55,4.59,4.62,4.55,4.65,4.62,4.62,4.65,4.52,4.62,4.55,4.55]
    line "H2" [5.70,5.77,5.73,5.73,5.70,5.73,5.80,5.77,5.73,5.73,5.67,5.67,5.67,5.80,5.73]
```

Resource Consumption:
WriteThroughput: HTTP/1.1 vs HTTP/2 x Isolated vs Shared
Measures aggregate write ops/sec via Codahale Meter.

Throughput:
Throughput over time (1-min rate, ops/s):
```mermaid
xychart-beta
    title "WriteThroughput: HTTP/1.1 vs HTTP/2 (ops/s)"
    x-axis "Minutes" [1,3,5,7,9,11,13,15,17,19,21,23,25,27,29]
    y-axis "ops/sec" 25000 --> 75000
    line "H1.1" [34511,63717,66585,66737,67877,68544,68763,68833,69028,69154,67873,68678,68903,68913,68853]
    line "H2" [28474,51141,53366,53708,54359,54755,54865,54788,54685,54810,54838,54768,54786,54791,54905]
```

Resource Consumption:
Connection Pool:
WriteLatency: HTTP/1.1 vs HTTP/2
Measures per-operation write latency via Codahale Timer with HDR Histogram.

Latency Percentiles (ms):
Throughput:
Resource Consumption:
Thread Breakdown During Active Workload (ReadThroughput, mid-run snapshot)
- transport-response-bounded-elastic
- tenant-worker
- partition-availability-staleness-check
- cosmos-daemon-cosmos-global-endpoint-mgr
- reactor-http-epoll
- parallel
- cosmos-parallel
- boundedElastic-evictor

Key Findings
F7: Per-Client Thread Cost (~6.2 threads each)
- cosmos-global-endpoint-mgr
- partition-availability-staleness-check
- transport-response-bounded-elastic

F8: Pre-Population Concurrency Fix
FDs: 5,100 -> 1,100. Connection utilization: 15% -> 67.6%. Throughput unchanged.
F9: Connection Pool Keyed by Hostname, Not IP
50 accounts resolve to only 4 IPs, yet there are 100 pool slots (hostname-based keying).
connectionSharingAcrossClientsEnabled is a no-op for multi-account setups.
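F9 can be illustrated with a small sketch: when pools are keyed by hostname, accounts sharing an IP still get separate pool slots. The host-to-IP data below is invented purely for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of hostname-based pool keying: distinct hostnames mean
// distinct pool slots, even when they resolve to the same IP.
public class HostKeyedPoolSketch {
    static int distinctPoolSlots(Map<String, String> hostToIp) {
        return hostToIp.keySet().size(); // keyed by hostname, not IP
    }

    static int distinctIps(Map<String, String> hostToIp) {
        return (int) hostToIp.values().stream().distinct().count();
    }
}
```

This is why sharing the connection provider across clients does not reduce connection counts for a multi-account workload: each account hostname still owns its own slots.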
QueryOrderby with 50 tenants + Micrometer metrics caused extreme contention in CosmosMicrometerMetricsOptions.getMeterOptions() and TimeWindowPercentileHistogram.recordLong(). A thread dump showed 3+ threads contending on ConcurrentHashMap node locks and histogram recording. Zero throughput for 8+ hours. Not a deadlock: contention-induced CPU burn. Only affects query operations with metrics enabled.

Future Optimization Opportunities