From 588f1f2ff406239b178ff035726e524025ec72c3 Mon Sep 17 00:00:00 2001 From: bailey Date: Tue, 25 Nov 2025 16:00:54 -0700 Subject: [PATCH 01/55] initial commit --- source/index.md | 1 + source/mongodb-handshake/handshake.md | 1 + source/retryable-reads/retryable-reads.md | 13 +++++++++++++ source/retryable-writes/retryable-writes.md | 19 +++++++++++++++---- 4 files changed, 30 insertions(+), 4 deletions(-) diff --git a/source/index.md b/source/index.md index 29b2a51e86..bcff8b0ffd 100644 --- a/source/index.md +++ b/source/index.md @@ -12,6 +12,7 @@ - [CRUD API](crud/crud.md) - [Causal Consistency Specification](causal-consistency/causal-consistency.md) - [Change Streams](change-streams/change-streams.md) +- [Client Backpressure](client-backpressure/client-backpressure.md) - [Client Side Encryption](client-side-encryption/client-side-encryption.md) - [Client Side Operations Timeout](client-side-operations-timeout/client-side-operations-timeout.md) - [Collation](collation/collation.md) diff --git a/source/mongodb-handshake/handshake.md b/source/mongodb-handshake/handshake.md index 24c6eea50d..5a62627326 100644 --- a/source/mongodb-handshake/handshake.md +++ b/source/mongodb-handshake/handshake.md @@ -84,6 +84,7 @@ if stable_api_configured or client_options.load_balanced: else: cmd = {"legacy hello": 1, "helloOk": 1} conn.supports_op_msg = False # Send the initial command via OP_QUERY. +cmd["backpressure"] = True cmd["client"] = client_metadata if client_options.compressors: cmd["compression"] = client_options.compressors diff --git a/source/retryable-reads/retryable-reads.md b/source/retryable-reads/retryable-reads.md index d715f774ef..a7100cbe92 100644 --- a/source/retryable-reads/retryable-reads.md +++ b/source/retryable-reads/retryable-reads.md @@ -15,6 +15,10 @@ This specification will - outline how an API for retryable read operations will be implemented in drivers - define an option to enable retryable reads for an application. 
+The changes in this specification are related to but distinct from the retryability behaviors defined in the client +backpressure specification, which defines a retryability mechanism for all commands under certain server conditions. +Unless otherwise noted, the changes in this specification refer only to the retryability behaviors summarized above. + ## META The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and @@ -267,6 +271,13 @@ The following pseudocode for executing retryable read commands has been adapted [the pseudocode for executing retryable write commands](../retryable-writes/retryable-writes.md#executing-retryable-write-commands) and reflects the flow described above. +> [!NOTE] +> The rules above and the pseudocode below only demonstrate the rules for retryable reads as outlined in this +> specification. For simplicity, and to make the retryable reads rules easier to follow, the pseudocode was +> intentionally left unmodified. For a pseudocode block that contains both retryable reads logic as defined in this +> specification and backoff retryability as defined in the client backpressure specification, see the pseudocode in +> the [Backpressure Specification](../client-backpressure/client-backpressure.md). + ```typescript /** * Checks if a connection supports retryable reads. @@ -547,6 +558,8 @@ any customers experiencing degraded performance can simply disable `retryableRea ## Changelog +- xxxx-xx-xx: Clarify handling of deprioritized servers in pseudocode. + - 2024-04-30: Migrated from reStructuredText to Markdown.
- 2023-12-05: Add that any server information associated with retryable exceptions MUST reflect the originating server, diff --git a/source/retryable-writes/retryable-writes.md b/source/retryable-writes/retryable-writes.md index 609de18b92..fa613a2908 100644 --- a/source/retryable-writes/retryable-writes.md +++ b/source/retryable-writes/retryable-writes.md @@ -19,6 +19,10 @@ specification will outline how an API for retryable write operations will be imp will define an option to enable retryable writes for an application and describe how a transaction ID will be provided to write commands executed therein. +The changes in this specification are related to but distinct from the retryability behaviors defined in the client +backpressure specification, which defines a retryability mechanism for all commands under certain server conditions. +Unless otherwise noted, the changes in this specification refer only to the retryability behaviors summarized above. + ## META The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and @@ -43,10 +47,10 @@ specification. This object is always associated with a server session; however, that creating a ClientSession will not always entail creation of a new server session. The name of this object MAY vary across drivers. -**Retryable Error** +**Retryable Write Error** An error is considered retryable if it has a RetryableWriteError label in its top-level "errorLabels" field. See -[Determining Retryable Errors](#determining-retryable-errors) for more information. +[Determining Retryable Write Errors](#determining-retryable-write-errors) for more information. Additional terms may be defined in the [Driver Session](../sessions/driver-sessions.md) specification. @@ -102,7 +106,7 @@ In a sharded cluster, it is possible that mongos may appear to support retryable cluster do not (e.g.
replica set shard is configured with feature compatibility version 3.4, a standalone is added as a new shard). In these rare cases, a write command that fans out to a shard that does not support retryable writes may partially fail and an error may be reported in the write result from mongos (e.g. `writeErrors` array in the bulk write -result). This does not constitute a retryable error. Drivers MUST relay such errors to the user. +result). This does not constitute a retryable write error. Drivers MUST relay such errors to the user. #### Supported Write Operations @@ -162,7 +166,7 @@ occurs during a write command within a transaction (excepting `commitTransation` ### Implementing Retryable Writes -#### Determining Retryable Errors +#### Determining Retryable Write Errors When connected to a MongoDB instance that supports retryable writes (versions 3.6+), the driver MUST treat all errors with the RetryableWriteError label as retryable. This error label can be found in the top-level "errorLabels" field of @@ -333,6 +337,13 @@ errors are labeled "NoWritesPerformed", then the first error should be raised. If a driver associates server information (e.g. the server address or description) with an error, the driver MUST ensure that the reported server information corresponds to the server that originated the error. +> [!NOTE] +> The rules above and the pseudocode below only demonstrate the rules for retryable writes as outlined in this +> specification. For simplicity, and to make the retryable writes rules easier to follow, the pseudocode was +> intentionally left unmodified. For a pseudocode block that contains both retryable writes logic as defined in this +> specification and backoff retryability as defined in the client backpressure specification, see the pseudocode in +> the [Backpressure Specification](../client-backpressure/client-backpressure.md).
+ The above rules are implemented in the following pseudo-code: ```typescript From e467f5bc0409b550ba3f392e6b163f97337ba901 Mon Sep 17 00:00:00 2001 From: bailey Date: Wed, 26 Nov 2025 10:15:49 -0700 Subject: [PATCH 02/55] new files --- .../client-backpressure.md | 424 ++++++++++++++++++ source/client-backpressure/tests/README.md | 15 + 2 files changed, 439 insertions(+) create mode 100644 source/client-backpressure/client-backpressure.md create mode 100644 source/client-backpressure/tests/README.md diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md new file mode 100644 index 0000000000..88dd94807e --- /dev/null +++ b/source/client-backpressure/client-backpressure.md @@ -0,0 +1,424 @@ +# Client Backpressure + +- Status: Accepted +- Minimum Server Version: N/A + +______________________________________________________________________ + +## Abstract + +This specification adds the ability for drivers to automatically retry requests that fail due to server overload errors +while applying backpressure to avoid further overloading the server. + +## META + +The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and +"OPTIONAL" in this document are to be interpreted as described in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt). + +## Specification + +### Terms + +#### Ingress Connection Rate Limiter + +A token-bucket based system introduced in MongoDB 8.2 to admit, reject or queue connection requests. It aims to prevent +connection spikes from overloading the system. + +#### Ingress Request Rate Limiter + +A token bucket based system introduced in MongoDB 8.2 to admit an operation or reject it with a System Overload Error at +the front door of a mongod/s. It aims to prevent operations spikes from overloading the system. 
+ +#### MongoTune + +MongoTune is a policy engine outside the server (mongod or mongos) which monitors a set of metrics (MongoDB or system +host) to dynamically configure MongoDB settings. MongoTune is deployed to Atlas clusters and will dynamically configure +the connection and request rate limiters to prevent and mitigate overloading the system. + +#### RetryableError label + +An error is considered retryable if it includes the "RetryableError" label. This error label indicates that an operation +is safely retryable regardless of the type of operation, its metadata, or any of its arguments. + +#### SystemOverloadedError label + +An error is considered an overload error if it includes the "SystemOverloadedError" label. This error label indicates that the +server is overloaded. If this error label is present, drivers will back off before attempting a retry. + +#### Overload Errors + +An overload error is any command or network error that occurs due to a server overload. For example, when a request +exceeds the ingress request rate limit: + +```js +{ + 'ok': 0.0, + 'errmsg': "Rate limiter 'ingressRequestRateLimiter' rate exceeded", + 'code': 462, + 'codeName': 'IngressRequestRateLimitExceeded', + 'errorLabels': ['SystemOverloadedError', 'RetryableError'], +} +``` + +When a new connection attempt exceeds the ingress connection rate limit, the server closes the TCP connection before the TLS +handshake is complete. Drivers will observe this as a network error (e.g. "connection reset by peer" or "connection +closed"). + +When a new connection attempt is queued by the server for so long that the driver-side timeout expires, drivers will +observe this as a network timeout error. + +### Requirements for Client Backpressure + +#### Overload retry policy + +This specification expands the driver's retry ability to all commands, including those not currently considered +retryable such as updateMany, create collection, getMore, and generic runCommand.
The new command execution method obeys +the following rules: + +1. If the command succeeds on the first attempt, drivers MUST deposit `RETRY_TOKEN_RETURN_RATE` tokens. + - The value is 0.1 and non-configurable. +2. If the command succeeds on a retry attempt, drivers MUST deposit `RETRY_TOKEN_RETURN_RATE`+1 tokens. +3. If a retry attempt fails with an error that does not include the `SystemOverloadedError` label, drivers MUST deposit 1 + token. +4. A retry attempt will only be permitted if the error includes the `RetryableError` label, `MAX_ATTEMPTS` has not been + reached, the CSOT deadline has not expired, and a token can be acquired from the token bucket. + - The value of `MAX_ATTEMPTS` is 5 and non-configurable. + - To avoid retry storms, this intentionally changes the behavior of CSOT, which would otherwise retry an unlimited + number of times within the timeout. +5. If the previous error includes the `SystemOverloadedError` label, the client MUST apply exponential backoff according + to the following formula: `delayMS = j * min(maxBackoff, baseBackoff * 2^i)` + - `i` is the retry attempt (starting with 0 for the first retry). + - `j` is a random jitter value between 0 and 1. + - `baseBackoff` is a constant 100ms. + - `maxBackoff` is 10000ms. + - This results in delays of 100ms, 200ms, 400ms, 800ms, and 1600ms before accounting for jitter. +6. If the previous error contained the `SystemOverloadedError` error label, the node will be added to the set of + deprioritized servers. + +#### Pseudocode + +The following pseudocode describes the overload retry policy: + +```python +BASE_BACKOFF = 0.1 +MAX_BACKOFF = 10 +RETRY_TOKEN_RETURN_RATE = 0.1 + +def execute_command_retryable(command, ...): + deprioritized_servers = [] + attempt = 0 + while True: + try: + server = select_server(deprioritized_servers) + connection = server.getConnection() + res = execute_command(connection, command) + # Return tokens to the bucket on success.
+ tokens = RETRY_TOKEN_RETURN_RATE + if attempt > 0: + tokens += 1 + token_bucket.deposit(tokens) + return res + except PyMongoError as exc: + backoff = 0 + attempt += 1 + + if attempt > MAX_ATTEMPTS: + raise + + # Raise if the error is not retryable. + is_retryable = exc.has_error_label("RetryableError") or is_retryable_write_error() or is_retryable_read_error() + if not is_retryable: + raise + if exc.has_error_label("SystemOverloadedError"): + jitter = random.random() # Random float in [0.0, 1.0). + backoff = jitter * min(BASE_BACKOFF * (2 ** (attempt - 1)), MAX_BACKOFF) # exponent is 0 for the first retry + + # If the delay exceeds the deadline, bail early before consuming a token. + if _csot.get_timeout(): + if time.monotonic() + backoff > _csot.get_deadline(): + raise + + if not token_bucket.consume(1): + raise + + if backoff: + time.sleep(backoff) + deprioritized_servers.append(server) + continue +``` + +Some drivers might not have retryability implementations that allow easy separation of the existing retryable +reads/writes mechanisms from the exponential backoff and jitter retry algorithm. Example pseudocode is given below +that demonstrates a combined retryable reads/writes implementation with the corresponding backpressure changes (adapted +from the Node driver's implementation): + +```typescript +async function tryOperation>( + operation: T, + { topology, timeoutContext, session, readPreference }: RetryOptions +): Promise { + const serverSelector = getServerSelectorForReadPreference(operation, readPreference); + + let server = await topology.selectServer(serverSelector, { + session, + }); + + const hasReadAspect = operation.hasAspect(Aspect.READ_OPERATION); + const hasWriteAspect = operation.hasAspect(Aspect.WRITE_OPERATION); + const inTransaction = session?.inTransaction() ??
false; + + const willRetryRead = topology.s.options.retryReads && !inTransaction && operation.canRetryRead; + + const willRetryWrite = + topology.s.options.retryWrites && + !inTransaction && + supportsRetryableWrites(server) && + operation.canRetryWrite; + + const willRetry = + operation.hasAspect(Aspect.RETRYABLE) && + session != null && + ((hasReadAspect && willRetryRead) || (hasWriteAspect && willRetryWrite)); + + if (hasWriteAspect && willRetryWrite && session != null) { + operation.options.willRetryWrite = true; + session.incrementTransactionNumber(); + } + + // The maximum number of retry attempts using regular retryable reads/writes logic (not including + // SystemOverloadedError retries). + const maxNonOverloadRetryAttempts = willRetry + ? timeoutContext.csotEnabled() + ? Infinity + : 2 + : 1; + + let previousOperationError: MongoError | undefined; + let previousServer: ServerDescription | undefined; + + let nonOverloadRetryAttempt = 0; + let systemOverloadRetryAttempt = 0; + + const maxSystemOverloadRetryAttempts = 5; + const backoffDelayProvider = exponentialBackoffDelayProvider( + 10_000, // MAX_BACKOFF + 100, // base backoff + 2 // backoff rate + ); + + while (true) { + if (previousOperationError) { + if (previousOperationError.hasErrorLabel("SystemOverloadedError")) { + systemOverloadRetryAttempt += 1; + + if ( + // if the overload error is not retryable, throw. + !previousOperationError.hasErrorLabel("RetryableError") || + !( + // if retryable writes or reads are not configured, throw. + ( + (hasReadAspect && topology.s.options.retryReads) || + (hasWriteAspect && topology.s.options.retryWrites) + ) + ) + ) { + throw previousOperationError; + } + + // if we have exhausted overload retry attempts, throw. + if (systemOverloadRetryAttempt > maxSystemOverloadRetryAttempts) { + throw previousOperationError; + } + + const { value: delayMS } = backoffDelayProvider.next(); + + // if the delay would exhaust the CSOT timeout, short-circuit.
+ if (timeoutContext.csotEnabled() && delayMS > timeoutContext.remainingTimeMS) { + throw previousOperationError; + } + + await setTimeout(delayMS); + + // attempt to consume a retry token, throw if we don't have budget. + if (!topology.tokenBucket.consume(RETRY_COST)) { + throw previousOperationError; + } + + server = await topology.selectServer(serverSelector, { session }); + } else { + nonOverloadRetryAttempt++; + // we have no more retry attempts, throw. + if (nonOverloadRetryAttempt > maxNonOverloadRetryAttempts) { + throw previousOperationError; + } + + // Handle MMAPv1 not supporting retryable writes. + if (hasWriteAspect && previousOperationError.code === MMAPv1_RETRY_WRITES_ERROR_CODE) { + throw new MongoServerError({ + message: MMAPv1_RETRY_WRITES_ERROR_MESSAGE, + errmsg: MMAPv1_RETRY_WRITES_ERROR_MESSAGE, + originalError: previousOperationError + }); + } + + // handle non-retryable errors + if ( + (hasWriteAspect && !isRetryableWriteError(previousOperationError)) || + (hasReadAspect && !isRetryableReadError(previousOperationError)) + ) { + throw previousOperationError; + } + + server = await topology.selectServer(serverSelector, { session }); + + // handle rare downgrade scenarios where some nodes don't support + // retryable writes but others do. + if (hasWriteAspect && !supportsRetryableWrites(server)) { + throw new MongoUnexpectedServerResponseError( + 'Selected server does not support retryable writes' + ); + } + } + } + + try { + try { + const result = await server.command(operation, timeoutContext); + const isRetry = nonOverloadRetryAttempt > 0 || systemOverloadRetryAttempt > 0; + topology.tokenBucket.deposit( + isRetry + ? // on successful retry, deposit the retry cost + the refresh rate. + TOKEN_REFRESH_RATE + RETRY_COST + : // otherwise, just deposit the refresh rate.
+ TOKEN_REFRESH_RATE + ); + return operation.handleOk(result); + } catch (error) { + return operation.handleError(error); + } + } catch (operationError) { + if (!operationError.hasErrorLabel("SystemOverloadedError")) { + // if an operation fails with an error that does not contain the SystemOverloadedError label, deposit 1 token. + topology.tokenBucket.deposit(RETRY_COST); + } + + if ( + previousOperationError != null && + operationError.hasErrorLabel("NoWritesPerformed") + ) { + throw previousOperationError; + } + previousServer = server.description; + previousOperationError = operationError; + } + } +} +``` + +### Token Bucket + +The overload retry policy introduces a per-client token bucket to limit retry attempts. Although the server rejects +excess operations as quickly as possible, doing so costs CPU and creates extra contention on the connection pool which +can eventually negatively affect goodput. To reduce this risk, the token bucket will limit retry attempts during a +prolonged overload. + +The token bucket capacity is set to 1000 for consistency with the server. + +#### Pseudocode + +The token bucket is implemented via a thread-safe counter. For languages without atomics, this can be implemented via a +lock, for example: + +```python +DEFAULT_RETRY_TOKEN_CAPACITY = 1000 +class TokenBucket: + """A token bucket implementation for rate limiting.""" + def __init__( + self, + capacity: float = DEFAULT_RETRY_TOKEN_CAPACITY, + ): + self.lock = Lock() + self.capacity = capacity + self.tokens = capacity + + def consume(self, n: float) -> bool: + """Consume n tokens from the bucket if available.""" + with self.lock: + if self.tokens >= n: + self.tokens -= n + return True + return False + + def deposit(self, n: float) -> None: + """Deposit n tokens back into the bucket.""" + with self.lock: + self.tokens = min(self.capacity, self.tokens + n) +``` + +#### Handshake changes + +Drivers conforming to this spec MUST add `"backpressure": true` to the connection handshake.
This flag allows the server +to identify clients which do and do not support backpressure. Currently, this flag is unused but in the future the +server may offer different rate limiting behavior for clients that do not support backpressure. + +##### Implementation notes + +On some platforms sleep() can have a very low precision, meaning an attempt to sleep for 50ms may actually sleep for a +much larger time frame. Drivers are not required to work around this limitation. + +### Logging Retry Attempts + +[As with retryable writes](../retryable-writes/retryable-writes.md#logging-retry-attempts), drivers MAY choose to log +retry attempts for load shed operations. This specification does not define a format for such log messages. + +### Command Monitoring + +[As with retryable writes](../retryable-writes/retryable-writes.md#command-monitoring), in accordance with the +[Command Logging and Monitoring](../command-logging-and-monitoring/command-logging-and-monitoring.md) specification, +drivers MUST guarantee that each `CommandStartedEvent` has either a correlating `CommandSucceededEvent` or +`CommandFailedEvent` and that every "command started" log message has either a correlating "command succeeded" log +message or "command failed" log message. If the first attempt of a retryable operation encounters a retryable error, +drivers MUST fire a `CommandFailedEvent` and emit a "command failed" log message for the retryable error and fire a +separate `CommandStartedEvent` and emit a separate "command started" log message when executing the subsequent retry +attempt. Note that the second `CommandStartedEvent` and "command started" log message may have a different +`connectionId`, since a server is reselected for a retry attempt. + +### Documentation + +1. Drivers MUST document that all operations support retries on server overload. +2. 
Driver release notes MUST make it clear to users that they may need to adjust custom retry logic to prevent an + application from inadvertently retrying for too long (see [Backwards Compatibility](#backwards-compatibility) for + details). + +## Test Plan + +See the [README](./tests/README.md) for tests. + +## Motivation for Change + +New load shedding mechanisms are being introduced to the server that improve its ability to remain available under +extreme load; however, clients do not know how to handle the errors returned when one of their requests has been rejected. +As a result, such overload errors would currently either be propagated back to applications, increasing +externally-visible command failure rates, or be retried immediately, increasing the load on already overburdened +servers. To minimize these effects, this specification enables clients to retry requests that have been load shed in a +way that does not overburden already overloaded servers. This retry behavior allows for more aggressive and effective +load shedding policies to be deployed in the future. This will also help unify the currently-divergent retry behavior +between drivers and the server (mongos). + +## Reference Implementation + +The Node and Python drivers will provide the reference implementations. See +[NODE-7142](https://jira.mongodb.org/browse/NODE-7142) and [PYTHON-5528](https://jira.mongodb.org/browse/PYTHON-5528). + +## Future work + +1. [DRIVERS-3333](https://jira.mongodb.org/browse/DRIVERS-3333) Add a backoff state into the connection pool. +2. [DRIVERS-3241](https://jira.mongodb.org/browse/DRIVERS-3241) Add diagnostic metadata to retried commands. + +## Q&A + +TODO + +## Changelog + +- 2025-XX-XX: Initial version.
diff --git a/source/client-backpressure/tests/README.md b/source/client-backpressure/tests/README.md new file mode 100644 index 0000000000..7a70e4ad76 --- /dev/null +++ b/source/client-backpressure/tests/README.md @@ -0,0 +1,15 @@ +# Client Backpressure Tests + +______________________________________________________________________ + +## Introduction + +The YAML and JSON files in this directory are platform-independent tests meant to exercise a driver's implementation of +client backpressure. These tests utilize the [Unified Test Format](../../unified-test-format/unified-test-format.md). + +Several prose tests, which are not easily expressed in YAML, are also presented in this file. Those tests will need to +be manually implemented by each driver. + +## Changelog + +- 2025-XX-XX: Initial version. From d55fdb9ebbcecc5f9f15d16b8d9d5151495b3713 Mon Sep 17 00:00:00 2001 From: bailey Date: Wed, 26 Nov 2025 14:25:13 -0700 Subject: [PATCH 03/55] add tests for handshake changes --- source/mongodb-handshake/tests/README.md | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/source/mongodb-handshake/tests/README.md b/source/mongodb-handshake/tests/README.md index d88a49fa92..0ba58b713a 100644 --- a/source/mongodb-handshake/tests/README.md +++ b/source/mongodb-handshake/tests/README.md @@ -486,3 +486,17 @@ Before each test case, perform the setup. 7. Store the response as `updatedClientMetadata`. 8. Assert that `initialClientMetadata` is identical to `updatedClientMetadata`. + +### Test 9: Handshake documents include `backpressure: true` + +These tests require a mechanism for observing handshake documents sent to the server. + +1. Create a `MongoClient` that is configured to record all handshake documents sent to the server as a part of + connection establishment. + +2. Send a `ping` command to the server and verify that the command succeeds. This ensures that a connection is + established on all topologies. + +3.
Assert that for every handshake document intercepted: + + 1. the document has a field `backpressure` whose value is `true`. From 8e74b418404a573d6d622122e031b088871d169b Mon Sep 17 00:00:00 2001 From: bailey Date: Mon, 1 Dec 2025 15:59:14 -0700 Subject: [PATCH 04/55] add generated tests --- .../tests/backpressure-retry-loop.json | 3403 +++++++++++++++++ .../tests/backpressure-retry-loop.yml | 1860 +++++++++ .../backpressure-retry-loop.yml.template | 128 + .../backpressure-retry-max-attempts.json | 1262 ++++++ .../tests/backpressure-retry-max-attempts.yml | 753 ++++ ...ckpressure-retry-max-attempts.yml.template | 72 + ...enerate-backpressure-retryability-tests.py | 125 + 7 files changed, 7603 insertions(+) create mode 100644 source/client-backpressure/tests/backpressure-retry-loop.json create mode 100644 source/client-backpressure/tests/backpressure-retry-loop.yml create mode 100644 source/client-backpressure/tests/backpressure-retry-loop.yml.template create mode 100644 source/client-backpressure/tests/backpressure-retry-max-attempts.json create mode 100644 source/client-backpressure/tests/backpressure-retry-max-attempts.yml create mode 100644 source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template create mode 100644 source/etc/generate-backpressure-retryability-tests.py diff --git a/source/client-backpressure/tests/backpressure-retry-loop.json b/source/client-backpressure/tests/backpressure-retry-loop.json new file mode 100644 index 0000000000..21fc802344 --- /dev/null +++ b/source/client-backpressure/tests/backpressure-retry-loop.json @@ -0,0 +1,3403 @@ +{ + "description": "tests that operations respect overload backoff retry loop", + "schemaVersion": "1.0", + "runOnRequirements": [ + { + "minServerVersion": "4.4", + "topologies": [ + "replicaset", + "sharded", + "load-balanced" + ] + } + ], + "createEntities": [ + { + "client": { + "id": "client", + "useMultipleMongoses": false, + "observeEvents": [ + "commandStartedEvent", + 
"commandSucceededEvent", + "commandFailedEvent" + ] + } + }, + { + "client": { + "id": "failPointClient", + "useMultipleMongoses": false + } + }, + { + "database": { + "id": "utilDb", + "client": "failPointClient", + "databaseName": "retryable-writes-tests" + } + }, + { + "collection": { + "id": "utilCollection", + "database": "utilDb", + "collectionName": "coll" + } + }, + { + "database": { + "id": "database", + "client": "client", + "databaseName": "retryable-writes-tests" + } + }, + { + "collection": { + "id": "collection", + "database": "database", + "collectionName": "coll" + } + } + ], + "initialData": [ + { + "collectionName": "coll", + "databaseName": "retryable-writes-tests", + "documents": [ + { + "_id": 1, + "x": 11 + }, + { + "_id": 2, + "x": 22 + } + ] + } + ], + "tests": [ + { + "description": "client.listDatabases retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "listDatabases" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "client", + "name": "listDatabases", + "arguments": { + "filter": {} + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandFailedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandFailedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandStartedEvent": { + "commandName": 
"listDatabases" + } + }, + { + "commandFailedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandSucceededEvent": { + "commandName": "listDatabases" + } + } + ] + } + ] + }, + { + "description": "client.listDatabaseNames retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "listDatabases" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "client", + "name": "listDatabaseNames", + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandFailedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandFailedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandFailedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandSucceededEvent": { + "commandName": "listDatabases" + } + } + ] + } + ] + }, + { + "description": "client.createChangeStream retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 
+ } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "aggregate" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "client", + "name": "createChangeStream", + "arguments": { + "pipeline": [] + }, + "saveResultAsEntity": "changeStream", + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandSucceededEvent": { + "commandName": "aggregate" + } + } + ] + } + ] + }, + { + "description": "client.clientBulkWrite retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "bulkWrite" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "client", + "name": "clientBulkWrite", + "arguments": { + "models": [ + { + "insertOne": { + "namespace": "retryable-writes-tests.coll", + "document": { + 
"_id": 8, + "x": 88 + } + } + } + ] + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "bulkWrite" + } + }, + { + "commandFailedEvent": { + "commandName": "bulkWrite" + } + }, + { + "commandStartedEvent": { + "commandName": "bulkWrite" + } + }, + { + "commandFailedEvent": { + "commandName": "bulkWrite" + } + }, + { + "commandStartedEvent": { + "commandName": "bulkWrite" + } + }, + { + "commandFailedEvent": { + "commandName": "bulkWrite" + } + }, + { + "commandStartedEvent": { + "commandName": "bulkWrite" + } + }, + { + "commandSucceededEvent": { + "commandName": "bulkWrite" + } + } + ] + } + ] + }, + { + "description": "database.aggregate retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "aggregate" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "database", + "name": "aggregate", + "arguments": { + "pipeline": [ + { + "$listLocalSessions": {} + }, + { + "$limit": 1 + } + ] + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": 
"aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandSucceededEvent": { + "commandName": "aggregate" + } + } + ] + } + ] + }, + { + "description": "database.listCollections retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "listCollections" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "database", + "name": "listCollections", + "arguments": { + "filter": {} + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandFailedEvent": { + "commandName": "listCollections" + } + }, + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandFailedEvent": { + "commandName": "listCollections" + } + }, + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandFailedEvent": { + "commandName": "listCollections" + } + }, + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandSucceededEvent": { + "commandName": "listCollections" + } + } + ] + } + ] + }, + { + "description": "database.listCollectionNames retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + 
"name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "listCollections" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "database", + "name": "listCollectionNames", + "arguments": { + "filter": {} + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandFailedEvent": { + "commandName": "listCollections" + } + }, + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandFailedEvent": { + "commandName": "listCollections" + } + }, + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandFailedEvent": { + "commandName": "listCollections" + } + }, + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandSucceededEvent": { + "commandName": "listCollections" + } + } + ] + } + ] + }, + { + "description": "database.runCommand retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "ping" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "database", + "name": "runCommand", + "arguments": { + "command": { + "ping": 1 + }, + "commandName": "ping" + }, + "expectError": false + } + ], + "expectEvents": [ + { + 
"client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "ping" + } + }, + { + "commandFailedEvent": { + "commandName": "ping" + } + }, + { + "commandStartedEvent": { + "commandName": "ping" + } + }, + { + "commandFailedEvent": { + "commandName": "ping" + } + }, + { + "commandStartedEvent": { + "commandName": "ping" + } + }, + { + "commandFailedEvent": { + "commandName": "ping" + } + }, + { + "commandStartedEvent": { + "commandName": "ping" + } + }, + { + "commandSucceededEvent": { + "commandName": "ping" + } + } + ] + } + ] + }, + { + "description": "database.createChangeStream retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "aggregate" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "database", + "name": "createChangeStream", + "arguments": { + "pipeline": [] + }, + "saveResultAsEntity": "changeStream", + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandSucceededEvent": { + "commandName": 
"aggregate" + } + } + ] + } + ] + }, + { + "description": "collection.aggregate retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "aggregate" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "aggregate", + "arguments": { + "pipeline": [] + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandSucceededEvent": { + "commandName": "aggregate" + } + } + ] + } + ] + }, + { + "description": "collection.countDocuments retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + 
"failCommands": [ + "aggregate" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "countDocuments", + "arguments": { + "filter": {} + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandSucceededEvent": { + "commandName": "aggregate" + } + } + ] + } + ] + }, + { + "description": "collection.estimatedDocumentCount retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "count" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "estimatedDocumentCount", + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "count" + } + }, + { + "commandFailedEvent": { + "commandName": "count" + } + }, + { + "commandStartedEvent": { + "commandName": "count" + } + }, + { + "commandFailedEvent": { + "commandName": "count" + } + }, + { + 
"commandStartedEvent": { + "commandName": "count" + } + }, + { + "commandFailedEvent": { + "commandName": "count" + } + }, + { + "commandStartedEvent": { + "commandName": "count" + } + }, + { + "commandSucceededEvent": { + "commandName": "count" + } + } + ] + } + ] + }, + { + "description": "collection.distinct retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "distinct" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "distinct", + "arguments": { + "fieldName": "x", + "filter": {} + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "distinct" + } + }, + { + "commandFailedEvent": { + "commandName": "distinct" + } + }, + { + "commandStartedEvent": { + "commandName": "distinct" + } + }, + { + "commandFailedEvent": { + "commandName": "distinct" + } + }, + { + "commandStartedEvent": { + "commandName": "distinct" + } + }, + { + "commandFailedEvent": { + "commandName": "distinct" + } + }, + { + "commandStartedEvent": { + "commandName": "distinct" + } + }, + { + "commandSucceededEvent": { + "commandName": "distinct" + } + } + ] + } + ] + }, + { + "description": "collection.find retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 
11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "find" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "find", + "arguments": { + "filter": {} + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandFailedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandFailedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandFailedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandSucceededEvent": { + "commandName": "find" + } + } + ] + } + ] + }, + { + "description": "collection.findOne retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "find" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "findOne", + "arguments": { + "filter": {} + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandFailedEvent": { + 
"commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandFailedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandFailedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandSucceededEvent": { + "commandName": "find" + } + } + ] + } + ] + }, + { + "description": "collection.listIndexes retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "listIndexes" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "listIndexes", + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandSucceededEvent": { + "commandName": "listIndexes" + } + } + ] + } + ] + }, + { + "description": "collection.listIndexNames retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + 
"arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "listIndexes" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "listIndexNames", + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandSucceededEvent": { + "commandName": "listIndexes" + } + } + ] + } + ] + }, + { + "description": "collection.createChangeStream retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "aggregate" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "createChangeStream", + 
"arguments": { + "pipeline": [] + }, + "saveResultAsEntity": "changeStream", + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandSucceededEvent": { + "commandName": "aggregate" + } + } + ] + } + ] + }, + { + "description": "collection.insertOne retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "insert" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "insertOne", + "arguments": { + "document": { + "_id": 2, + "x": 22 + } + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + 
}, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandSucceededEvent": { + "commandName": "insert" + } + } + ] + } + ] + }, + { + "description": "collection.insertMany retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "insert" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "insertMany", + "arguments": { + "documents": [ + { + "_id": 2, + "x": 22 + } + ] + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandSucceededEvent": { + "commandName": "insert" + } + } + ] + } + ] + }, + { + "description": "collection.deleteOne retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": 
"failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "delete" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "deleteOne", + "arguments": { + "filter": {} + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandFailedEvent": { + "commandName": "delete" + } + }, + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandFailedEvent": { + "commandName": "delete" + } + }, + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandFailedEvent": { + "commandName": "delete" + } + }, + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandSucceededEvent": { + "commandName": "delete" + } + } + ] + } + ] + }, + { + "description": "collection.deleteMany retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "delete" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "deleteMany", + "arguments": { + "filter": {} + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandFailedEvent": { + "commandName": "delete" + } + }, + { + "commandStartedEvent": { + 
"commandName": "delete" + } + }, + { + "commandFailedEvent": { + "commandName": "delete" + } + }, + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandFailedEvent": { + "commandName": "delete" + } + }, + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandSucceededEvent": { + "commandName": "delete" + } + } + ] + } + ] + }, + { + "description": "collection.replaceOne retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "update" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "replaceOne", + "arguments": { + "filter": {}, + "replacement": { + "x": 22 + } + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandSucceededEvent": { + "commandName": "update" + } + } + ] + } + ] + }, + { + "description": "collection.updateOne retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { 
+ "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "update" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "updateOne", + "arguments": { + "filter": {}, + "update": { + "$set": { + "x": 22 + } + } + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandSucceededEvent": { + "commandName": "update" + } + } + ] + } + ] + }, + { + "description": "collection.updateMany retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "update" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "updateMany", + "arguments": { + "filter": {}, + 
"update": { + "$set": { + "x": 22 + } + } + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandSucceededEvent": { + "commandName": "update" + } + } + ] + } + ] + }, + { + "description": "collection.findOneAndDelete retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "findAndModify" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "findOneAndDelete", + "arguments": { + "filter": {} + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + 
"commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandSucceededEvent": { + "commandName": "findAndModify" + } + } + ] + } + ] + }, + { + "description": "collection.findOneAndReplace retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "findAndModify" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "findOneAndReplace", + "arguments": { + "filter": {}, + "replacement": { + "x": 22 + } + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandSucceededEvent": { + "commandName": "findAndModify" + } + } + ] + } + ] + }, + { + "description": "collection.findOneAndUpdate retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + 
"name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "findAndModify" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "findOneAndUpdate", + "arguments": { + "filter": {}, + "update": { + "$set": { + "x": 22 + } + } + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandSucceededEvent": { + "commandName": "findAndModify" + } + } + ] + } + ] + }, + { + "description": "collection.bulkWrite retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "insert" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "bulkWrite", + "arguments": { + "requests": [ + { + "insertOne": { + "document": { + "_id": 2, + "x": 22 + } + } + } + ] 
+ }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandSucceededEvent": { + "commandName": "insert" + } + } + ] + } + ] + }, + { + "description": "collection.createIndex retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "createIndexes" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "createIndex", + "arguments": { + "keys": { + "x": 11 + }, + "name": "x_11" + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "createIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "createIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "createIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "createIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "createIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "createIndexes" + } + }, + { + "commandStartedEvent": { + 
"commandName": "createIndexes" + } + }, + { + "commandSucceededEvent": { + "commandName": "createIndexes" + } + } + ] + } + ] + }, + { + "description": "collection.dropIndex retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + "x": 11 + } + ] + } + }, + { + "object": "utilCollection", + "name": "createIndex", + "arguments": { + "keys": { + "x": 11 + }, + "name": "x_11" + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "dropIndexes" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "dropIndex", + "arguments": { + "name": "x_11" + }, + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandSucceededEvent": { + "commandName": "dropIndexes" + } + } + ] + } + ] + }, + { + "description": "collection.dropIndexes retries using operation loop", + "operations": [ + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "filter": {} + } + }, + { + "object": "utilCollection", + "name": "deleteMany", + "arguments": { + "documents": [ + { + "_id": 1, + 
"x": 11 + } + ] + } + }, + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "dropIndexes" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "dropIndexes", + "expectError": false + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandSucceededEvent": { + "commandName": "dropIndexes" + } + } + ] + } + ] + } + ] +} diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml b/source/client-backpressure/tests/backpressure-retry-loop.yml new file mode 100644 index 0000000000..ec2fa25b8f --- /dev/null +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml @@ -0,0 +1,1860 @@ +# Tests in this file are generated from backpressure-retry-loop.yml.template. 
+ +description: tests that operations respect overload backoff retry loop + +schemaVersion: '1.0' + +runOnRequirements: + - + minServerVersion: '4.4' # failCommand + topologies: [replicaset, sharded, load-balanced] + +createEntities: + - + client: + id: &client client + useMultipleMongoses: false + observeEvents: [ 'commandStartedEvent', 'commandSucceededEvent', 'commandFailedEvent' ] + + - + client: + id: &failPointClient failPointClient + useMultipleMongoses: false + + - + database: + id: &utilDb utilDb + client: *failPointClient + databaseName: &database_name retryable-writes-tests + + - + collection: + id: &utilCollection utilCollection + database: *utilDb + collectionName: &collection_name coll + + - + database: + id: &database database + client: *client + databaseName: &database_name retryable-writes-tests + - + collection: + id: &collection collection + database: *database + collectionName: &collection_name coll + +initialData: + - + collectionName: *collection_name + databaseName: *database_name + documents: + - { _id: 1, x: 11 } + - { _id: 2, x: 22 } + +tests: + + - + description: 'client.listDatabases retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [listDatabases] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *client + name: listDatabases + arguments: + filter: {} + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: listDatabases + - commandFailedEvent: + commandName: listDatabases + - commandStartedEvent: + commandName: listDatabases + - commandFailedEvent: + commandName: listDatabases + - commandStartedEvent: + 
commandName: listDatabases + - commandFailedEvent: + commandName: listDatabases + - commandStartedEvent: + commandName: listDatabases + - commandSucceededEvent: + commandName: listDatabases + + - + description: 'client.listDatabaseNames retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [listDatabases] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *client + name: listDatabaseNames + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: listDatabases + - commandFailedEvent: + commandName: listDatabases + - commandStartedEvent: + commandName: listDatabases + - commandFailedEvent: + commandName: listDatabases + - commandStartedEvent: + commandName: listDatabases + - commandFailedEvent: + commandName: listDatabases + - commandStartedEvent: + commandName: listDatabases + - commandSucceededEvent: + commandName: listDatabases + + - + description: 'client.createChangeStream retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [aggregate] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *client + name: createChangeStream + arguments: + pipeline: [] + saveResultAsEntity: changeStream + expectError: false + + expectEvents: + - client: "client" + events: + - 
commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandSucceededEvent: + commandName: aggregate + + - + description: 'client.clientBulkWrite retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [bulkWrite] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *client + name: clientBulkWrite + arguments: + models: + - insertOne: + namespace: retryable-writes-tests.coll + document: { _id: 8, x: 88 } + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: bulkWrite + - commandFailedEvent: + commandName: bulkWrite + - commandStartedEvent: + commandName: bulkWrite + - commandFailedEvent: + commandName: bulkWrite + - commandStartedEvent: + commandName: bulkWrite + - commandFailedEvent: + commandName: bulkWrite + - commandStartedEvent: + commandName: bulkWrite + - commandSucceededEvent: + commandName: bulkWrite + + - + description: 'database.aggregate retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [aggregate] + 
errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *database + name: aggregate + arguments: + pipeline: [ { $listLocalSessions: {} }, { $limit: 1 } ] + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandSucceededEvent: + commandName: aggregate + + - + description: 'database.listCollections retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [listCollections] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *database + name: listCollections + arguments: + filter: {} + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: listCollections + - commandFailedEvent: + commandName: listCollections + - commandStartedEvent: + commandName: listCollections + - commandFailedEvent: + commandName: listCollections + - commandStartedEvent: + commandName: listCollections + - commandFailedEvent: + commandName: listCollections + - commandStartedEvent: + commandName: listCollections + - commandSucceededEvent: + commandName: listCollections + + - + description: 'database.listCollectionNames retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: 
deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [listCollections] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *database + name: listCollectionNames + arguments: + filter: {} + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: listCollections + - commandFailedEvent: + commandName: listCollections + - commandStartedEvent: + commandName: listCollections + - commandFailedEvent: + commandName: listCollections + - commandStartedEvent: + commandName: listCollections + - commandFailedEvent: + commandName: listCollections + - commandStartedEvent: + commandName: listCollections + - commandSucceededEvent: + commandName: listCollections + + - + description: 'database.runCommand retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [ping] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *database + name: runCommand + arguments: + command: { ping: 1 } + commandName: ping + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: ping + - commandFailedEvent: + commandName: ping + - commandStartedEvent: + commandName: ping + - commandFailedEvent: + commandName: ping + - commandStartedEvent: + commandName: ping + - commandFailedEvent: + commandName: ping + - commandStartedEvent: + commandName: ping + - commandSucceededEvent: + commandName: ping + + - + description: 
'database.createChangeStream retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [aggregate] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *database + name: createChangeStream + arguments: + pipeline: [] + saveResultAsEntity: changeStream + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandSucceededEvent: + commandName: aggregate + + - + description: 'collection.aggregate retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [aggregate] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: aggregate + arguments: + pipeline: [] + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + 
commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandSucceededEvent: + commandName: aggregate + + - + description: 'collection.countDocuments retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [aggregate] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: countDocuments + arguments: + filter: {} + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandSucceededEvent: + commandName: aggregate + + - + description: 'collection.estimatedDocumentCount retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [count] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: estimatedDocumentCount + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: count + - commandFailedEvent: + commandName: 
count + - commandStartedEvent: + commandName: count + - commandFailedEvent: + commandName: count + - commandStartedEvent: + commandName: count + - commandFailedEvent: + commandName: count + - commandStartedEvent: + commandName: count + - commandSucceededEvent: + commandName: count + + - + description: 'collection.distinct retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [distinct] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: distinct + arguments: + fieldName: x + filter: {} + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: distinct + - commandFailedEvent: + commandName: distinct + - commandStartedEvent: + commandName: distinct + - commandFailedEvent: + commandName: distinct + - commandStartedEvent: + commandName: distinct + - commandFailedEvent: + commandName: distinct + - commandStartedEvent: + commandName: distinct + - commandSucceededEvent: + commandName: distinct + + - + description: 'collection.find retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [find] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: find + arguments: + filter: {} + expectError: false + + expectEvents: + - client: "client" + 
events: + - commandStartedEvent: + commandName: find + - commandFailedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandFailedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandFailedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandSucceededEvent: + commandName: find + + - + description: 'collection.findOne retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [find] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: findOne + arguments: + filter: {} + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: find + - commandFailedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandFailedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandFailedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandSucceededEvent: + commandName: find + + - + description: 'collection.listIndexes retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [listIndexes] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: listIndexes + expectError: false + + expectEvents: + 
- client: "client" + events: + - commandStartedEvent: + commandName: listIndexes + - commandFailedEvent: + commandName: listIndexes + - commandStartedEvent: + commandName: listIndexes + - commandFailedEvent: + commandName: listIndexes + - commandStartedEvent: + commandName: listIndexes + - commandFailedEvent: + commandName: listIndexes + - commandStartedEvent: + commandName: listIndexes + - commandSucceededEvent: + commandName: listIndexes + + - + description: 'collection.listIndexNames retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [listIndexes] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: listIndexNames + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: listIndexes + - commandFailedEvent: + commandName: listIndexes + - commandStartedEvent: + commandName: listIndexes + - commandFailedEvent: + commandName: listIndexes + - commandStartedEvent: + commandName: listIndexes + - commandFailedEvent: + commandName: listIndexes + - commandStartedEvent: + commandName: listIndexes + - commandSucceededEvent: + commandName: listIndexes + + - + description: 'collection.createChangeStream retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [aggregate] + errorLabels: 
["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: createChangeStream + arguments: + pipeline: [] + saveResultAsEntity: changeStream + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandSucceededEvent: + commandName: aggregate + + - + description: 'collection.insertOne retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [insert] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: insertOne + arguments: + document: { _id: 2, x: 22 } + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandSucceededEvent: + commandName: insert + + - + description: 'collection.insertMany retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: 
testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [insert] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: insertMany + arguments: + documents: + - { _id: 2, x: 22 } + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandSucceededEvent: + commandName: insert + + - + description: 'collection.deleteOne retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [delete] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: deleteOne + arguments: + filter: {} + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: delete + - commandFailedEvent: + commandName: delete + - commandStartedEvent: + commandName: delete + - commandFailedEvent: + commandName: delete + - commandStartedEvent: + commandName: delete + - commandFailedEvent: + commandName: delete + - commandStartedEvent: + commandName: delete + - commandSucceededEvent: + commandName: delete + + - + description: 'collection.deleteMany retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection 
+ name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [delete] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: deleteMany + arguments: + filter: {} + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: delete + - commandFailedEvent: + commandName: delete + - commandStartedEvent: + commandName: delete + - commandFailedEvent: + commandName: delete + - commandStartedEvent: + commandName: delete + - commandFailedEvent: + commandName: delete + - commandStartedEvent: + commandName: delete + - commandSucceededEvent: + commandName: delete + + - + description: 'collection.replaceOne retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [update] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: replaceOne + arguments: + filter: {} + replacement: { x: 22 } + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandSucceededEvent: + commandName: update + + - + description: 'collection.updateOne retries using operation loop' + operations: + 
- + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [update] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: updateOne + arguments: + filter: {} + update: { $set: { x: 22 } } + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandSucceededEvent: + commandName: update + + - + description: 'collection.updateMany retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [update] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: updateMany + arguments: + filter: {} + update: { $set: { x: 22 } } + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update 
+ - commandSucceededEvent: + commandName: update + + - + description: 'collection.findOneAndDelete retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [findAndModify] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: findOneAndDelete + arguments: + filter: {} + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandSucceededEvent: + commandName: findAndModify + + - + description: 'collection.findOneAndReplace retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [findAndModify] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: findOneAndReplace + arguments: + filter: {} + replacement: { x: 22 } + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - 
commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandSucceededEvent: + commandName: findAndModify + + - + description: 'collection.findOneAndUpdate retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [findAndModify] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: findOneAndUpdate + arguments: + filter: {} + update: { $set: { x: 22 } } + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandSucceededEvent: + commandName: findAndModify + + - + description: 'collection.bulkWrite retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [insert] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + 
object: *collection + name: bulkWrite + arguments: + requests: + - insertOne: + document: { _id: 2, x: 22 } + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandSucceededEvent: + commandName: insert + + - + description: 'collection.createIndex retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [createIndexes] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: createIndex + arguments: + keys: { x: 11 } + name: "x_11" + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: createIndexes + - commandFailedEvent: + commandName: createIndexes + - commandStartedEvent: + commandName: createIndexes + - commandFailedEvent: + commandName: createIndexes + - commandStartedEvent: + commandName: createIndexes + - commandFailedEvent: + commandName: createIndexes + - commandStartedEvent: + commandName: createIndexes + - commandSucceededEvent: + commandName: createIndexes + + - + description: 'collection.dropIndex retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + - + object: *utilCollection + name: createIndex + 
arguments: + keys: { x: 11 } + name: "x_11" + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [dropIndexes] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: dropIndex + arguments: + name: "x_11" + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: dropIndexes + - commandFailedEvent: + commandName: dropIndexes + - commandStartedEvent: + commandName: dropIndexes + - commandFailedEvent: + commandName: dropIndexes + - commandStartedEvent: + commandName: dropIndexes + - commandFailedEvent: + commandName: dropIndexes + - commandStartedEvent: + commandName: dropIndexes + - commandSucceededEvent: + commandName: dropIndexes + + - + description: 'collection.dropIndexes retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [dropIndexes] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: dropIndexes + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: dropIndexes + - commandFailedEvent: + commandName: dropIndexes + - commandStartedEvent: + commandName: dropIndexes + - commandFailedEvent: + commandName: dropIndexes + - commandStartedEvent: + commandName: dropIndexes + - commandFailedEvent: + commandName: dropIndexes + - commandStartedEvent: + commandName: dropIndexes + - commandSucceededEvent: + commandName: dropIndexes diff --git 
a/source/client-backpressure/tests/backpressure-retry-loop.yml.template b/source/client-backpressure/tests/backpressure-retry-loop.yml.template new file mode 100644 index 0000000000..049cbbac3a --- /dev/null +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml.template @@ -0,0 +1,128 @@ +# Tests in this file are generated from backpressure-retry-loop.yml.template. + +description: tests that operations respect overload backoff retry loop + +schemaVersion: '1.0' + +runOnRequirements: + - + minServerVersion: '4.4' # failCommand + topologies: [replicaset, sharded, load-balanced] + +createEntities: + - + client: + id: &client client + useMultipleMongoses: false + observeEvents: [ 'commandStartedEvent', 'commandSucceededEvent', 'commandFailedEvent' ] + + - + client: + id: &failPointClient failPointClient + useMultipleMongoses: false + + - + database: + id: &utilDb utilDb + client: *failPointClient + databaseName: &database_name retryable-writes-tests + + - + collection: + id: &utilCollection utilCollection + database: *utilDb + collectionName: &collection_name coll + + - + database: + id: &database database + client: *client + databaseName: &database_name retryable-writes-tests + - + collection: + id: &collection collection + database: *database + collectionName: &collection_name coll + +initialData: + - + collectionName: *collection_name + databaseName: *database_name + documents: + - { _id: 1, x: 11 } + - { _id: 2, x: 22 } + +tests: +{% for operation in operations %} + - + description: '{{operation.object}}.{{operation.operation_name}} retries using operation loop' + operations: + - + object: *utilCollection + name: deleteMany + arguments: + filter: {} + + - + object: *utilCollection + name: deleteMany + arguments: + documents: + - { _id: 1, x: 11 } + + {%- if operation.operation_name == "dropIndex" %} + - + object: *utilCollection + name: createIndex + arguments: + keys: { x: 11 } + name: "x_11" + {%- endif %} + + + - name: failPoint + object: testRunner 
+ arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [{{operation.command_name}}] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *{{operation.object}} + name: {{operation.operation_name}} + {%- if operation.arguments|length > 0 %} + arguments: + {%- for arg in operation.arguments %} + {{arg}} + {%- endfor -%} + {%- endif %} + {%- if operation.operation_name == "createChangeStream" %} + saveResultAsEntity: changeStream + {%- endif %} + expectError: false + + expectEvents: + - client: "client" + events: + - commandStartedEvent: + commandName: {{operation.command_name}} + - commandFailedEvent: + commandName: {{operation.command_name}} + - commandStartedEvent: + commandName: {{operation.command_name}} + - commandFailedEvent: + commandName: {{operation.command_name}} + - commandStartedEvent: + commandName: {{operation.command_name}} + - commandFailedEvent: + commandName: {{operation.command_name}} + - commandStartedEvent: + commandName: {{operation.command_name}} + - commandSucceededEvent: + commandName: {{operation.command_name}} +{% endfor -%} diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.json b/source/client-backpressure/tests/backpressure-retry-max-attempts.json new file mode 100644 index 0000000000..36afc55cc5 --- /dev/null +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.json @@ -0,0 +1,1262 @@ +{ + "description": "tests that operations retry at most maxAttempts=5 times", + "schemaVersion": "1.0", + "runOnRequirements": [ + { + "minServerVersion": "4.4", + "topologies": [ + "replicaset", + "sharded", + "load-balanced" + ] + } + ], + "createEntities": [ + { + "client": { + "id": "client", + "useMultipleMongoses": false + } + }, + { + "client": { + "id": "failPointClient", + "useMultipleMongoses": false + } + }, + { + "database": { + "id": "database", + "client": "client", + "databaseName": 
"retryable-writes-tests" + } + }, + { + "collection": { + "id": "collection", + "database": "database", + "collectionName": "coll" + } + } + ], + "initialData": [ + { + "collectionName": "coll", + "databaseName": "retryable-writes-tests", + "documents": [ + { + "_id": 1, + "x": 11 + }, + { + "_id": 2, + "x": 22 + } + ] + } + ], + "tests": [ + { + "description": "client.listDatabases retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "listDatabases" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "client", + "name": "listDatabases", + "arguments": { + "filter": {} + }, + "expectError": true + } + ] + }, + { + "description": "client.listDatabaseNames retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "listDatabases" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "client", + "name": "listDatabaseNames", + "expectError": true + } + ] + }, + { + "description": "client.createChangeStream retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "aggregate" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "client", + "name": "createChangeStream", 
+ "arguments": { + "pipeline": [] + }, + "saveResultAsEntity": "changeStream", + "expectError": true + } + ] + }, + { + "description": "client.clientBulkWrite retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "bulkWrite" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "client", + "name": "clientBulkWrite", + "arguments": { + "models": [ + { + "insertOne": { + "namespace": "retryable-writes-tests.coll", + "document": { + "_id": 8, + "x": 88 + } + } + } + ] + }, + "expectError": true + } + ] + }, + { + "description": "database.aggregate retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "aggregate" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "database", + "name": "aggregate", + "arguments": { + "pipeline": [ + { + "$listLocalSessions": {} + }, + { + "$limit": 1 + } + ] + }, + "expectError": true + } + ] + }, + { + "description": "database.listCollections retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "listCollections" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "database", + "name": "listCollections", + "arguments": { + 
"filter": {} + }, + "expectError": true + } + ] + }, + { + "description": "database.listCollectionNames retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "listCollections" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "database", + "name": "listCollectionNames", + "arguments": { + "filter": {} + }, + "expectError": true + } + ] + }, + { + "description": "database.runCommand retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "ping" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "database", + "name": "runCommand", + "arguments": { + "command": { + "ping": 1 + }, + "commandName": "ping" + }, + "expectError": true + } + ] + }, + { + "description": "database.createChangeStream retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "aggregate" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "database", + "name": "createChangeStream", + "arguments": { + "pipeline": [] + }, + "saveResultAsEntity": "changeStream", + "expectError": true + } + ] + }, + { + "description": "collection.aggregate retries at most maxAttempts times (maxAttempts=5)", 
+ "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "aggregate" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "aggregate", + "arguments": { + "pipeline": [] + }, + "expectError": true + } + ] + }, + { + "description": "collection.countDocuments retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "aggregate" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "countDocuments", + "arguments": { + "filter": {} + }, + "expectError": true + } + ] + }, + { + "description": "collection.estimatedDocumentCount retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "count" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "estimatedDocumentCount", + "expectError": true + } + ] + }, + { + "description": "collection.distinct retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "distinct" + ], + 
"errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "distinct", + "arguments": { + "fieldName": "x", + "filter": {} + }, + "expectError": true + } + ] + }, + { + "description": "collection.find retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "find" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "find", + "arguments": { + "filter": {} + }, + "expectError": true + } + ] + }, + { + "description": "collection.findOne retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "find" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "findOne", + "arguments": { + "filter": {} + }, + "expectError": true + } + ] + }, + { + "description": "collection.listIndexes retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "listIndexes" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "listIndexes", + "expectError": true + } + ] + }, + { + "description": "collection.listIndexNames retries at most 
maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "listIndexes" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "listIndexNames", + "expectError": true + } + ] + }, + { + "description": "collection.createChangeStream retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "aggregate" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "createChangeStream", + "arguments": { + "pipeline": [] + }, + "saveResultAsEntity": "changeStream", + "expectError": true + } + ] + }, + { + "description": "collection.insertOne retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "insert" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "insertOne", + "arguments": { + "document": { + "_id": 2, + "x": 22 + } + }, + "expectError": true + } + ] + }, + { + "description": "collection.insertMany retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": 
"failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "insert" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "insertMany", + "arguments": { + "documents": [ + { + "_id": 2, + "x": 22 + } + ] + }, + "expectError": true + } + ] + }, + { + "description": "collection.deleteOne retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "delete" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "deleteOne", + "arguments": { + "filter": {} + }, + "expectError": true + } + ] + }, + { + "description": "collection.deleteMany retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "delete" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "deleteMany", + "arguments": { + "filter": {} + }, + "expectError": true + } + ] + }, + { + "description": "collection.replaceOne retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "update" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + 
"name": "replaceOne", + "arguments": { + "filter": {}, + "replacement": { + "x": 22 + } + }, + "expectError": true + } + ] + }, + { + "description": "collection.updateOne retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "update" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "updateOne", + "arguments": { + "filter": {}, + "update": { + "$set": { + "x": 22 + } + } + }, + "expectError": true + } + ] + }, + { + "description": "collection.updateMany retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "update" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "updateMany", + "arguments": { + "filter": {}, + "update": { + "$set": { + "x": 22 + } + } + }, + "expectError": true + } + ] + }, + { + "description": "collection.findOneAndDelete retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "findAndModify" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "findOneAndDelete", + "arguments": { + "filter": {} + }, + "expectError": true + } + ] + }, + { + "description": 
"collection.findOneAndReplace retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "findAndModify" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "findOneAndReplace", + "arguments": { + "filter": {}, + "replacement": { + "x": 22 + } + }, + "expectError": true + } + ] + }, + { + "description": "collection.findOneAndUpdate retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "findAndModify" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "findOneAndUpdate", + "arguments": { + "filter": {}, + "update": { + "$set": { + "x": 22 + } + } + }, + "expectError": true + } + ] + }, + { + "description": "collection.bulkWrite retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "insert" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "bulkWrite", + "arguments": { + "requests": [ + { + "insertOne": { + "document": { + "_id": 2, + "x": 22 + } + } + } + ] + }, + "expectError": true + } + ] + }, + { + "description": "collection.createIndex retries at most maxAttempts times 
(maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "createIndexes" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "createIndex", + "arguments": { + "keys": { + "x": 11 + }, + "name": "x_11" + }, + "expectError": true + } + ] + }, + { + "description": "collection.dropIndex retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "dropIndexes" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "dropIndex", + "arguments": { + "name": "x_11" + }, + "expectError": true + } + ] + }, + { + "description": "collection.dropIndexes retries at most maxAttempts times (maxAttempts=5)", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 6 + }, + "data": { + "failCommands": [ + "dropIndexes" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "object": "collection", + "name": "dropIndexes", + "expectError": true + } + ] + } + ] +} diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml new file mode 100644 index 0000000000..998b48c78e --- /dev/null +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml @@ -0,0 +1,753 @@ +# Tests in 
this file are generated from backpressure-retry-loop.yml.template. + +description: tests that operations retry at most maxAttempts=5 times + +schemaVersion: '1.0' + +runOnRequirements: + - + minServerVersion: '4.4' # failCommand + topologies: [replicaset, sharded, load-balanced] + +createEntities: + - + client: + id: &client client + useMultipleMongoses: false + + - + client: + id: &failPointClient failPointClient + useMultipleMongoses: false + + - + database: + id: &database database + client: *client + databaseName: &database_name retryable-writes-tests + - + collection: + id: &collection collection + database: *database + collectionName: &collection_name coll + +initialData: + - + collectionName: *collection_name + databaseName: *database_name + documents: + - { _id: 1, x: 11 } + - { _id: 2, x: 22 } + +tests: + + - + description: 'client.listDatabases retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [listDatabases] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *client + name: listDatabases + arguments: + filter: {} + expectError: true + + - + description: 'client.listDatabaseNames retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [listDatabases] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *client + name: listDatabaseNames + expectError: true + + - + description: 'client.createChangeStream retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + 
data: + failCommands: [aggregate] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *client + name: createChangeStream + arguments: + pipeline: [] + saveResultAsEntity: changeStream + expectError: true + + - + description: 'client.clientBulkWrite retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [bulkWrite] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *client + name: clientBulkWrite + arguments: + models: + - insertOne: + namespace: retryable-writes-tests.coll + document: { _id: 8, x: 88 } + expectError: true + + - + description: 'database.aggregate retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [aggregate] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *database + name: aggregate + arguments: + pipeline: [ { $listLocalSessions: {} }, { $limit: 1 } ] + expectError: true + + - + description: 'database.listCollections retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [listCollections] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *database + name: listCollections + arguments: + filter: {} + expectError: true + + - + description: 'database.listCollectionNames retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: 
failCommand + mode: { times: 6 } + data: + failCommands: [listCollections] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *database + name: listCollectionNames + arguments: + filter: {} + expectError: true + + - + description: 'database.runCommand retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [ping] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *database + name: runCommand + arguments: + command: { ping: 1 } + commandName: ping + expectError: true + + - + description: 'database.createChangeStream retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [aggregate] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *database + name: createChangeStream + arguments: + pipeline: [] + saveResultAsEntity: changeStream + expectError: true + + - + description: 'collection.aggregate retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [aggregate] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: aggregate + arguments: + pipeline: [] + expectError: true + + - + description: 'collection.countDocuments retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [aggregate] 
+ errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: countDocuments + arguments: + filter: {} + expectError: true + + - + description: 'collection.estimatedDocumentCount retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [count] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: estimatedDocumentCount + expectError: true + + - + description: 'collection.distinct retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [distinct] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: distinct + arguments: + fieldName: x + filter: {} + expectError: true + + - + description: 'collection.find retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [find] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: find + arguments: + filter: {} + expectError: true + + - + description: 'collection.findOne retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [find] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: findOne + arguments: + filter: {} + expectError: true + + - + 
description: 'collection.listIndexes retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [listIndexes] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: listIndexes + expectError: true + + - + description: 'collection.listIndexNames retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [listIndexes] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: listIndexNames + expectError: true + + - + description: 'collection.createChangeStream retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [aggregate] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: createChangeStream + arguments: + pipeline: [] + saveResultAsEntity: changeStream + expectError: true + + - + description: 'collection.insertOne retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [insert] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: insertOne + arguments: + document: { _id: 2, x: 22 } + expectError: true + + - + description: 'collection.insertMany retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + 
object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [insert] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: insertMany + arguments: + documents: + - { _id: 2, x: 22 } + expectError: true + + - + description: 'collection.deleteOne retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [delete] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: deleteOne + arguments: + filter: {} + expectError: true + + - + description: 'collection.deleteMany retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [delete] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: deleteMany + arguments: + filter: {} + expectError: true + + - + description: 'collection.replaceOne retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [update] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: replaceOne + arguments: + filter: {} + replacement: { x: 22 } + expectError: true + + - + description: 'collection.updateOne retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 
6 } + data: + failCommands: [update] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: updateOne + arguments: + filter: {} + update: { $set: { x: 22 } } + expectError: true + + - + description: 'collection.updateMany retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [update] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: updateMany + arguments: + filter: {} + update: { $set: { x: 22 } } + expectError: true + + - + description: 'collection.findOneAndDelete retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [findAndModify] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: findOneAndDelete + arguments: + filter: {} + expectError: true + + - + description: 'collection.findOneAndReplace retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [findAndModify] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: findOneAndReplace + arguments: + filter: {} + replacement: { x: 22 } + expectError: true + + - + description: 'collection.findOneAndUpdate retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: 
[findAndModify] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: findOneAndUpdate + arguments: + filter: {} + update: { $set: { x: 22 } } + expectError: true + + - + description: 'collection.bulkWrite retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [insert] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: bulkWrite + arguments: + requests: + - insertOne: + document: { _id: 2, x: 22 } + expectError: true + + - + description: 'collection.createIndex retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [createIndexes] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: createIndex + arguments: + keys: { x: 11 } + name: "x_11" + expectError: true + + - + description: 'collection.dropIndex retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [dropIndexes] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: dropIndex + arguments: + name: "x_11" + expectError: true + + - + description: 'collection.dropIndexes retries at most maxAttempts times (maxAttempts=5)' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [dropIndexes] + errorLabels: ["RetryableError", 
"SystemOverloadedError"] + errorCode: 2 + + - + object: *collection + name: dropIndexes + expectError: true diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template new file mode 100644 index 0000000000..bf089211fd --- /dev/null +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template @@ -0,0 +1,72 @@ +# Tests in this file are generated from backpressure-retry-max-attempts.yml.template. + +description: tests that operations retry at most maxAttempts=5 times + +schemaVersion: '1.0' + +runOnRequirements: + - + minServerVersion: '4.4' # failCommand + topologies: [replicaset, sharded, load-balanced] + +createEntities: + - + client: + id: &client client + useMultipleMongoses: false + + - + client: + id: &failPointClient failPointClient + useMultipleMongoses: false + + - + database: + id: &database database + client: *client + databaseName: &database_name retryable-writes-tests + - + collection: + id: &collection collection + database: *database + collectionName: &collection_name coll + +initialData: + - + collectionName: *collection_name + databaseName: *database_name + documents: + - { _id: 1, x: 11 } + - { _id: 2, x: 22 } + +tests: +{% for operation in operations %} + - + description: '{{operation.object}}.{{operation.operation_name}} retries at most maxAttempts=5 times' + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 6 } + data: + failCommands: [{{operation.command_name}}] + errorLabels: ["RetryableError", "SystemOverloadedError"] + errorCode: 2 + + - + object: *{{operation.object}} + name: {{operation.operation_name}} + {%- if operation.arguments|length > 0 %} + arguments: + {%- for arg in operation.arguments %} + {{arg}} + {%- endfor -%} + {%- endif %} + {%- if operation.operation_name == "createChangeStream" %} + 
saveResultAsEntity: changeStream + {%- endif %} + expectError: true +{% endfor -%} diff --git a/source/etc/generate-backpressure-retryability-tests.py b/source/etc/generate-backpressure-retryability-tests.py new file mode 100644 index 0000000000..305cfa585d --- /dev/null +++ b/source/etc/generate-backpressure-retryability-tests.py @@ -0,0 +1,125 @@ +from collections import namedtuple +from jinja2 import Template +import os +import sys + +Operation = namedtuple( + 'Operation', ['operation_name', 'command_name', 'object', 'arguments']) + +CLIENT_BULK_WRITE_ARGUMENTS = '''models: + - insertOne: + namespace: retryable-writes-tests.coll + document: { _id: 8, x: 88 }''' + +CLIENT_OPERATIONS = [ + Operation('listDatabases', 'listDatabases', 'client', ['filter: {}']), + Operation('listDatabaseNames', 'listDatabases', 'client', []), + Operation('createChangeStream', 'aggregate', 'client', ['pipeline: []']), + Operation('clientBulkWrite', 'bulkWrite', 'client', [CLIENT_BULK_WRITE_ARGUMENTS]) +] + +RUN_COMMAND_ARGUMENTS = '''command: { ping: 1 } + commandName: ping''' + +DB_OPERATIONS = [ + Operation('aggregate', 'aggregate', 'database', [ + 'pipeline: [ { $listLocalSessions: {} }, { $limit: 1 } ]']), + Operation('listCollections', 'listCollections', + 'database', ['filter: {}']), + Operation('listCollectionNames', 'listCollections', + 'database', ['filter: {}']), # Optional. + Operation('runCommand', 'ping', 'database', [RUN_COMMAND_ARGUMENTS]), + Operation('createChangeStream', 'aggregate', 'database', ['pipeline: []']) +] + +INSERT_MANY_ARGUMENTS = '''documents: + - { _id: 2, x: 22 }''' + +BULK_WRITE_ARGUMENTS = '''requests: + - insertOne: + document: { _id: 2, x: 22 }''' + +COLLECTION_READ_OPERATIONS = [ + Operation('aggregate', 'aggregate', 'collection', ['pipeline: []']), + # Operation('count', 'count', 'collection', ['filter: {}']), # Deprecated. 
+ Operation('countDocuments', 'aggregate', 'collection', ['filter: {}']), + Operation('estimatedDocumentCount', 'count', 'collection', []), + Operation('distinct', 'distinct', 'collection', + ['fieldName: x', 'filter: {}']), + Operation('find', 'find', 'collection', ['filter: {}']), + Operation('findOne', 'find', 'collection', ['filter: {}']), # Optional. + Operation('listIndexes', 'listIndexes', 'collection', []), + Operation('listIndexNames', 'listIndexes', 'collection', []), # Optional. + Operation('createChangeStream', 'aggregate', + 'collection', ['pipeline: []']), +] + +COLLECTION_WRITE_OPERATIONS = [ + Operation('insertOne', 'insert', 'collection', + ['document: { _id: 2, x: 22 }']), + Operation('insertMany', 'insert', 'collection', [INSERT_MANY_ARGUMENTS]), + Operation('deleteOne', 'delete', 'collection', ['filter: {}']), + Operation('deleteMany', 'delete', 'collection', ['filter: {}']), + Operation('replaceOne', 'update', 'collection', [ + 'filter: {}', 'replacement: { x: 22 }']), + Operation('updateOne', 'update', 'collection', [ + 'filter: {}', 'update: { $set: { x: 22 } }']), + Operation('updateMany', 'update', 'collection', [ + 'filter: {}', 'update: { $set: { x: 22 } }']), + Operation('findOneAndDelete', 'findAndModify', + 'collection', ['filter: {}']), + Operation('findOneAndReplace', 'findAndModify', 'collection', + ['filter: {}', 'replacement: { x: 22 }']), + Operation('findOneAndUpdate', 'findAndModify', 'collection', + ['filter: {}', 'update: { $set: { x: 22 } }']), + Operation('bulkWrite', 'insert', 'collection', [BULK_WRITE_ARGUMENTS]), + Operation('createIndex', 'createIndexes', 'collection', + ['keys: { x: 11 }', 'name: "x_11"']), + Operation('dropIndex', 'dropIndexes', 'collection', ['name: "x_11"']), + Operation('dropIndexes', 'dropIndexes', 'collection', []), +] + +COLLECTION_OPERATIONS = COLLECTION_READ_OPERATIONS + COLLECTION_WRITE_OPERATIONS + +# Session and GridFS operations are generally tested in other files, so they're not included 
in the list of all +# operations. Individual generation functions can choose to include them if needed. +OPERATIONS = CLIENT_OPERATIONS + DB_OPERATIONS + COLLECTION_OPERATIONS + +# ./source/etc +DIR = os.path.dirname(os.path.realpath(__file__)) + + +def get_template(file, templates_dir): + path = f'{templates_dir}/{file}.yml.template' + return Template(open(path, 'r').read()) + + +def write_yaml(file, template, tests_dir, injections): + rendered = template.render(**injections) + path = f'{tests_dir}/{file}.yml' + open(path, 'w').write(rendered) + + +def generate(name, templates_dir, tests_dir, operations): + template = get_template(name, templates_dir) + injections = { + 'operations': operations, + } + write_yaml(name, template, tests_dir, injections) + + +def generate_retry_loop_tests(): + templates_dir = f'{os.path.dirname(DIR)}/client-backpressure/tests' + tests_dir = f'{os.path.dirname(DIR)}/client-backpressure/tests' + generate('backpressure-retry-loop', templates_dir, + tests_dir, OPERATIONS) + + +def generate_max_attempts_tests(): + templates_dir = f'{os.path.dirname(DIR)}/client-backpressure/tests' + tests_dir = f'{os.path.dirname(DIR)}/client-backpressure/tests' + generate('backpressure-retry-max-attempts', templates_dir, + tests_dir, OPERATIONS) + +generate_retry_loop_tests() +generate_max_attempts_tests() \ No newline at end of file From 072b45371972be8d136753ae311a039e3c42d9f0 Mon Sep 17 00:00:00 2001 From: bailey Date: Mon, 1 Dec 2025 16:39:22 -0700 Subject: [PATCH 05/55] test fixes and add prose test - add prose test - add assertions on the number of retries for maxAttempts tests - don't run clientBulkWrite tests on <8.0 servers --- source/client-backpressure/tests/README.md | 46 + .../tests/backpressure-retry-loop.json | 3 + .../tests/backpressure-retry-loop.yml | 2 + .../backpressure-retry-loop.yml.template | 4 + .../backpressure-retry-max-attempts.json | 2408 +++++++++++++++-- .../tests/backpressure-retry-max-attempts.yml | 1125 +++++++- 
...ckpressure-retry-max-attempts.yml.template | 38 +- 7 files changed, 3400 insertions(+), 226 deletions(-) diff --git a/source/client-backpressure/tests/README.md b/source/client-backpressure/tests/README.md index 7a70e4ad76..a4e62b9ec1 100644 --- a/source/client-backpressure/tests/README.md +++ b/source/client-backpressure/tests/README.md @@ -10,6 +10,52 @@ retryable reads. These tests utilize the [Unified Test Format](../../unified-tes Several prose tests, which are not easily expressed in YAML, are also presented in this file. Those tests will need to be manually implemented by each driver. +### Prose Tests + +#### Test 1: Operation Retry Uses Exponential Backoff + +Drivers should test that retries do not occur immediately when a SystemOverloadedError is encountered. + +1. Let `client` be a `MongoClient`. +2. Let `collection` be a collection. +3. Run the operation without backoff, then with backoff: + 1. Configure the random number generator used for jitter to always return `0` -- this effectively disables backoff. + + 2. Configure the following failPoint: + + ```javascript + { + configureFailPoint: 'failCommand', + mode: 'alwaysOn', + data: { + failCommands: ['insert'], + errorCode: 2, + errorLabels: ['SystemOverloadedError', 'RetryableError'] + } + } + ``` + + 3. Execute the following command. Expect that the command errors. Measure the duration of the command execution. + + ```javascript + const start = performance.now(); + expect( + await collection.insertOne({ a: 1 }).catch(e => e) + ).to.be.an.instanceof(MongoServerError); + const end = performance.now(); + ``` + + 4. Configure the random number generator used for jitter to always return `1`. + + 5. Execute step 3 again and measure its duration. + + 6. Compare the durations of the two runs. + ```python + assertTrue(absolute_value(with_backoff_time - (no_backoff_time + 3.1 seconds)) < 1) + ``` + The sum of 5 backoffs is 3.1 seconds. There is a 1-second window to account for potential variance between the two + runs.
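The 3.1-second figure in Test 1 can be reproduced with a short calculation. The sketch below is illustrative only: it assumes a base backoff of 100 ms that doubles on each of 5 retries and is scaled by the jitter value, which is consistent with the stated sum but is not spelled out in this README. The names `BASE_BACKOFF_SECONDS`, `MAX_RETRIES`, `backoff_schedule`, `total_backoff`, and `durations_match` are hypothetical helpers, not part of any driver API.

```python
# Assumed parameters (not stated in this README): a 100 ms base backoff that
# doubles on every retry, with 5 retries in total.
BASE_BACKOFF_SECONDS = 0.1
MAX_RETRIES = 5


def backoff_schedule(jitter, base=BASE_BACKOFF_SECONDS, retries=MAX_RETRIES):
    """Return the per-retry delays in seconds for a fixed jitter in [0, 1]."""
    return [jitter * base * 2 ** attempt for attempt in range(retries)]


def total_backoff(jitter):
    """Sum of all delays for one operation that exhausts its retries."""
    return sum(backoff_schedule(jitter))


def durations_match(with_backoff_time, no_backoff_time,
                    expected_backoff=total_backoff(1.0), window=1.0):
    """Mirror the comparison in step 6: the two runs should differ by roughly
    the total backoff, within a 1-second window."""
    return abs(with_backoff_time - (no_backoff_time + expected_backoff)) < window
```

With jitter fixed at `0` every delay is zero, which is why the first run serves as the no-backoff baseline; with jitter fixed at `1` the delays under these assumptions are 0.1, 0.2, 0.4, 0.8, and 1.6 seconds.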
+ ## Changelog - 2025-XX-XX: Initial version. diff --git a/source/client-backpressure/tests/backpressure-retry-loop.json b/source/client-backpressure/tests/backpressure-retry-loop.json index 21fc802344..ae944f73ad 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.json +++ b/source/client-backpressure/tests/backpressure-retry-loop.json @@ -381,6 +381,9 @@ }, { "description": "client.clientBulkWrite retries using operation loop", + "runOnRequirements": { + "minServerVersion": "8.0" + }, "operations": [ { "object": "utilCollection", diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml b/source/client-backpressure/tests/backpressure-retry-loop.yml index ec2fa25b8f..c612986233 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml @@ -223,6 +223,8 @@ tests: - description: 'client.clientBulkWrite retries using operation loop' + runOnRequirements: + minServerVersion: '8.0' operations: - object: *utilCollection diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml.template b/source/client-backpressure/tests/backpressure-retry-loop.yml.template index 049cbbac3a..cad83625d3 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml.template @@ -56,6 +56,10 @@ tests: {% for operation in operations %} - description: '{{operation.object}}.{{operation.operation_name}} retries using operation loop' + {%- if ((operation.operation_name == 'clientBulkWrite')) %} + runOnRequirements: + minServerVersion: '8.0' + {%- endif %} operations: - object: *utilCollection diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.json b/source/client-backpressure/tests/backpressure-retry-max-attempts.json index 36afc55cc5..7e9cc67d7a 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.json +++ 
b/source/client-backpressure/tests/backpressure-retry-max-attempts.json @@ -15,7 +15,12 @@ { "client": { "id": "client", - "useMultipleMongoses": false + "useMultipleMongoses": false, + "observeEvents": [ + "commandStartedEvent", + "commandSucceededEvent", + "commandFailedEvent" + ] } }, { @@ -57,7 +62,7 @@ ], "tests": [ { - "description": "client.listDatabases retries at most maxAttempts times (maxAttempts=5)", + "description": "client.listDatabases retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -66,9 +71,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "listDatabases" @@ -90,10 +93,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandFailedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandFailedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandFailedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandFailedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandFailedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandFailedEvent": { + "commandName": "listDatabases" + } + } + ] + } ] }, { - "description": "client.listDatabaseNames retries at most maxAttempts times (maxAttempts=5)", + "description": "client.listDatabaseNames retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -102,9 +172,7 @@ "client": "failPointClient", "failPoint": { 
"configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "listDatabases" @@ -123,10 +191,77 @@ "name": "listDatabaseNames", "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandFailedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandFailedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandFailedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandFailedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandFailedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandStartedEvent": { + "commandName": "listDatabases" + } + }, + { + "commandFailedEvent": { + "commandName": "listDatabases" + } + } + ] + } ] }, { - "description": "client.createChangeStream retries at most maxAttempts times (maxAttempts=5)", + "description": "client.createChangeStream retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -135,9 +270,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "aggregate" @@ -160,10 +293,80 @@ "saveResultAsEntity": "changeStream", "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + 
"commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + } + ] + } ] }, { - "description": "client.clientBulkWrite retries at most maxAttempts times (maxAttempts=5)", + "description": "client.clientBulkWrite retries at most maxAttempts=5 times", + "runOnRequirements": { + "minServerVersion": "8.0" + }, "operations": [ { "name": "failPoint", @@ -172,9 +375,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "bulkWrite" @@ -206,10 +407,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "bulkWrite" + } + }, + { + "commandFailedEvent": { + "commandName": "bulkWrite" + } + }, + { + "commandStartedEvent": { + "commandName": "bulkWrite" + } + }, + { + "commandFailedEvent": { + "commandName": "bulkWrite" + } + }, + { + "commandStartedEvent": { + "commandName": "bulkWrite" + } + }, + { + "commandFailedEvent": { + "commandName": "bulkWrite" + } + }, + { + "commandStartedEvent": { + "commandName": "bulkWrite" + } + }, + { + "commandFailedEvent": { + "commandName": "bulkWrite" + } + }, + { + "commandStartedEvent": { + "commandName": "bulkWrite" + } + }, + { + "commandFailedEvent": { + "commandName": "bulkWrite" + } + }, + { + "commandStartedEvent": { + "commandName": "bulkWrite" + } + }, + { + "commandFailedEvent": { + "commandName": "bulkWrite" + } + } + ] + } ] }, { - "description": 
"database.aggregate retries at most maxAttempts times (maxAttempts=5)", + "description": "database.aggregate retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -218,9 +486,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "aggregate" @@ -249,10 +515,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + } + ] + } ] }, { - "description": "database.listCollections retries at most maxAttempts times (maxAttempts=5)", + "description": "database.listCollections retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -261,9 +594,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "listCollections" @@ -285,10 +616,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandFailedEvent": { + "commandName": "listCollections" + } + 
}, + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandFailedEvent": { + "commandName": "listCollections" + } + }, + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandFailedEvent": { + "commandName": "listCollections" + } + }, + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandFailedEvent": { + "commandName": "listCollections" + } + }, + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandFailedEvent": { + "commandName": "listCollections" + } + }, + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandFailedEvent": { + "commandName": "listCollections" + } + } + ] + } ] }, { - "description": "database.listCollectionNames retries at most maxAttempts times (maxAttempts=5)", + "description": "database.listCollectionNames retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -297,9 +695,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "listCollections" @@ -321,10 +717,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandFailedEvent": { + "commandName": "listCollections" + } + }, + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandFailedEvent": { + "commandName": "listCollections" + } + }, + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandFailedEvent": { + "commandName": "listCollections" + } + }, + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandFailedEvent": { + "commandName": "listCollections" + } + }, + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandFailedEvent": { + 
"commandName": "listCollections" + } + }, + { + "commandStartedEvent": { + "commandName": "listCollections" + } + }, + { + "commandFailedEvent": { + "commandName": "listCollections" + } + } + ] + } ] }, { - "description": "database.runCommand retries at most maxAttempts times (maxAttempts=5)", + "description": "database.runCommand retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -333,9 +796,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "ping" @@ -360,10 +821,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "ping" + } + }, + { + "commandFailedEvent": { + "commandName": "ping" + } + }, + { + "commandStartedEvent": { + "commandName": "ping" + } + }, + { + "commandFailedEvent": { + "commandName": "ping" + } + }, + { + "commandStartedEvent": { + "commandName": "ping" + } + }, + { + "commandFailedEvent": { + "commandName": "ping" + } + }, + { + "commandStartedEvent": { + "commandName": "ping" + } + }, + { + "commandFailedEvent": { + "commandName": "ping" + } + }, + { + "commandStartedEvent": { + "commandName": "ping" + } + }, + { + "commandFailedEvent": { + "commandName": "ping" + } + }, + { + "commandStartedEvent": { + "commandName": "ping" + } + }, + { + "commandFailedEvent": { + "commandName": "ping" + } + } + ] + } ] }, { - "description": "database.createChangeStream retries at most maxAttempts times (maxAttempts=5)", + "description": "database.createChangeStream retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -372,9 +900,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "aggregate" @@ -397,10 +923,77 @@ "saveResultAsEntity": "changeStream", "expectError": true } + ], + 
"expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + } + ] + } ] }, { - "description": "collection.aggregate retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.aggregate retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -409,9 +1002,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "aggregate" @@ -433,29 +1024,94 @@ }, "expectError": true } - ] - }, - { - "description": "collection.countDocuments retries at most maxAttempts times (maxAttempts=5)", - "operations": [ - { - "name": "failPoint", - "object": "testRunner", - "arguments": { - "client": "failPointClient", - "failPoint": { - "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, - "data": { - "failCommands": [ - "aggregate" - ], - "errorLabels": [ - "RetryableError", - "SystemOverloadedError" - ], + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + 
"commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + } + ] + } + ] + }, + { + "description": "collection.countDocuments retries at most maxAttempts=5 times", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": "alwaysOn", + "data": { + "failCommands": [ + "aggregate" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], "errorCode": 2 } } @@ -469,10 +1125,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + 
"commandFailedEvent": { + "commandName": "aggregate" + } + } + ] + } ] }, { - "description": "collection.estimatedDocumentCount retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.estimatedDocumentCount retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -481,9 +1204,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "count" @@ -502,10 +1223,77 @@ "name": "estimatedDocumentCount", "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "count" + } + }, + { + "commandFailedEvent": { + "commandName": "count" + } + }, + { + "commandStartedEvent": { + "commandName": "count" + } + }, + { + "commandFailedEvent": { + "commandName": "count" + } + }, + { + "commandStartedEvent": { + "commandName": "count" + } + }, + { + "commandFailedEvent": { + "commandName": "count" + } + }, + { + "commandStartedEvent": { + "commandName": "count" + } + }, + { + "commandFailedEvent": { + "commandName": "count" + } + }, + { + "commandStartedEvent": { + "commandName": "count" + } + }, + { + "commandFailedEvent": { + "commandName": "count" + } + }, + { + "commandStartedEvent": { + "commandName": "count" + } + }, + { + "commandFailedEvent": { + "commandName": "count" + } + } + ] + } ] }, { - "description": "collection.distinct retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.distinct retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -514,9 +1302,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "distinct" @@ -539,10 +1325,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": 
"distinct" + } + }, + { + "commandFailedEvent": { + "commandName": "distinct" + } + }, + { + "commandStartedEvent": { + "commandName": "distinct" + } + }, + { + "commandFailedEvent": { + "commandName": "distinct" + } + }, + { + "commandStartedEvent": { + "commandName": "distinct" + } + }, + { + "commandFailedEvent": { + "commandName": "distinct" + } + }, + { + "commandStartedEvent": { + "commandName": "distinct" + } + }, + { + "commandFailedEvent": { + "commandName": "distinct" + } + }, + { + "commandStartedEvent": { + "commandName": "distinct" + } + }, + { + "commandFailedEvent": { + "commandName": "distinct" + } + }, + { + "commandStartedEvent": { + "commandName": "distinct" + } + }, + { + "commandFailedEvent": { + "commandName": "distinct" + } + } + ] + } ] }, { - "description": "collection.find retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.find retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -551,9 +1404,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "find" @@ -575,10 +1426,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandFailedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandFailedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandFailedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandFailedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandFailedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + 
"commandFailedEvent": { + "commandName": "find" + } + } + ] + } ] }, { - "description": "collection.findOne retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.findOne retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -587,9 +1505,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "find" @@ -611,10 +1527,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandFailedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandFailedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandFailedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandFailedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandFailedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandFailedEvent": { + "commandName": "find" + } + } + ] + } ] }, { - "description": "collection.listIndexes retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.listIndexes retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -623,9 +1606,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "listIndexes" @@ -644,10 +1625,77 @@ "name": "listIndexes", "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandFailedEvent": { + 
"commandName": "listIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "listIndexes" + } + } + ] + } ] }, { - "description": "collection.listIndexNames retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.listIndexNames retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -656,9 +1704,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "listIndexes" @@ -677,10 +1723,77 @@ "name": "listIndexNames", "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": 
"listIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "listIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "listIndexes" + } + } + ] + } ] }, { - "description": "collection.createChangeStream retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.createChangeStream retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -689,9 +1802,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "aggregate" @@ -714,10 +1825,77 @@ "saveResultAsEntity": "changeStream", "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + }, + { + "commandStartedEvent": { + "commandName": "aggregate" + } + }, + { + "commandFailedEvent": { + "commandName": "aggregate" + } + } + ] + } ] }, { - "description": "collection.insertOne retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.insertOne retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -726,9 +1904,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "insert" @@ -753,10 +1929,77 
@@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + } + ] + } ] }, { - "description": "collection.insertMany retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.insertMany retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -765,9 +2008,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "insert" @@ -794,10 +2035,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { 
+ "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + } + ] + } ] }, { - "description": "collection.deleteOne retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.deleteOne retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -806,9 +2114,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "delete" @@ -830,10 +2136,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandFailedEvent": { + "commandName": "delete" + } + }, + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandFailedEvent": { + "commandName": "delete" + } + }, + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandFailedEvent": { + "commandName": "delete" + } + }, + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandFailedEvent": { + "commandName": "delete" + } + }, + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandFailedEvent": { + "commandName": "delete" + } + }, + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandFailedEvent": { + "commandName": "delete" + } + } + ] + } ] }, { - "description": "collection.deleteMany retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.deleteMany retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -842,9 +2215,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "delete" @@ -866,10 +2237,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + 
"client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandFailedEvent": { + "commandName": "delete" + } + }, + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandFailedEvent": { + "commandName": "delete" + } + }, + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandFailedEvent": { + "commandName": "delete" + } + }, + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandFailedEvent": { + "commandName": "delete" + } + }, + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandFailedEvent": { + "commandName": "delete" + } + }, + { + "commandStartedEvent": { + "commandName": "delete" + } + }, + { + "commandFailedEvent": { + "commandName": "delete" + } + } + ] + } ] }, { - "description": "collection.replaceOne retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.replaceOne retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -878,9 +2316,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "update" @@ -892,23 +2328,90 @@ "errorCode": 2 } } - } - }, - { - "object": "collection", - "name": "replaceOne", - "arguments": { - "filter": {}, - "replacement": { - "x": 22 - } - }, - "expectError": true + } + }, + { + "object": "collection", + "name": "replaceOne", + "arguments": { + "filter": {}, + "replacement": { + "x": 22 + } + }, + "expectError": true + } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + 
"commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + } + ] } ] }, { - "description": "collection.updateOne retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.updateOne retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -917,9 +2420,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "update" @@ -946,10 +2447,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + } + ] + } ] }, { - "description": "collection.updateMany retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.updateMany retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -958,9 
+2526,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "update" @@ -987,10 +2553,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + }, + { + "commandStartedEvent": { + "commandName": "update" + } + }, + { + "commandFailedEvent": { + "commandName": "update" + } + } + ] + } ] }, { - "description": "collection.findOneAndDelete retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.findOneAndDelete retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -999,9 +2632,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "findAndModify" @@ -1023,10 +2654,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + 
"commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + } + ] + } ] }, { - "description": "collection.findOneAndReplace retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.findOneAndReplace retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -1035,9 +2733,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "findAndModify" @@ -1062,10 +2758,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + } + ] + } ] }, { - "description": "collection.findOneAndUpdate retries at most 
maxAttempts times (maxAttempts=5)", + "description": "collection.findOneAndUpdate retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -1074,9 +2837,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "findAndModify" @@ -1103,10 +2864,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandStartedEvent": { + "commandName": "findAndModify" + } + }, + { + "commandFailedEvent": { + "commandName": "findAndModify" + } + } + ] + } ] }, { - "description": "collection.bulkWrite retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.bulkWrite retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -1115,9 +2943,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "insert" @@ -1148,10 +2974,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, 
+ { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandFailedEvent": { + "commandName": "insert" + } + } + ] + } ] }, { - "description": "collection.createIndex retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.createIndex retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -1160,9 +3053,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "createIndexes" @@ -1187,10 +3078,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "createIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "createIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "createIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "createIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "createIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "createIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "createIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "createIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "createIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "createIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "createIndexes" + } + }, + { + 
"commandFailedEvent": { + "commandName": "createIndexes" + } + } + ] + } ] }, { - "description": "collection.dropIndex retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.dropIndex retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -1199,9 +3157,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "dropIndexes" @@ -1223,10 +3179,77 @@ }, "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "dropIndexes" + } + } + ] + } ] }, { - "description": "collection.dropIndexes retries at most maxAttempts times (maxAttempts=5)", + "description": "collection.dropIndexes retries at most maxAttempts=5 times", "operations": [ { "name": "failPoint", @@ -1235,9 +3258,7 @@ "client": "failPointClient", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 6 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "dropIndexes" @@ -1256,6 +3277,73 @@ "name": "dropIndexes", "expectError": true } + ], + "expectEvents": [ + { + "client": "client", + "events": [ 
+ { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandStartedEvent": { + "commandName": "dropIndexes" + } + }, + { + "commandFailedEvent": { + "commandName": "dropIndexes" + } + } + ] + } ] } ] diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml index 998b48c78e..9161828e99 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml @@ -1,4 +1,4 @@ -# Tests in this file are generated from backpressure-retry-loop.yml.template. +# Tests in this file are generated from backpressure-retry-max-attempts.yml.template. 
description: tests that operations retry at most maxAttempts=5 times @@ -14,6 +14,7 @@ createEntities: client: id: &client client useMultipleMongoses: false + observeEvents: [ 'commandStartedEvent', 'commandSucceededEvent', 'commandFailedEvent' ] - client: @@ -42,7 +43,7 @@ initialData: tests: - - description: 'client.listDatabases retries at most maxAttempts times (maxAttempts=5)' + description: 'client.listDatabases retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -50,7 +51,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [listDatabases] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -63,8 +64,39 @@ tests: filter: {} expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries. + - commandStartedEvent: + commandName: listDatabases + - commandFailedEvent: + commandName: listDatabases + - commandStartedEvent: + commandName: listDatabases + - commandFailedEvent: + commandName: listDatabases + - commandStartedEvent: + commandName: listDatabases + - commandFailedEvent: + commandName: listDatabases + - commandStartedEvent: + commandName: listDatabases + - commandFailedEvent: + commandName: listDatabases + - commandStartedEvent: + commandName: listDatabases + - commandFailedEvent: + commandName: listDatabases + - commandStartedEvent: + commandName: listDatabases + - commandFailedEvent: + commandName: listDatabases + + - - description: 'client.listDatabaseNames retries at most maxAttempts times (maxAttempts=5)' + description: 'client.listDatabaseNames retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -72,7 +104,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [listDatabases] errorLabels:
["RetryableError", "SystemOverloadedError"] @@ -83,8 +115,39 @@ tests: name: listDatabaseNames expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries. + - commandStartedEvent: + commandName: listDatabases + - commandFailedEvent: + commandName: listDatabases + - commandStartedEvent: + commandName: listDatabases + - commandFailedEvent: + commandName: listDatabases + - commandStartedEvent: + commandName: listDatabases + - commandFailedEvent: + commandName: listDatabases + - commandStartedEvent: + commandName: listDatabases + - commandFailedEvent: + commandName: listDatabases + - commandStartedEvent: + commandName: listDatabases + - commandFailedEvent: + commandName: listDatabases + - commandStartedEvent: + commandName: listDatabases + - commandFailedEvent: + commandName: listDatabases + + - - description: 'client.createChangeStream retries at most maxAttempts times (maxAttempts=5)' + description: 'client.createChangeStream retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -92,7 +155,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [aggregate] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -106,8 +169,41 @@ tests: saveResultAsEntity: changeStream expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + + - - description: 'client.clientBulkWrite retries at most maxAttempts times (maxAttempts=5)' + description: 'client.clientBulkWrite retries at most maxAttempts=5 times' + runOnRequirements: + minServerVersion: '8.0' operations: - name: failPoint object: testRunner @@ -115,7 +211,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [bulkWrite] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -131,8 +227,39 @@ tests: document: { _id: 8, x: 88 } expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: bulkWrite + - commandFailedEvent: + commandName: bulkWrite + - commandStartedEvent: + commandName: bulkWrite + - commandFailedEvent: + commandName: bulkWrite + - commandStartedEvent: + commandName: bulkWrite + - commandFailedEvent: + commandName: bulkWrite + - commandStartedEvent: + commandName: bulkWrite + - commandFailedEvent: + commandName: bulkWrite + - commandStartedEvent: + commandName: bulkWrite + - commandFailedEvent: + commandName: bulkWrite + - commandStartedEvent: + commandName: bulkWrite + - commandFailedEvent: + commandName: bulkWrite + + - - description: 'database.aggregate retries at most maxAttempts times (maxAttempts=5)' + description: 'database.aggregate retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -140,7 +267,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [aggregate] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -153,8 +280,39 @@ tests: pipeline: [ { $listLocalSessions: {} }, { $limit: 1 } ] expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + + - - description: 'database.listCollections retries at most maxAttempts times (maxAttempts=5)' + description: 'database.listCollections retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -162,7 +320,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [listCollections] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -175,8 +333,39 @@ tests: filter: {} expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: listCollections + - commandFailedEvent: + commandName: listCollections + - commandStartedEvent: + commandName: listCollections + - commandFailedEvent: + commandName: listCollections + - commandStartedEvent: + commandName: listCollections + - commandFailedEvent: + commandName: listCollections + - commandStartedEvent: + commandName: listCollections + - commandFailedEvent: + commandName: listCollections + - commandStartedEvent: + commandName: listCollections + - commandFailedEvent: + commandName: listCollections + - commandStartedEvent: + commandName: listCollections + - commandFailedEvent: + commandName: listCollections + + - - description: 'database.listCollectionNames retries at most maxAttempts times (maxAttempts=5)' + description: 'database.listCollectionNames retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -184,7 +373,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [listCollections] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -197,8 +386,39 @@ tests: filter: {} expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: listCollections + - commandFailedEvent: + commandName: listCollections + - commandStartedEvent: + commandName: listCollections + - commandFailedEvent: + commandName: listCollections + - commandStartedEvent: + commandName: listCollections + - commandFailedEvent: + commandName: listCollections + - commandStartedEvent: + commandName: listCollections + - commandFailedEvent: + commandName: listCollections + - commandStartedEvent: + commandName: listCollections + - commandFailedEvent: + commandName: listCollections + - commandStartedEvent: + commandName: listCollections + - commandFailedEvent: + commandName: listCollections + + - - description: 'database.runCommand retries at most maxAttempts times (maxAttempts=5)' + description: 'database.runCommand retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -206,7 +426,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [ping] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -220,8 +440,39 @@ tests: commandName: ping expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: ping + - commandFailedEvent: + commandName: ping + - commandStartedEvent: + commandName: ping + - commandFailedEvent: + commandName: ping + - commandStartedEvent: + commandName: ping + - commandFailedEvent: + commandName: ping + - commandStartedEvent: + commandName: ping + - commandFailedEvent: + commandName: ping + - commandStartedEvent: + commandName: ping + - commandFailedEvent: + commandName: ping + - commandStartedEvent: + commandName: ping + - commandFailedEvent: + commandName: ping + + - - description: 'database.createChangeStream retries at most maxAttempts times (maxAttempts=5)' + description: 'database.createChangeStream retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -229,7 +480,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [aggregate] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -243,8 +494,39 @@ tests: saveResultAsEntity: changeStream expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + + - - description: 'collection.aggregate retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.aggregate retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -252,7 +534,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [aggregate] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -265,8 +547,39 @@ tests: pipeline: [] expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + + - - description: 'collection.countDocuments retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.countDocuments retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -274,7 +587,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [aggregate] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -287,8 +600,39 @@ tests: filter: {} expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + + - - description: 'collection.estimatedDocumentCount retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.estimatedDocumentCount retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -296,7 +640,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [count] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -307,8 +651,39 @@ tests: name: estimatedDocumentCount expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: count + - commandFailedEvent: + commandName: count + - commandStartedEvent: + commandName: count + - commandFailedEvent: + commandName: count + - commandStartedEvent: + commandName: count + - commandFailedEvent: + commandName: count + - commandStartedEvent: + commandName: count + - commandFailedEvent: + commandName: count + - commandStartedEvent: + commandName: count + - commandFailedEvent: + commandName: count + - commandStartedEvent: + commandName: count + - commandFailedEvent: + commandName: count + + - - description: 'collection.distinct retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.distinct retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -316,7 +691,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [distinct] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -330,8 +705,39 @@ tests: filter: {} expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: distinct + - commandFailedEvent: + commandName: distinct + - commandStartedEvent: + commandName: distinct + - commandFailedEvent: + commandName: distinct + - commandStartedEvent: + commandName: distinct + - commandFailedEvent: + commandName: distinct + - commandStartedEvent: + commandName: distinct + - commandFailedEvent: + commandName: distinct + - commandStartedEvent: + commandName: distinct + - commandFailedEvent: + commandName: distinct + - commandStartedEvent: + commandName: distinct + - commandFailedEvent: + commandName: distinct + + - - description: 'collection.find retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.find retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -339,7 +745,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [find] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -352,8 +758,39 @@ tests: filter: {} expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: find + - commandFailedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandFailedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandFailedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandFailedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandFailedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandFailedEvent: + commandName: find + + - - description: 'collection.findOne retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.findOne retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -361,7 +798,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [find] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -374,8 +811,39 @@ tests: filter: {} expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: find + - commandFailedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandFailedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandFailedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandFailedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandFailedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandFailedEvent: + commandName: find + + - - description: 'collection.listIndexes retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.listIndexes retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -383,7 +851,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [listIndexes] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -394,8 +862,39 @@ tests: name: listIndexes expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: listIndexes + - commandFailedEvent: + commandName: listIndexes + - commandStartedEvent: + commandName: listIndexes + - commandFailedEvent: + commandName: listIndexes + - commandStartedEvent: + commandName: listIndexes + - commandFailedEvent: + commandName: listIndexes + - commandStartedEvent: + commandName: listIndexes + - commandFailedEvent: + commandName: listIndexes + - commandStartedEvent: + commandName: listIndexes + - commandFailedEvent: + commandName: listIndexes + - commandStartedEvent: + commandName: listIndexes + - commandFailedEvent: + commandName: listIndexes + + - - description: 'collection.listIndexNames retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.listIndexNames retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -403,7 +902,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [listIndexes] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -414,8 +913,39 @@ tests: name: listIndexNames expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: listIndexes + - commandFailedEvent: + commandName: listIndexes + - commandStartedEvent: + commandName: listIndexes + - commandFailedEvent: + commandName: listIndexes + - commandStartedEvent: + commandName: listIndexes + - commandFailedEvent: + commandName: listIndexes + - commandStartedEvent: + commandName: listIndexes + - commandFailedEvent: + commandName: listIndexes + - commandStartedEvent: + commandName: listIndexes + - commandFailedEvent: + commandName: listIndexes + - commandStartedEvent: + commandName: listIndexes + - commandFailedEvent: + commandName: listIndexes + + - - description: 'collection.createChangeStream retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.createChangeStream retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -423,7 +953,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [aggregate] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -437,8 +967,39 @@ tests: saveResultAsEntity: changeStream expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + - commandStartedEvent: + commandName: aggregate + - commandFailedEvent: + commandName: aggregate + + - - description: 'collection.insertOne retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.insertOne retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -446,7 +1007,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [insert] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -459,8 +1020,39 @@ tests: document: { _id: 2, x: 22 } expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + + - - description: 'collection.insertMany retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.insertMany retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -468,7 +1060,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [insert] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -482,8 +1074,39 @@ tests: - { _id: 2, x: 22 } expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + + - - description: 'collection.deleteOne retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.deleteOne retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -491,7 +1114,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [delete] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -504,8 +1127,39 @@ tests: filter: {} expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: delete + - commandFailedEvent: + commandName: delete + - commandStartedEvent: + commandName: delete + - commandFailedEvent: + commandName: delete + - commandStartedEvent: + commandName: delete + - commandFailedEvent: + commandName: delete + - commandStartedEvent: + commandName: delete + - commandFailedEvent: + commandName: delete + - commandStartedEvent: + commandName: delete + - commandFailedEvent: + commandName: delete + - commandStartedEvent: + commandName: delete + - commandFailedEvent: + commandName: delete + + - - description: 'collection.deleteMany retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.deleteMany retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -513,7 +1167,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [delete] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -526,8 +1180,39 @@ tests: filter: {} expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: delete + - commandFailedEvent: + commandName: delete + - commandStartedEvent: + commandName: delete + - commandFailedEvent: + commandName: delete + - commandStartedEvent: + commandName: delete + - commandFailedEvent: + commandName: delete + - commandStartedEvent: + commandName: delete + - commandFailedEvent: + commandName: delete + - commandStartedEvent: + commandName: delete + - commandFailedEvent: + commandName: delete + - commandStartedEvent: + commandName: delete + - commandFailedEvent: + commandName: delete + + - - description: 'collection.replaceOne retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.replaceOne retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -535,7 +1220,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [update] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -549,8 +1234,39 @@ tests: replacement: { x: 22 } expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + + - - description: 'collection.updateOne retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.updateOne retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -558,7 +1274,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [update] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -572,8 +1288,39 @@ tests: update: { $set: { x: 22 } } expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + + - - description: 'collection.updateMany retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.updateMany retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -581,7 +1328,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [update] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -595,8 +1342,39 @@ tests: update: { $set: { x: 22 } } expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + - commandStartedEvent: + commandName: update + - commandFailedEvent: + commandName: update + + - - description: 'collection.findOneAndDelete retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.findOneAndDelete retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -604,7 +1382,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [findAndModify] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -617,8 +1395,39 @@ tests: filter: {} expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + + - - description: 'collection.findOneAndReplace retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.findOneAndReplace retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -626,7 +1435,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [findAndModify] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -640,8 +1449,39 @@ tests: replacement: { x: 22 } expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + + - - description: 'collection.findOneAndUpdate retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.findOneAndUpdate retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -649,7 +1489,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [findAndModify] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -663,8 +1503,39 @@ tests: update: { $set: { x: 22 } } expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + - commandStartedEvent: + commandName: findAndModify + - commandFailedEvent: + commandName: findAndModify + + - - description: 'collection.bulkWrite retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.bulkWrite retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -672,7 +1543,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [insert] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -687,8 +1558,39 @@ tests: document: { _id: 2, x: 22 } expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandFailedEvent: + commandName: insert + + - - description: 'collection.createIndex retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.createIndex retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -696,7 +1598,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [createIndexes] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -710,8 +1612,39 @@ tests: name: "x_11" expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: createIndexes + - commandFailedEvent: + commandName: createIndexes + - commandStartedEvent: + commandName: createIndexes + - commandFailedEvent: + commandName: createIndexes + - commandStartedEvent: + commandName: createIndexes + - commandFailedEvent: + commandName: createIndexes + - commandStartedEvent: + commandName: createIndexes + - commandFailedEvent: + commandName: createIndexes + - commandStartedEvent: + commandName: createIndexes + - commandFailedEvent: + commandName: createIndexes + - commandStartedEvent: + commandName: createIndexes + - commandFailedEvent: + commandName: createIndexes + + - - description: 'collection.dropIndex retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.dropIndex retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -719,7 +1652,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [dropIndexes] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -732,8 +1665,39 @@ tests: name: "x_11" expectError: true + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: dropIndexes + - commandFailedEvent: + commandName: dropIndexes + - commandStartedEvent: + commandName: dropIndexes + - commandFailedEvent: + commandName: dropIndexes + - commandStartedEvent: + commandName: dropIndexes + - commandFailedEvent: + commandName: dropIndexes + - commandStartedEvent: + commandName: dropIndexes + - commandFailedEvent: + commandName: dropIndexes + - commandStartedEvent: + commandName: dropIndexes + - commandFailedEvent: + commandName: dropIndexes + - commandStartedEvent: + commandName: dropIndexes + - commandFailedEvent: + commandName: dropIndexes + + - - description: 'collection.dropIndexes retries at most maxAttempts times (maxAttempts=5)' + description: 'collection.dropIndexes retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -741,7 +1705,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [dropIndexes] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -751,3 +1715,34 @@ tests: object: *collection name: dropIndexes expectError: true + + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: dropIndexes + - commandFailedEvent: + commandName: dropIndexes + - commandStartedEvent: + commandName: dropIndexes + - commandFailedEvent: + commandName: dropIndexes + - commandStartedEvent: + commandName: dropIndexes + - commandFailedEvent: + commandName: dropIndexes + - commandStartedEvent: + commandName: dropIndexes + - commandFailedEvent: + commandName: dropIndexes + - commandStartedEvent: + commandName: dropIndexes + - commandFailedEvent: + commandName: dropIndexes + - commandStartedEvent: + commandName: dropIndexes + - commandFailedEvent: + commandName: dropIndexes + diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template index bf089211fd..555b90425d 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template @@ -14,6 +14,7 @@ createEntities: client: id: &client client useMultipleMongoses: false + observeEvents: [ 'commandStartedEvent', 'commandSucceededEvent', 'commandFailedEvent' ] - client: @@ -43,6 +44,10 @@ tests: {% for operation in operations %} - description: '{{operation.object}}.{{operation.operation_name}} retries at most maxAttempts=5 times' + {%- if ((operation.operation_name == 'clientBulkWrite')) %} + runOnRequirements: + minServerVersion: '8.0' + {%- endif %} operations: - name: failPoint object: testRunner @@ -50,7 +55,7 @@ tests: client: *failPointClient failPoint: configureFailPoint: failCommand - mode: { times: 6 } + mode: alwaysOn data: failCommands: [{{operation.command_name}}] errorLabels: ["RetryableError", "SystemOverloadedError"] @@ -69,4 +74,35 @@ tests: saveResultAsEntity: changeStream {%- endif %} expectError: true + + expectEvents: + - client: "client" + events: + # we expect 6 pairs of command started and failed events: 1 initial + # attempt and 5 retries.
+ - commandStartedEvent: + commandName: {{operation.command_name}} + - commandFailedEvent: + commandName: {{operation.command_name}} + - commandStartedEvent: + commandName: {{operation.command_name}} + - commandFailedEvent: + commandName: {{operation.command_name}} + - commandStartedEvent: + commandName: {{operation.command_name}} + - commandFailedEvent: + commandName: {{operation.command_name}} + - commandStartedEvent: + commandName: {{operation.command_name}} + - commandFailedEvent: + commandName: {{operation.command_name}} + - commandStartedEvent: + commandName: {{operation.command_name}} + - commandFailedEvent: + commandName: {{operation.command_name}} + - commandStartedEvent: + commandName: {{operation.command_name}} + - commandFailedEvent: + commandName: {{operation.command_name}} + {% endfor -%} From 52e2a352f30f4a7e02b4ecd7f7370b38daab6752 Mon Sep 17 00:00:00 2001 From: bailey Date: Tue, 2 Dec 2025 10:12:19 -0700 Subject: [PATCH 06/55] fix run on requirements --- .../tests/backpressure-retry-loop.json | 8 +++++--- .../client-backpressure/tests/backpressure-retry-loop.yml | 2 +- .../tests/backpressure-retry-loop.yml.template | 2 +- .../tests/backpressure-retry-max-attempts.json | 8 +++++--- .../tests/backpressure-retry-max-attempts.yml | 2 +- .../tests/backpressure-retry-max-attempts.yml.template | 2 +- 6 files changed, 14 insertions(+), 10 deletions(-) diff --git a/source/client-backpressure/tests/backpressure-retry-loop.json b/source/client-backpressure/tests/backpressure-retry-loop.json index ae944f73ad..749159d4ae 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.json +++ b/source/client-backpressure/tests/backpressure-retry-loop.json @@ -381,9 +381,11 @@ }, { "description": "client.clientBulkWrite retries using operation loop", - "runOnRequirements": { - "minServerVersion": "8.0" - }, + "runOnRequirements": [ + { + "minServerVersion": "8.0" + } + ], "operations": [ { "object": "utilCollection", diff --git 
a/source/client-backpressure/tests/backpressure-retry-loop.yml b/source/client-backpressure/tests/backpressure-retry-loop.yml index c612986233..c069afa6be 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml @@ -224,7 +224,7 @@ tests: - description: 'client.clientBulkWrite retries using operation loop' runOnRequirements: - minServerVersion: '8.0' + - minServerVersion: '8.0' # client bulk write added to server in 8.0 operations: - object: *utilCollection diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml.template b/source/client-backpressure/tests/backpressure-retry-loop.yml.template index cad83625d3..6101d17178 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml.template @@ -58,7 +58,7 @@ tests: description: '{{operation.object}}.{{operation.operation_name}} retries using operation loop' {%- if ((operation.operation_name == 'clientBulkWrite')) %} runOnRequirements: - minServerVersion: '8.0' + - minServerVersion: '8.0' # client bulk write added to server in 8.0 {%- endif %} operations: - diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.json b/source/client-backpressure/tests/backpressure-retry-max-attempts.json index 7e9cc67d7a..5cc90248d2 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.json +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.json @@ -364,9 +364,11 @@ }, { "description": "client.clientBulkWrite retries at most maxAttempts=5 times", - "runOnRequirements": { - "minServerVersion": "8.0" - }, + "runOnRequirements": [ + { + "minServerVersion": "8.0" + } + ], "operations": [ { "name": "failPoint", diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml index 
9161828e99..9bbbd74a49 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml @@ -203,7 +203,7 @@ tests: - description: 'client.clientBulkWrite retries at most maxAttempts=5 times' runOnRequirements: - minServerVersion: '8.0' + - minServerVersion: '8.0' # client bulk write added to server in 8.0 operations: - name: failPoint object: testRunner diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template index 555b90425d..342556a6c3 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template @@ -46,7 +46,7 @@ tests: description: '{{operation.object}}.{{operation.operation_name}} retries at most maxAttempts=5 times' {%- if ((operation.operation_name == 'clientBulkWrite')) %} runOnRequirements: - minServerVersion: '8.0' + - minServerVersion: '8.0' # client bulk write added to server in 8.0 {%- endif %} operations: - name: failPoint From 391c95190bb0d1aa286ff2bd863258e43bcd0eef Mon Sep 17 00:00:00 2001 From: bailey Date: Tue, 2 Dec 2025 10:29:33 -0700 Subject: [PATCH 07/55] fix run on requirements? 
--- source/client-backpressure/tests/backpressure-retry-loop.json | 2 +- source/client-backpressure/tests/backpressure-retry-loop.yml | 2 +- .../tests/backpressure-retry-loop.yml.template | 2 +- .../tests/backpressure-retry-max-attempts.json | 2 +- .../tests/backpressure-retry-max-attempts.yml | 2 +- .../tests/backpressure-retry-max-attempts.yml.template | 2 +- 6 files changed, 6 insertions(+), 6 deletions(-) diff --git a/source/client-backpressure/tests/backpressure-retry-loop.json b/source/client-backpressure/tests/backpressure-retry-loop.json index 749159d4ae..63b3bd5e91 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.json +++ b/source/client-backpressure/tests/backpressure-retry-loop.json @@ -1,6 +1,6 @@ { "description": "tests that operations respect overload backoff retry loop", - "schemaVersion": "1.0", + "schemaVersion": "1.3", "runOnRequirements": [ { "minServerVersion": "4.4", diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml b/source/client-backpressure/tests/backpressure-retry-loop.yml index c069afa6be..8a0ea01766 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml @@ -2,7 +2,7 @@ description: tests that operations respect overload backoff retry loop -schemaVersion: '1.0' +schemaVersion: '1.3' runOnRequirements: - diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml.template b/source/client-backpressure/tests/backpressure-retry-loop.yml.template index 6101d17178..fca19f94f7 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml.template @@ -2,7 +2,7 @@ description: tests that operations respect overload backoff retry loop -schemaVersion: '1.0' +schemaVersion: '1.3' runOnRequirements: - diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.json 
b/source/client-backpressure/tests/backpressure-retry-max-attempts.json index 5cc90248d2..206c86fe51 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.json +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.json @@ -1,6 +1,6 @@ { "description": "tests that operations retry at most maxAttempts=5 times", - "schemaVersion": "1.0", + "schemaVersion": "1.3", "runOnRequirements": [ { "minServerVersion": "4.4", diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml index 9bbbd74a49..edcac5eb87 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml @@ -2,7 +2,7 @@ description: tests that operations retry at most maxAttempts=5 times -schemaVersion: '1.0' +schemaVersion: '1.3' runOnRequirements: - diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template index 342556a6c3..784fc0bb2a 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template @@ -2,7 +2,7 @@ description: tests that operations retry at most maxAttempts=5 times -schemaVersion: '1.0' +schemaVersion: '1.3' runOnRequirements: - From 92501c0f0e95c2c0f4d49ac9a6a8548a20a2f4f5 Mon Sep 17 00:00:00 2001 From: bailey Date: Tue, 2 Dec 2025 10:42:09 -0700 Subject: [PATCH 08/55] fix CI --- source/client-backpressure/client-backpressure.md | 4 ++++ source/logging/logging.md | 2 +- source/retryable-writes/retryable-writes.md | 2 +- 3 files changed, 6 insertions(+), 2 deletions(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index 88dd94807e..c2bcecc545 100644 --- 
a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -390,6 +390,10 @@ attempt. Note that the second `CommandStartedEvent` and "command started" log me application from inadvertently retrying for too long (see [Backwards Compatibility](#backwards-compatibility) for details). +### Backwards Compatibility + +TODO + ## Test Plan See the [README](./tests/README.md) for tests. diff --git a/source/logging/logging.md b/source/logging/logging.md index 6f7c36c200..8971dbcde7 100644 --- a/source/logging/logging.md +++ b/source/logging/logging.md @@ -95,7 +95,7 @@ Drivers MUST support configuring where log messages should be output, including > - If the value is "stdout" (case-insensitive), log to stdout. > - If the value is "stderr" (case-insensitive), log to stderr. > - Else, if direct logging to files is supported, log to a file at the specified path. If the file already exists, it - > MUST be appended to. + > MUST be appended to. > > If the variable is not provided or is set to an invalid value (which could be invalid for any reason, e.g. the path > does not exist or is not writeable), the driver MUST log to stderr and the driver MAY attempt to warn the user about diff --git a/source/retryable-writes/retryable-writes.md b/source/retryable-writes/retryable-writes.md index fa613a2908..4a614be89a 100644 --- a/source/retryable-writes/retryable-writes.md +++ b/source/retryable-writes/retryable-writes.md @@ -50,7 +50,7 @@ across drivers. **Retryable Write Error** An error is considered retryable if it has a RetryableWriteError label in its top-level "errorLabels" field. See -[Determining Retryable Write Errors](#determining-retryable-errors) for more information. +[Determining Retryable Write Errors](#determining-retryable-write-errors) for more information. Additional terms may be defined in the [Driver Session](../sessions/driver-sessions.md) specification. 
From 0fdef39ac06f663ee2f38c41f71c609e856016b5 Mon Sep 17 00:00:00 2001 From: bailey Date: Wed, 3 Dec 2025 11:36:46 -0700 Subject: [PATCH 09/55] comments --- source/client-backpressure/client-backpressure.md | 7 +++++++ source/retryable-reads/retryable-reads.md | 7 ++++--- source/retryable-writes/retryable-writes.md | 7 ++++--- 3 files changed, 15 insertions(+), 6 deletions(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index c2bcecc545..39613c224d 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -67,6 +67,13 @@ closed"). When a new connection attempt is queued by the server for so long that the driver-side timeout expires, drivers will observe this as a network timeout error. +#### Goodput + +The throughput of positive, useful output. In the context of drivers, this refers to the number of non-error results +that the driver processes per unit of time. + +See [goodput](https://en.wikipedia.org/wiki/Goodput). + ### Requirements for Client Backpressure #### Overload retry policy diff --git a/source/retryable-reads/retryable-reads.md b/source/retryable-reads/retryable-reads.md index a7100cbe92..42046e8b9a 100644 --- a/source/retryable-reads/retryable-reads.md +++ b/source/retryable-reads/retryable-reads.md @@ -15,9 +15,10 @@ This specification will - outline how an API for retryable read operations will be implemented in drivers - define an option to enable retryable reads for an application. -The changes in this specification are related to but distinct from the retryability behaviors defined in the client -backpressure specification, which defines a retryability mechanism for all commands under certain server conditions. -Unless otherwise noted, the changes in this specification refer only to the retryability behaviors summarized above. 
+The changes in this specification are related to but distinct from the retryability behaviors defined in the +[Client Backpressure Specification](../client-backpressure/client-backpressure.md), which defines a retryability +mechanism for all commands under certain server conditions. Unless otherwise noted, the changes in this specification +refer only to the retryability behaviors summarized above. ## META diff --git a/source/retryable-writes/retryable-writes.md b/source/retryable-writes/retryable-writes.md index 4a614be89a..d8cf74c009 100644 --- a/source/retryable-writes/retryable-writes.md +++ b/source/retryable-writes/retryable-writes.md @@ -19,9 +19,10 @@ specification will outline how an API for retryable write operations will be imp will define an option to enable retryable writes for an application and describe how a transaction ID will be provided to write commands executed therein. -The changes in this specification are related to but distinct from the retryability behaviors defined in the client -backpressure specification, which defines a retryability mechanism for all commands under certain server conditions. -Unless otherwise noted, the changes in this specification refer only to the retryability behaviors summarized above. +The changes in this specification are related to but distinct from the retryability behaviors defined in the +[Client Backpressure Specification](../client-backpressure/client-backpressure.md), which defines a retryability +mechanism for all commands under certain server conditions. Unless otherwise noted, the changes in this specification +refer only to the retryability behaviors summarized above. 
## META From 82acab8e2238497392961c1b210ab5f0d0fa86bb Mon Sep 17 00:00:00 2001 From: bailey Date: Wed, 3 Dec 2025 11:47:35 -0700 Subject: [PATCH 10/55] Fix broken unified tests --- .../tests/backpressure-retry-loop.json | 96 +++++-------- .../tests/backpressure-retry-loop.yml | 32 ----- .../backpressure-retry-loop.yml.template | 1 - .../backpressure-retry-max-attempts.json | 128 +++++++++++++----- .../tests/backpressure-retry-max-attempts.yml | 96 ++++++++----- ...ckpressure-retry-max-attempts.yml.template | 3 +- 6 files changed, 194 insertions(+), 162 deletions(-) diff --git a/source/client-backpressure/tests/backpressure-retry-loop.json b/source/client-backpressure/tests/backpressure-retry-loop.json index 63b3bd5e91..b121d9bd3d 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.json +++ b/source/client-backpressure/tests/backpressure-retry-loop.json @@ -125,8 +125,7 @@ "name": "listDatabases", "arguments": { "filter": {} - }, - "expectError": false + } } ], "expectEvents": [ @@ -224,8 +223,7 @@ }, { "object": "client", - "name": "listDatabaseNames", - "expectError": false + "name": "listDatabaseNames" } ], "expectEvents": [ @@ -327,8 +325,7 @@ "arguments": { "pipeline": [] }, - "saveResultAsEntity": "changeStream", - "expectError": false + "saveResultAsEntity": "changeStream" } ], "expectEvents": [ @@ -444,8 +441,7 @@ } } ] - }, - "expectError": false + } } ], "expectEvents": [ @@ -553,8 +549,7 @@ "$limit": 1 } ] - }, - "expectError": false + } } ], "expectEvents": [ @@ -655,8 +650,7 @@ "name": "listCollections", "arguments": { "filter": {} - }, - "expectError": false + } } ], "expectEvents": [ @@ -757,8 +751,7 @@ "name": "listCollectionNames", "arguments": { "filter": {} - }, - "expectError": false + } } ], "expectEvents": [ @@ -862,8 +855,7 @@ "ping": 1 }, "commandName": "ping" - }, - "expectError": false + } } ], "expectEvents": [ @@ -965,8 +957,7 @@ "arguments": { "pipeline": [] }, - "saveResultAsEntity": "changeStream", - "expectError": 
false + "saveResultAsEntity": "changeStream" } ], "expectEvents": [ @@ -1067,8 +1058,7 @@ "name": "aggregate", "arguments": { "pipeline": [] - }, - "expectError": false + } } ], "expectEvents": [ @@ -1169,8 +1159,7 @@ "name": "countDocuments", "arguments": { "filter": {} - }, - "expectError": false + } } ], "expectEvents": [ @@ -1268,8 +1257,7 @@ }, { "object": "collection", - "name": "estimatedDocumentCount", - "expectError": false + "name": "estimatedDocumentCount" } ], "expectEvents": [ @@ -1371,8 +1359,7 @@ "arguments": { "fieldName": "x", "filter": {} - }, - "expectError": false + } } ], "expectEvents": [ @@ -1473,8 +1460,7 @@ "name": "find", "arguments": { "filter": {} - }, - "expectError": false + } } ], "expectEvents": [ @@ -1575,8 +1561,7 @@ "name": "findOne", "arguments": { "filter": {} - }, - "expectError": false + } } ], "expectEvents": [ @@ -1674,8 +1659,7 @@ }, { "object": "collection", - "name": "listIndexes", - "expectError": false + "name": "listIndexes" } ], "expectEvents": [ @@ -1773,8 +1757,7 @@ }, { "object": "collection", - "name": "listIndexNames", - "expectError": false + "name": "listIndexNames" } ], "expectEvents": [ @@ -1876,8 +1859,7 @@ "arguments": { "pipeline": [] }, - "saveResultAsEntity": "changeStream", - "expectError": false + "saveResultAsEntity": "changeStream" } ], "expectEvents": [ @@ -1981,8 +1963,7 @@ "_id": 2, "x": 22 } - }, - "expectError": false + } } ], "expectEvents": [ @@ -2088,8 +2069,7 @@ "x": 22 } ] - }, - "expectError": false + } } ], "expectEvents": [ @@ -2190,8 +2170,7 @@ "name": "deleteOne", "arguments": { "filter": {} - }, - "expectError": false + } } ], "expectEvents": [ @@ -2292,8 +2271,7 @@ "name": "deleteMany", "arguments": { "filter": {} - }, - "expectError": false + } } ], "expectEvents": [ @@ -2397,8 +2375,7 @@ "replacement": { "x": 22 } - }, - "expectError": false + } } ], "expectEvents": [ @@ -2504,8 +2481,7 @@ "x": 22 } } - }, - "expectError": false + } } ], "expectEvents": [ @@ -2611,8 +2587,7 @@ "x": 
22 } } - }, - "expectError": false + } } ], "expectEvents": [ @@ -2713,8 +2688,7 @@ "name": "findOneAndDelete", "arguments": { "filter": {} - }, - "expectError": false + } } ], "expectEvents": [ @@ -2818,8 +2792,7 @@ "replacement": { "x": 22 } - }, - "expectError": false + } } ], "expectEvents": [ @@ -2925,8 +2898,7 @@ "x": 22 } } - }, - "expectError": false + } } ], "expectEvents": [ @@ -3036,8 +3008,7 @@ } } ] - }, - "expectError": false + } } ], "expectEvents": [ @@ -3141,8 +3112,7 @@ "x": 11 }, "name": "x_11" - }, - "expectError": false + } } ], "expectEvents": [ @@ -3253,8 +3223,7 @@ "name": "dropIndex", "arguments": { "name": "x_11" - }, - "expectError": false + } } ], "expectEvents": [ @@ -3352,8 +3321,7 @@ }, { "object": "collection", - "name": "dropIndexes", - "expectError": false + "name": "dropIndexes" } ], "expectEvents": [ diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml b/source/client-backpressure/tests/backpressure-retry-loop.yml index 8a0ea01766..08d6072edd 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml @@ -88,7 +88,6 @@ tests: name: listDatabases arguments: filter: {} - expectError: false expectEvents: - client: "client" @@ -142,7 +141,6 @@ tests: - object: *client name: listDatabaseNames - expectError: false expectEvents: - client: "client" @@ -199,7 +197,6 @@ tests: arguments: pipeline: [] saveResultAsEntity: changeStream - expectError: false expectEvents: - client: "client" @@ -260,7 +257,6 @@ tests: - insertOne: namespace: retryable-writes-tests.coll document: { _id: 8, x: 88 } - expectError: false expectEvents: - client: "client" @@ -316,7 +312,6 @@ tests: name: aggregate arguments: pipeline: [ { $listLocalSessions: {} }, { $limit: 1 } ] - expectError: false expectEvents: - client: "client" @@ -372,7 +367,6 @@ tests: name: listCollections arguments: filter: {} - expectError: false expectEvents: - client: "client" @@ -428,7 
+422,6 @@ tests: name: listCollectionNames arguments: filter: {} - expectError: false expectEvents: - client: "client" @@ -485,7 +478,6 @@ tests: arguments: command: { ping: 1 } commandName: ping - expectError: false expectEvents: - client: "client" @@ -542,7 +534,6 @@ tests: arguments: pipeline: [] saveResultAsEntity: changeStream - expectError: false expectEvents: - client: "client" @@ -598,7 +589,6 @@ tests: name: aggregate arguments: pipeline: [] - expectError: false expectEvents: - client: "client" @@ -654,7 +644,6 @@ tests: name: countDocuments arguments: filter: {} - expectError: false expectEvents: - client: "client" @@ -708,7 +697,6 @@ tests: - object: *collection name: estimatedDocumentCount - expectError: false expectEvents: - client: "client" @@ -765,7 +753,6 @@ tests: arguments: fieldName: x filter: {} - expectError: false expectEvents: - client: "client" @@ -821,7 +808,6 @@ tests: name: find arguments: filter: {} - expectError: false expectEvents: - client: "client" @@ -877,7 +863,6 @@ tests: name: findOne arguments: filter: {} - expectError: false expectEvents: - client: "client" @@ -931,7 +916,6 @@ tests: - object: *collection name: listIndexes - expectError: false expectEvents: - client: "client" @@ -985,7 +969,6 @@ tests: - object: *collection name: listIndexNames - expectError: false expectEvents: - client: "client" @@ -1042,7 +1025,6 @@ tests: arguments: pipeline: [] saveResultAsEntity: changeStream - expectError: false expectEvents: - client: "client" @@ -1098,7 +1080,6 @@ tests: name: insertOne arguments: document: { _id: 2, x: 22 } - expectError: false expectEvents: - client: "client" @@ -1155,7 +1136,6 @@ tests: arguments: documents: - { _id: 2, x: 22 } - expectError: false expectEvents: - client: "client" @@ -1211,7 +1191,6 @@ tests: name: deleteOne arguments: filter: {} - expectError: false expectEvents: - client: "client" @@ -1267,7 +1246,6 @@ tests: name: deleteMany arguments: filter: {} - expectError: false expectEvents: - client: 
"client" @@ -1324,7 +1302,6 @@ tests: arguments: filter: {} replacement: { x: 22 } - expectError: false expectEvents: - client: "client" @@ -1381,7 +1358,6 @@ tests: arguments: filter: {} update: { $set: { x: 22 } } - expectError: false expectEvents: - client: "client" @@ -1438,7 +1414,6 @@ tests: arguments: filter: {} update: { $set: { x: 22 } } - expectError: false expectEvents: - client: "client" @@ -1494,7 +1469,6 @@ tests: name: findOneAndDelete arguments: filter: {} - expectError: false expectEvents: - client: "client" @@ -1551,7 +1525,6 @@ tests: arguments: filter: {} replacement: { x: 22 } - expectError: false expectEvents: - client: "client" @@ -1608,7 +1581,6 @@ tests: arguments: filter: {} update: { $set: { x: 22 } } - expectError: false expectEvents: - client: "client" @@ -1666,7 +1638,6 @@ tests: requests: - insertOne: document: { _id: 2, x: 22 } - expectError: false expectEvents: - client: "client" @@ -1723,7 +1694,6 @@ tests: arguments: keys: { x: 11 } name: "x_11" - expectError: false expectEvents: - client: "client" @@ -1785,7 +1755,6 @@ tests: name: dropIndex arguments: name: "x_11" - expectError: false expectEvents: - client: "client" @@ -1839,7 +1808,6 @@ tests: - object: *collection name: dropIndexes - expectError: false expectEvents: - client: "client" diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml.template b/source/client-backpressure/tests/backpressure-retry-loop.yml.template index fca19f94f7..db3ce498e9 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml.template @@ -108,7 +108,6 @@ tests: {%- if operation.operation_name == "createChangeStream" %} saveResultAsEntity: changeStream {%- endif %} - expectError: false expectEvents: - client: "client" diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.json b/source/client-backpressure/tests/backpressure-retry-max-attempts.json index 
206c86fe51..d11ad9c5c5 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.json +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.json @@ -91,7 +91,9 @@ "arguments": { "filter": {} }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -189,7 +191,9 @@ { "object": "client", "name": "listDatabaseNames", - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -291,7 +295,9 @@ "pipeline": [] }, "saveResultAsEntity": "changeStream", - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -407,7 +413,9 @@ } ] }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -515,7 +523,9 @@ } ] }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -616,7 +626,9 @@ "arguments": { "filter": {} }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -717,7 +729,9 @@ "arguments": { "filter": {} }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -821,7 +835,9 @@ }, "commandName": "ping" }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -923,7 +939,9 @@ "pipeline": [] }, "saveResultAsEntity": "changeStream", - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -1024,7 +1042,9 @@ "arguments": { "pipeline": [] }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -1125,7 +1145,9 @@ "arguments": { "filter": {} }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -1223,7 +1245,9 @@ { "object": "collection", "name": "estimatedDocumentCount", - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -1325,7 +1349,9 @@ "fieldName": "x", "filter": {} }, - "expectError": true + "expectError": { + "isError": true + } 
} ], "expectEvents": [ @@ -1426,7 +1452,9 @@ "arguments": { "filter": {} }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -1527,7 +1555,9 @@ "arguments": { "filter": {} }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -1625,7 +1655,9 @@ { "object": "collection", "name": "listIndexes", - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -1723,7 +1755,9 @@ { "object": "collection", "name": "listIndexNames", - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -1825,7 +1859,9 @@ "pipeline": [] }, "saveResultAsEntity": "changeStream", - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -1929,7 +1965,9 @@ "x": 22 } }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -2035,7 +2073,9 @@ } ] }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -2136,7 +2176,9 @@ "arguments": { "filter": {} }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -2237,7 +2279,9 @@ "arguments": { "filter": {} }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -2341,7 +2385,9 @@ "x": 22 } }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -2447,7 +2493,9 @@ } } }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -2553,7 +2601,9 @@ } } }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -2654,7 +2704,9 @@ "arguments": { "filter": {} }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -2758,7 +2810,9 @@ "x": 22 } }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -2864,7 +2918,9 @@ } } }, - "expectError": true + "expectError": { + "isError": true + } 
} ], "expectEvents": [ @@ -2974,7 +3030,9 @@ } ] }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -3078,7 +3136,9 @@ }, "name": "x_11" }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -3179,7 +3239,9 @@ "arguments": { "name": "x_11" }, - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -3277,7 +3339,9 @@ { "object": "collection", "name": "dropIndexes", - "expectError": true + "expectError": { + "isError": true + } } ], "expectEvents": [ diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml index edcac5eb87..72c9b446b7 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml @@ -62,7 +62,8 @@ tests: name: listDatabases arguments: filter: {} - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -113,7 +114,8 @@ tests: - object: *client name: listDatabaseNames - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -167,7 +169,8 @@ tests: arguments: pipeline: [] saveResultAsEntity: changeStream - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -225,7 +228,8 @@ tests: - insertOne: namespace: retryable-writes-tests.coll document: { _id: 8, x: 88 } - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -278,7 +282,8 @@ tests: name: aggregate arguments: pipeline: [ { $listLocalSessions: {} }, { $limit: 1 } ] - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -331,7 +336,8 @@ tests: name: listCollections arguments: filter: {} - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -384,7 +390,8 @@ tests: name: listCollectionNames arguments: filter: {} - 
expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -438,7 +445,8 @@ tests: arguments: command: { ping: 1 } commandName: ping - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -492,7 +500,8 @@ tests: arguments: pipeline: [] saveResultAsEntity: changeStream - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -545,7 +554,8 @@ tests: name: aggregate arguments: pipeline: [] - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -598,7 +608,8 @@ tests: name: countDocuments arguments: filter: {} - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -649,7 +660,8 @@ tests: - object: *collection name: estimatedDocumentCount - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -703,7 +715,8 @@ tests: arguments: fieldName: x filter: {} - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -756,7 +769,8 @@ tests: name: find arguments: filter: {} - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -809,7 +823,8 @@ tests: name: findOne arguments: filter: {} - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -860,7 +875,8 @@ tests: - object: *collection name: listIndexes - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -911,7 +927,8 @@ tests: - object: *collection name: listIndexNames - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -965,7 +982,8 @@ tests: arguments: pipeline: [] saveResultAsEntity: changeStream - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -1018,7 +1036,8 @@ tests: name: insertOne arguments: document: { _id: 2, x: 22 } - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -1072,7 +1091,8 @@ tests: arguments: 
documents: - { _id: 2, x: 22 } - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -1125,7 +1145,8 @@ tests: name: deleteOne arguments: filter: {} - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -1178,7 +1199,8 @@ tests: name: deleteMany arguments: filter: {} - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -1232,7 +1254,8 @@ tests: arguments: filter: {} replacement: { x: 22 } - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -1286,7 +1309,8 @@ tests: arguments: filter: {} update: { $set: { x: 22 } } - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -1340,7 +1364,8 @@ tests: arguments: filter: {} update: { $set: { x: 22 } } - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -1393,7 +1418,8 @@ tests: name: findOneAndDelete arguments: filter: {} - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -1447,7 +1473,8 @@ tests: arguments: filter: {} replacement: { x: 22 } - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -1501,7 +1528,8 @@ tests: arguments: filter: {} update: { $set: { x: 22 } } - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -1556,7 +1584,8 @@ tests: requests: - insertOne: document: { _id: 2, x: 22 } - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -1610,7 +1639,8 @@ tests: arguments: keys: { x: 11 } name: "x_11" - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -1663,7 +1693,8 @@ tests: name: dropIndex arguments: name: "x_11" - expectError: true + expectError: + isError: true expectEvents: - client: "client" @@ -1714,7 +1745,8 @@ tests: - object: *collection name: dropIndexes - expectError: true + expectError: + isError: true expectEvents: - client: 
"client" diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template index 784fc0bb2a..2920a4064f 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template @@ -73,7 +73,8 @@ tests: {%- if operation.operation_name == "createChangeStream" %} saveResultAsEntity: changeStream {%- endif %} - expectError: true + expectError: + isError: true expectEvents: - client: "client" From b3a7b6c77daa69b01667a8a13cf44cf8e11b41d0 Mon Sep 17 00:00:00 2001 From: bailey Date: Wed, 3 Dec 2025 11:58:29 -0700 Subject: [PATCH 11/55] fix UTR linting failures --- .../tests/backpressure-retry-max-attempts.json | 3 --- .../tests/backpressure-retry-max-attempts.yml | 3 --- .../tests/backpressure-retry-max-attempts.yml.template | 3 --- 3 files changed, 9 deletions(-) diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.json b/source/client-backpressure/tests/backpressure-retry-max-attempts.json index d11ad9c5c5..a499aa490b 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.json +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.json @@ -294,7 +294,6 @@ "arguments": { "pipeline": [] }, - "saveResultAsEntity": "changeStream", "expectError": { "isError": true } @@ -938,7 +937,6 @@ "arguments": { "pipeline": [] }, - "saveResultAsEntity": "changeStream", "expectError": { "isError": true } @@ -1858,7 +1856,6 @@ "arguments": { "pipeline": [] }, - "saveResultAsEntity": "changeStream", "expectError": { "isError": true } diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml index 72c9b446b7..3bd4582757 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml +++ 
b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml @@ -168,7 +168,6 @@ tests: name: createChangeStream arguments: pipeline: [] - saveResultAsEntity: changeStream expectError: isError: true @@ -499,7 +498,6 @@ tests: name: createChangeStream arguments: pipeline: [] - saveResultAsEntity: changeStream expectError: isError: true @@ -981,7 +979,6 @@ tests: name: createChangeStream arguments: pipeline: [] - saveResultAsEntity: changeStream expectError: isError: true diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template index 2920a4064f..4f2cfeee47 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template @@ -70,9 +70,6 @@ tests: {{arg}} {%- endfor -%} {%- endif %} - {%- if operation.operation_name == "createChangeStream" %} - saveResultAsEntity: changeStream - {%- endif %} expectError: isError: true From 60a87b8b2f0d315b822e55c66ed84d413c5fe51d Mon Sep 17 00:00:00 2001 From: bailey Date: Wed, 3 Dec 2025 15:22:28 -0700 Subject: [PATCH 12/55] remove broken deleteMany() from unified tests --- .../tests/backpressure-retry-loop.json | 384 ------------------ .../tests/backpressure-retry-loop.yml | 286 ++----------- .../backpressure-retry-loop.yml.template | 7 - 3 files changed, 31 insertions(+), 646 deletions(-) diff --git a/source/client-backpressure/tests/backpressure-retry-loop.json b/source/client-backpressure/tests/backpressure-retry-loop.json index b121d9bd3d..20bdfe3a69 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.json +++ b/source/client-backpressure/tests/backpressure-retry-loop.json @@ -85,18 +85,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": 
"testRunner", @@ -186,18 +174,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -284,18 +260,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -391,18 +355,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -502,18 +454,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -610,18 +550,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -711,18 +639,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -812,18 +728,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -916,18 +820,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -1018,18 +910,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": 
"testRunner", @@ -1119,18 +999,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -1220,18 +1088,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -1318,18 +1174,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -1420,18 +1264,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -1521,18 +1353,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -1622,18 +1442,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -1720,18 +1528,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -1818,18 +1614,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -1920,18 +1704,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", 
"object": "testRunner", @@ -2024,18 +1796,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -2130,18 +1890,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -2231,18 +1979,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -2332,18 +2068,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -2436,18 +2160,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -2542,18 +2254,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -2648,18 +2348,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -2749,18 +2437,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -2853,18 +2529,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": 
"failPoint", "object": "testRunner", @@ -2959,18 +2623,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -3069,18 +2721,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", @@ -3173,18 +2813,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "object": "utilCollection", "name": "createIndex", @@ -3284,18 +2912,6 @@ "filter": {} } }, - { - "object": "utilCollection", - "name": "deleteMany", - "arguments": { - "documents": [ - { - "_id": 1, - "x": 11 - } - ] - } - }, { "name": "failPoint", "object": "testRunner", diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml b/source/client-backpressure/tests/backpressure-retry-loop.yml index 08d6072edd..a6498568f1 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml @@ -61,14 +61,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -116,14 +109,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -169,14 +155,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -227,14 +206,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: 
{} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -285,14 +257,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -340,14 +305,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -395,14 +353,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -450,14 +401,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -506,14 +450,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -562,14 +499,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -617,14 +547,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -672,14 +595,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -725,14 +641,7 @@ tests: object: *utilCollection name: deleteMany arguments: - 
filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -781,14 +690,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -836,14 +738,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -891,14 +786,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -944,14 +832,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -997,14 +878,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -1053,14 +927,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -1108,14 +975,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -1164,14 +1024,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -1219,14 +1072,7 @@ tests: object: *utilCollection name: deleteMany 
arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -1274,14 +1120,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -1330,14 +1169,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -1386,14 +1218,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -1442,14 +1267,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -1497,14 +1315,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -1553,14 +1364,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -1609,14 +1413,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -1666,14 +1463,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint @@ -1723,13 +1513,6 @@ tests: name: 
deleteMany arguments: filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } - object: *utilCollection name: createIndex @@ -1783,14 +1566,7 @@ tests: object: *utilCollection name: deleteMany arguments: - filter: {} - - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } + filter: {} - name: failPoint diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml.template b/source/client-backpressure/tests/backpressure-retry-loop.yml.template index db3ce498e9..53b8ba3cf6 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml.template @@ -67,13 +67,6 @@ tests: arguments: filter: {} - - - object: *utilCollection - name: deleteMany - arguments: - documents: - - { _id: 1, x: 11 } - {%- if operation.operation_name == "dropIndex" %} - object: *utilCollection From 399a56b72fd17a1948000bbefb548b9097cbf61c Mon Sep 17 00:00:00 2001 From: bailey Date: Wed, 10 Dec 2025 10:34:20 -0700 Subject: [PATCH 13/55] add backwards compat section --- source/client-backpressure/client-backpressure.md | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index 39613c224d..9865546c4e 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -399,7 +399,18 @@ attempt. Note that the second `CommandStartedEvent` and "command started" log me ### Backwards Compatibility -TODO +The server's rate limiting can introduce higher error rates than previously would have been exposed to users under +periods of extreme server overload. 
The increased error rates is a tradeoff: given the choice between an overloaded +server (potential crash), or at minimum dramatically slower query execution time and a stable but lowered throughput +with higher error rate as the server load sheds, we have chosen the latter. + +The changes in this specification help smooth out the impact of the server's rate limiting on users by reducing the +number of errors users see during spikes or burst workloads and help prevent retry storms by spacing out retries. +However, older drivers do not have this benefit. Drivers MUST document that: + +- Users SHOULD upgrade to driver versions that officially support backpressure to avoid any impacts of server changes. +- Users who do not upgrade might see increased might need to update application error handling to handle higher error + rates of SystemOverloadedErrors. ## Test Plan From 0545e15f47540c9466689317372a22fd9b947b7d Mon Sep 17 00:00:00 2001 From: bailey Date: Thu, 11 Dec 2025 11:24:28 -0700 Subject: [PATCH 14/55] Jeff's and Jib's comments --- source/client-backpressure/client-backpressure.md | 15 +++++++-------- source/client-backpressure/tests/README.md | 2 +- 2 files changed, 8 insertions(+), 9 deletions(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index 9865546c4e..0f9f431f78 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -92,14 +92,15 @@ the following rules: - The value of `MAX_ATTEMPTS` is 5 and non-configurable. - This intentionally changes the behavior of CSOT which otherwise would retry an unlimited number of times within the timeout to avoid retry storms. -5. If the previous error includes the `SystemOverloadedError` label, the client MUST apply exponential backoff according - to according to the following formula: `delayMS = j * min(maxBackoff, baseBackoff * 2^i)` +5. 
If a retry attempt is to be attempted, a token will be consumed from the token bucket. +6. If the previous error includes the `SystemOverloadedError` label, the client MUST apply exponential backoff according + to the following formula: `delayMS = j * min(maxBackoff, baseBackoff * 2^i)` - `i` is the retry attempt (starting with 0 for the first retry). - `j` is a random jitter value between 0 and 1. - `baseBackoff` is constant 100ms. - `maxBackoff` is 10000ms. - This results in delays of 100ms, 200ms, 400ms, 800ms, and 1600ms before accounting for jitter. -6. If the previous error contained the `SystemOverloadedError` error label, the node will be added to the set of +7. If the previous error contained the `SystemOverloadedError` error label, the node will be added to the set of deprioritized servers. #### Pseudocode @@ -213,6 +214,8 @@ async function tryOperation= 2.1) ``` The sum of 5 backoffs is 3.1 seconds. There is a 1-second window to account for potential variance between the two runs. From ff5475aada6a6c10ca15337ef9cc1b241899fd81 Mon Sep 17 00:00:00 2001 From: bailey Date: Thu, 11 Dec 2025 13:19:57 -0700 Subject: [PATCH 15/55] adjust backpressure spec phrasing to make it clear retryable errors are handled separately --- .../client-backpressure.md | 74 +++++++++++++------ 1 file changed, 51 insertions(+), 23 deletions(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index 0f9f431f78..1c0ba0a999 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -40,6 +40,9 @@ the connection and request rate limiters to prevent and mitigate overloading the An error is considered retryable if it includes the "RetryableError" label. This error label indicates that an operation is safely retryable regardless of the type of operation, its metadata, or any of its arguments. 
+Note that for the initial draft of the spec, only errors that have both the RetryableError label and the +SystemOverloadedError label are eligible for the retry backoff loop. + #### SystemOverloadedError label An error is considered overloaded if it includes the "SystemOverloadError" label. This error label indicates that the @@ -67,6 +70,9 @@ closed"). When a new connection attempt is queued by the server for so long that the driver-side timeout expires, drivers will observe this as a network timeout error. +Note that there is no guarantee that all SystemOverloaded errors are retryable or that all RetryableErrors also have the +SystemOverloaded error label. + #### Goodput The throughput of positive, useful output. In the context of drivers, this refers to the number of non-error results @@ -78,30 +84,38 @@ See [goodput](https://en.wikipedia.org/wiki/Goodput). #### Overload retry policy -This specification expands the driver's retry ability to all commands, including those not currently considered -retryable such as updateMany, create collection, getMore, and generic runCommand. The new command execution method obeys -the following rules: +This specification expands the driver's retry ability to all commands if the error indicates that is both an overload +error and that it is retryable, including those not currently considered retryable such as updateMany, create +collection, getMore, and generic runCommand. The new command execution method obeys the following rules: 1. If the command succeeds on the first attempt, drivers MUST deposit `RETRY_TOKEN_RETURN_RATE` tokens. - The value is 0.1 and non-configurable. 2. If the command succeeds on a retry attempt, drivers MUST deposit `RETRY_TOKEN_RETURN_RATE`+1 tokens. 3. If a retry attempt fails with an error that does not include `SystemOverloadedError` label, drivers MUST deposit 1 token. -4. 
A retry attempt will only be permitted if the error includes the `RetryableError` label, we have not reached - `MAX_ATTEMPTS`, the CSOT deadline has not expired, and a token can be acquired from the token bucket. + - A non-SystemOverloaded error indicates that the server is healthy enough to handle requests. For the purposes of + retry budget tracking, this counts as a success. +4. A retry attempt will only be permitted if the error includes the `RetryableError` label, the error has a + `SystemOverloadedError` label, we have not reached `MAX_ATTEMPTS`, the CSOT deadline has not expired, and a token + can be acquired from the token bucket. - The value of `MAX_ATTEMPTS` is 5 and non-configurable. - This intentionally changes the behavior of CSOT which otherwise would retry an unlimited number of times within the timeout to avoid retry storms. + - Note: Future work will add support for RetryableErrors to regular retryability logic (see the future work section). 5. If a retry attempt is to be attempted, a token will be consumed from the token bucket. -6. If the previous error includes the `SystemOverloadedError` label, the client MUST apply exponential backoff according - to the following formula: `delayMS = j * min(maxBackoff, baseBackoff * 2^i)` - - `i` is the retry attempt (starting with 0 for the first retry). +6. If the request is eligible for retry (as outlined in step 4), the client MUST apply exponential backoff according to + the following formula: `delayMS = j * min(maxBackoff, baseBackoff * 2^i)` + - `i` is the retry attempt number (starting with 0 for the first retry). - `j` is a random jitter value between 0 and 1. - `baseBackoff` is constant 100ms. - `maxBackoff` is 10000ms. - This results in delays of 100ms, 200ms, 400ms, 800ms, and 1600ms before accounting for jitter. -7. If the previous error contained the `SystemOverloadedError` error label, the node will be added to the set of - deprioritized servers. +7. 
If the request is eligible for retry (as outlined in step 4), the client MUST add the server's address to the list of + deprioritized server address for server selection. This behavior is the same as existing behavior for retryable + reads and writes. + +Note: drivers MUST share deprioritized servers between retries used for the exponential backoff loop and regular +retryable reads and writes. #### Pseudocode @@ -111,6 +125,7 @@ The following pseudocode describes the overload retry policy: BASE_BACKOFF = 0.1 MAX_BACKOFF = 10 RETRY_TOKEN_RETURN_RATE = 0.1 +MAX_ATTEMPTS = 5 def execute_command_retryable(command, ...): deprioritized_servers = [] @@ -127,19 +142,21 @@ def execute_command_retryable(command, ...): token_bucket.deposit(tokens) return res except PyMongoError as exc: - backoff = 0 + # if a retry fails with a non-System overloaded error, deposit 1 token + if attempt > 0 and not exc.has_error_label("SystemOverloadedError"): + tokens += 1 + attempt += 1 - if attempt > MAX_ATTEMPTS: + if attempt >= MAX_ATTEMPTS: raise - # Raise if the error is non retryable. - is_retryable = exc.has_error_label("RetryableError") or is_retryable_write_error() or is_retryable_read_error() - if not is_retryable: - raise error - if exc.has_error_label("SystemOverloadedError"): - jitter = random.random() # Random float between [0.0, 1.0). - backoff = jitter * min(BASE_BACKOFF * (2 ** attempt), MAX_BACKOFF) + # Raise if the error if non retryable. + if exc.has_error_label("RetryableError") and exc.has_error_label("SystemOverloadedError"): + raise + + jitter = random.random() # Random float between [0.0, 1.0). + backoff = jitter * min(BASE_BACKOFF * (2 ** attempt), MAX_BACKOFF) # If the delay exceeds the deadline, bail early before consuming a token. 
if _csot.get_timeout(): @@ -151,6 +168,7 @@ def execute_command_retryable(command, ...): if backoff: time.sleep(backoff) + deprioritized_servers.append(server) continue ``` @@ -161,6 +179,7 @@ that demonstrates a combined retryable reads/writes implementation with the corr from the Node driver's implementation): ```typescript +// TODO: update pseudocode with updated implementation async function tryOperation>( operation: T, { topology, timeoutContext, session, readPreference }: RetryOptions @@ -327,10 +346,10 @@ async function tryOperation Date: Thu, 11 Dec 2025 13:51:21 -0700 Subject: [PATCH 16/55] squash: jeremy's casing comments --- .../tests/backpressure-retry-loop.json | 156 ++++++------- .../tests/backpressure-retry-loop.yml | 217 +++++++++--------- .../backpressure-retry-loop.yml.template | 31 +-- 3 files changed, 187 insertions(+), 217 deletions(-) diff --git a/source/client-backpressure/tests/backpressure-retry-loop.json b/source/client-backpressure/tests/backpressure-retry-loop.json index 20bdfe3a69..4ce96b2c28 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.json +++ b/source/client-backpressure/tests/backpressure-retry-loop.json @@ -25,21 +25,21 @@ }, { "client": { - "id": "failPointClient", + "id": "internal_client", "useMultipleMongoses": false } }, { "database": { - "id": "utilDb", - "client": "failPointClient", + "id": "database", + "client": "internal_client", "databaseName": "retryable-writes-tests" } }, { "collection": { - "id": "utilCollection", - "database": "utilDb", + "id": "retryable-writes-tests", + "database": "database", "collectionName": "coll" } }, @@ -58,28 +58,12 @@ } } ], - "initialData": [ - { - "collectionName": "coll", - "databaseName": "retryable-writes-tests", - "documents": [ - { - "_id": 1, - "x": 11 - }, - { - "_id": 2, - "x": 22 - } - ] - } - ], "tests": [ { "description": "client.listDatabases retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": 
"retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -89,7 +73,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -168,7 +152,7 @@ "description": "client.listDatabaseNames retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -178,7 +162,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -254,7 +238,7 @@ "description": "client.createChangeStream retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -264,7 +248,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -349,7 +333,7 @@ ], "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -359,7 +343,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -448,7 +432,7 @@ "description": "database.aggregate retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -458,7 +442,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -544,7 +528,7 @@ "description": "database.listCollections 
retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -554,7 +538,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -633,7 +617,7 @@ "description": "database.listCollectionNames retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -643,7 +627,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -722,7 +706,7 @@ "description": "database.runCommand retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -732,7 +716,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -814,7 +798,7 @@ "description": "database.createChangeStream retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -824,7 +808,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -904,7 +888,7 @@ "description": "collection.aggregate retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -914,7 +898,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": 
"failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -993,7 +977,7 @@ "description": "collection.countDocuments retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -1003,7 +987,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -1082,7 +1066,7 @@ "description": "collection.estimatedDocumentCount retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -1092,7 +1076,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -1168,7 +1152,7 @@ "description": "collection.distinct retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -1178,7 +1162,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -1258,7 +1242,7 @@ "description": "collection.find retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -1268,7 +1252,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -1347,7 +1331,7 @@ "description": "collection.findOne retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": 
"retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -1357,7 +1341,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -1436,7 +1420,7 @@ "description": "collection.listIndexes retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -1446,7 +1430,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -1522,7 +1506,7 @@ "description": "collection.listIndexNames retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -1532,7 +1516,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -1608,7 +1592,7 @@ "description": "collection.createChangeStream retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -1618,7 +1602,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -1698,7 +1682,7 @@ "description": "collection.insertOne retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -1708,7 +1692,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { 
"configureFailPoint": "failCommand", "mode": { @@ -1790,7 +1774,7 @@ "description": "collection.insertMany retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -1800,7 +1784,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -1884,7 +1868,7 @@ "description": "collection.deleteOne retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -1894,7 +1878,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -1973,7 +1957,7 @@ "description": "collection.deleteMany retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -1983,7 +1967,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -2062,7 +2046,7 @@ "description": "collection.replaceOne retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -2072,7 +2056,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -2154,7 +2138,7 @@ "description": "collection.updateOne retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { 
"filter": {} @@ -2164,7 +2148,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -2248,7 +2232,7 @@ "description": "collection.updateMany retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -2258,7 +2242,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -2342,7 +2326,7 @@ "description": "collection.findOneAndDelete retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -2352,7 +2336,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -2431,7 +2415,7 @@ "description": "collection.findOneAndReplace retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -2441,7 +2425,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -2523,7 +2507,7 @@ "description": "collection.findOneAndUpdate retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -2533,7 +2517,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -2617,7 +2601,7 @@ 
"description": "collection.bulkWrite retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -2627,7 +2611,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -2715,7 +2699,7 @@ "description": "collection.createIndex retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -2725,7 +2709,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -2807,14 +2791,14 @@ "description": "collection.dropIndex retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} } }, { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "createIndex", "arguments": { "keys": { @@ -2827,7 +2811,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { @@ -2906,7 +2890,7 @@ "description": "collection.dropIndexes retries using operation loop", "operations": [ { - "object": "utilCollection", + "object": "retryable-writes-tests", "name": "deleteMany", "arguments": { "filter": {} @@ -2916,7 +2900,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "internal_client", "failPoint": { "configureFailPoint": "failCommand", "mode": { diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml b/source/client-backpressure/tests/backpressure-retry-loop.yml index 
a6498568f1..047e2b61ff 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml @@ -14,23 +14,23 @@ createEntities: client: id: &client client useMultipleMongoses: false - observeEvents: [ 'commandStartedEvent', 'commandSucceededEvent', 'commandFailedEvent' ] + observeEvents: [commandStartedEvent, commandSucceededEvent, commandFailedEvent] - client: - id: &failPointClient failPointClient + id: &internal_client internal_client useMultipleMongoses: false - database: - id: &utilDb utilDb - client: *failPointClient + id: &internal_db database + client: *internal_client databaseName: &database_name retryable-writes-tests - collection: - id: &utilCollection utilCollection - database: *utilDb + id: &internal_collection retryable-writes-tests + database: *internal_db collectionName: &collection_name coll - @@ -38,19 +38,12 @@ createEntities: id: &database database client: *client databaseName: &database_name retryable-writes-tests + - collection: id: &collection collection database: *database - collectionName: &collection_name coll - -initialData: - - - collectionName: *collection_name - databaseName: *database_name - documents: - - { _id: 1, x: 11 } - - { _id: 2, x: 22 } + collectionName: *collection_name tests: @@ -58,7 +51,7 @@ tests: description: 'client.listDatabases retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -67,13 +60,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [listDatabases] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -106,7 +99,7 @@ tests: description: 'client.listDatabaseNames retries using operation loop' operations: - - object: *utilCollection 
+ object: *internal_collection name: deleteMany arguments: filter: {} @@ -115,13 +108,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [listDatabases] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -152,7 +145,7 @@ tests: description: 'client.createChangeStream retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -161,13 +154,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [aggregate] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -203,7 +196,7 @@ tests: - minServerVersion: '8.0' # client bulk write added to server in 8.0 operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -212,13 +205,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [bulkWrite] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -254,7 +247,7 @@ tests: description: 'database.aggregate retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -263,13 +256,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [aggregate] - errorLabels: 
["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -302,7 +295,7 @@ tests: description: 'database.listCollections retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -311,13 +304,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [listCollections] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -350,7 +343,7 @@ tests: description: 'database.listCollectionNames retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -359,13 +352,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [listCollections] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -398,7 +391,7 @@ tests: description: 'database.runCommand retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -407,13 +400,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [ping] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -447,7 +440,7 @@ tests: description: 'database.createChangeStream retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: 
filter: {} @@ -456,13 +449,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [aggregate] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -496,7 +489,7 @@ tests: description: 'collection.aggregate retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -505,13 +498,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [aggregate] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -544,7 +537,7 @@ tests: description: 'collection.countDocuments retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -553,13 +546,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [aggregate] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -592,7 +585,7 @@ tests: description: 'collection.estimatedDocumentCount retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -601,13 +594,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [count] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: 
[RetryableError, SystemOverloadedError] errorCode: 2 - @@ -638,7 +631,7 @@ tests: description: 'collection.distinct retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -647,13 +640,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [distinct] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -687,7 +680,7 @@ tests: description: 'collection.find retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -696,13 +689,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [find] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -735,7 +728,7 @@ tests: description: 'collection.findOne retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -744,13 +737,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [find] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -783,7 +776,7 @@ tests: description: 'collection.listIndexes retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -792,13 +785,13 @@ tests: - name: failPoint object: testRunner arguments: - client: 
*failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [listIndexes] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -829,7 +822,7 @@ tests: description: 'collection.listIndexNames retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -838,13 +831,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [listIndexes] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -875,7 +868,7 @@ tests: description: 'collection.createChangeStream retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -884,13 +877,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [aggregate] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -924,7 +917,7 @@ tests: description: 'collection.insertOne retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -933,13 +926,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [insert] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -972,7 +965,7 @@ tests: description: 
'collection.insertMany retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -981,13 +974,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [insert] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1021,7 +1014,7 @@ tests: description: 'collection.deleteOne retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -1030,13 +1023,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [delete] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1069,7 +1062,7 @@ tests: description: 'collection.deleteMany retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -1078,13 +1071,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [delete] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1117,7 +1110,7 @@ tests: description: 'collection.replaceOne retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -1126,13 +1119,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: 
failCommand mode: { times: 3 } data: failCommands: [update] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1166,7 +1159,7 @@ tests: description: 'collection.updateOne retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -1175,13 +1168,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [update] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1215,7 +1208,7 @@ tests: description: 'collection.updateMany retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -1224,13 +1217,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [update] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1264,7 +1257,7 @@ tests: description: 'collection.findOneAndDelete retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -1273,13 +1266,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [findAndModify] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1312,7 +1305,7 @@ tests: description: 'collection.findOneAndReplace retries using operation loop' operations: - - 
object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -1321,13 +1314,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [findAndModify] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1361,7 +1354,7 @@ tests: description: 'collection.findOneAndUpdate retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -1370,13 +1363,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [findAndModify] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1410,7 +1403,7 @@ tests: description: 'collection.bulkWrite retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -1419,13 +1412,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [insert] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1460,7 +1453,7 @@ tests: description: 'collection.createIndex retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -1469,13 +1462,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: 
failCommands: [createIndexes] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1509,12 +1502,12 @@ tests: description: 'collection.dropIndex retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} - - object: *utilCollection + object: *internal_collection name: createIndex arguments: keys: { x: 11 } @@ -1524,13 +1517,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [dropIndexes] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1563,7 +1556,7 @@ tests: description: 'collection.dropIndexes retries using operation loop' operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} @@ -1572,13 +1565,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [dropIndexes] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml.template b/source/client-backpressure/tests/backpressure-retry-loop.yml.template index 53b8ba3cf6..d2f26291d3 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml.template @@ -14,23 +14,23 @@ createEntities: client: id: &client client useMultipleMongoses: false - observeEvents: [ 'commandStartedEvent', 'commandSucceededEvent', 'commandFailedEvent' ] + observeEvents: [commandStartedEvent, commandSucceededEvent, commandFailedEvent] - 
client: - id: &failPointClient failPointClient + id: &internal_client internal_client useMultipleMongoses: false - database: - id: &utilDb utilDb - client: *failPointClient + id: &internal_db database + client: *internal_client databaseName: &database_name retryable-writes-tests - collection: - id: &utilCollection utilCollection - database: *utilDb + id: &internal_collection retryable-writes-tests + database: *internal_db collectionName: &collection_name coll - @@ -38,19 +38,12 @@ createEntities: id: &database database client: *client databaseName: &database_name retryable-writes-tests + - collection: id: &collection collection database: *database - collectionName: &collection_name coll - -initialData: - - - collectionName: *collection_name - databaseName: *database_name - documents: - - { _id: 1, x: 11 } - - { _id: 2, x: 22 } + collectionName: *collection_name tests: {% for operation in operations %} @@ -62,14 +55,14 @@ tests: {%- endif %} operations: - - object: *utilCollection + object: *internal_collection name: deleteMany arguments: filter: {} {%- if operation.operation_name == "dropIndex" %} - - object: *utilCollection + object: *internal_collection name: createIndex arguments: keys: { x: 11 } @@ -80,13 +73,13 @@ tests: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *internal_client failPoint: configureFailPoint: failCommand mode: { times: 3 } data: failCommands: [{{operation.command_name}}] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - From c1001bcd60c5ba0c10c4a859f5928d7319f8fb47 Mon Sep 17 00:00:00 2001 From: bailey Date: Thu, 11 Dec 2025 13:53:59 -0700 Subject: [PATCH 17/55] squash: other comments --- .../tests/backpressure-retry-loop.json | 9 +++------ .../tests/backpressure-retry-loop.yml | 3 --- .../tests/backpressure-retry-loop.yml.template | 3 --- 3 files changed, 3 insertions(+), 12 deletions(-) diff --git 
a/source/client-backpressure/tests/backpressure-retry-loop.json b/source/client-backpressure/tests/backpressure-retry-loop.json index 4ce96b2c28..79cfc4bac7 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.json +++ b/source/client-backpressure/tests/backpressure-retry-loop.json @@ -272,8 +272,7 @@ "name": "createChangeStream", "arguments": { "pipeline": [] - }, - "saveResultAsEntity": "changeStream" + } } ], "expectEvents": [ @@ -832,8 +831,7 @@ "name": "createChangeStream", "arguments": { "pipeline": [] - }, - "saveResultAsEntity": "changeStream" + } } ], "expectEvents": [ @@ -1626,8 +1624,7 @@ "name": "createChangeStream", "arguments": { "pipeline": [] - }, - "saveResultAsEntity": "changeStream" + } } ], "expectEvents": [ diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml b/source/client-backpressure/tests/backpressure-retry-loop.yml index 047e2b61ff..538ee413e2 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml @@ -168,7 +168,6 @@ tests: name: createChangeStream arguments: pipeline: [] - saveResultAsEntity: changeStream expectEvents: - client: "client" @@ -463,7 +462,6 @@ tests: name: createChangeStream arguments: pipeline: [] - saveResultAsEntity: changeStream expectEvents: - client: "client" @@ -891,7 +889,6 @@ tests: name: createChangeStream arguments: pipeline: [] - saveResultAsEntity: changeStream expectEvents: - client: "client" diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml.template b/source/client-backpressure/tests/backpressure-retry-loop.yml.template index d2f26291d3..4d80753b04 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml.template @@ -91,9 +91,6 @@ tests: {{arg}} {%- endfor -%} {%- endif %} - {%- if operation.operation_name == "createChangeStream" %} - saveResultAsEntity: changeStream - {%- 
endif %} expectEvents: - client: "client" From def5fbd0af3aee58f700a83dd87474d94bbb96a9 Mon Sep 17 00:00:00 2001 From: bailey Date: Thu, 11 Dec 2025 13:55:02 -0700 Subject: [PATCH 18/55] squash: other comments --- .../tests/backpressure-retry-loop.yml | 64 +++++++++---------- .../backpressure-retry-loop.yml.template | 2 +- 2 files changed, 33 insertions(+), 33 deletions(-) diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml b/source/client-backpressure/tests/backpressure-retry-loop.yml index 538ee413e2..a566949453 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml @@ -76,7 +76,7 @@ tests: filter: {} expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: listDatabases @@ -122,7 +122,7 @@ tests: name: listDatabaseNames expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: listDatabases @@ -170,7 +170,7 @@ tests: pipeline: [] expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: aggregate @@ -223,7 +223,7 @@ tests: document: { _id: 8, x: 88 } expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: bulkWrite @@ -271,7 +271,7 @@ tests: pipeline: [ { $listLocalSessions: {} }, { $limit: 1 } ] expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: aggregate @@ -319,7 +319,7 @@ tests: filter: {} expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: listCollections @@ -367,7 +367,7 @@ tests: filter: {} expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: listCollections @@ -416,7 +416,7 @@ tests: commandName: ping expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: ping @@ -464,7 +464,7 @@ tests: pipeline: [] 
expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: aggregate @@ -512,7 +512,7 @@ tests: pipeline: [] expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: aggregate @@ -560,7 +560,7 @@ tests: filter: {} expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: aggregate @@ -606,7 +606,7 @@ tests: name: estimatedDocumentCount expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: count @@ -655,7 +655,7 @@ tests: filter: {} expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: distinct @@ -703,7 +703,7 @@ tests: filter: {} expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: find @@ -751,7 +751,7 @@ tests: filter: {} expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: find @@ -797,7 +797,7 @@ tests: name: listIndexes expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: listIndexes @@ -843,7 +843,7 @@ tests: name: listIndexNames expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: listIndexes @@ -891,7 +891,7 @@ tests: pipeline: [] expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: aggregate @@ -939,7 +939,7 @@ tests: document: { _id: 2, x: 22 } expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: insert @@ -988,7 +988,7 @@ tests: - { _id: 2, x: 22 } expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: insert @@ -1036,7 +1036,7 @@ tests: filter: {} expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: delete @@ -1084,7 +1084,7 @@ tests: filter: {} expectEvents: - - client: "client" + - 
client: *client events: - commandStartedEvent: commandName: delete @@ -1133,7 +1133,7 @@ tests: replacement: { x: 22 } expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: update @@ -1182,7 +1182,7 @@ tests: update: { $set: { x: 22 } } expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: update @@ -1231,7 +1231,7 @@ tests: update: { $set: { x: 22 } } expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: update @@ -1279,7 +1279,7 @@ tests: filter: {} expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: findAndModify @@ -1328,7 +1328,7 @@ tests: replacement: { x: 22 } expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: findAndModify @@ -1377,7 +1377,7 @@ tests: update: { $set: { x: 22 } } expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: findAndModify @@ -1427,7 +1427,7 @@ tests: document: { _id: 2, x: 22 } expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: insert @@ -1476,7 +1476,7 @@ tests: name: "x_11" expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: createIndexes @@ -1530,7 +1530,7 @@ tests: name: "x_11" expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: dropIndexes @@ -1576,7 +1576,7 @@ tests: name: dropIndexes expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: dropIndexes diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml.template b/source/client-backpressure/tests/backpressure-retry-loop.yml.template index 4d80753b04..ac47783e53 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml.template +++ 
b/source/client-backpressure/tests/backpressure-retry-loop.yml.template @@ -93,7 +93,7 @@ tests: {%- endif %} expectEvents: - - client: "client" + - client: *client events: - commandStartedEvent: commandName: {{operation.command_name}} From 08da5c4fa872180a56c4fa21fa13d869fffd89d8 Mon Sep 17 00:00:00 2001 From: bailey Date: Thu, 11 Dec 2025 14:13:26 -0700 Subject: [PATCH 19/55] last round comments --- source/client-backpressure/tests/README.md | 12 +- .../backpressure-retry-max-attempts.json | 162 +++++--- .../tests/backpressure-retry-max-attempts.yml | 388 ++++++++++-------- ...ckpressure-retry-max-attempts.yml.template | 16 +- 4 files changed, 338 insertions(+), 240 deletions(-) diff --git a/source/client-backpressure/tests/README.md b/source/client-backpressure/tests/README.md index 41ad017c6c..b4b03085e6 100644 --- a/source/client-backpressure/tests/README.md +++ b/source/client-backpressure/tests/README.md @@ -16,8 +16,8 @@ be manually implemented by each driver. Drivers should test that retries do not occur immediately when a SystemOverloadedError is encountered. -1. let `client` be a `MongoClient` -2. let `collection` be a collection +1. Let `client` be a `MongoClient` +2. Let `collection` be a collection 3. Now, run transactions without backoff: 1. Configure the random number generator used for jitter to always return `0` -- this effectively disables backoff. @@ -28,14 +28,14 @@ Drivers should test that retries do not occur immediately when a SystemOverloade configureFailPoint: 'failCommand', mode: 'alwaysOn', data: { - failCommands: ['insert'], - errorCode: 2, - errorLabels: ['SystemOverloadedError', 'RetryableError'] + failCommands: ['insert'], + errorCode: 2, + errorLabels: ['SystemOverloadedError', 'RetryableError'] } } ``` - 3. Execute the following command. Expect that the command errors. Measure the duration of the command execution. + 3. Insert the document `{ a: 1 }`. Expect that the command errors. Measure the duration of the command execution.
```javascript const start = performance.now(); diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.json b/source/client-backpressure/tests/backpressure-retry-max-attempts.json index a499aa490b..0cd52e9a53 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.json +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.json @@ -25,7 +25,7 @@ }, { "client": { - "id": "failPointClient", + "id": "fail_point_client", "useMultipleMongoses": false } }, @@ -68,7 +68,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -92,7 +92,8 @@ "filter": {} }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -171,7 +172,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -192,7 +193,8 @@ "object": "client", "name": "listDatabaseNames", "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -271,7 +273,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -295,7 +297,8 @@ "pipeline": [] }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -379,7 +382,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -413,7 +416,8 @@ ] }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -492,7 +496,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + 
"client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -523,7 +527,8 @@ ] }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -602,7 +607,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -626,7 +631,8 @@ "filter": {} }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -705,7 +711,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -729,7 +735,8 @@ "filter": {} }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -808,7 +815,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -835,7 +842,8 @@ "commandName": "ping" }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -914,7 +922,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -938,7 +946,8 @@ "pipeline": [] }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -1017,7 +1026,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -1041,7 +1050,8 @@ "pipeline": [] }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -1120,7 +1130,7 @@ "name": "failPoint", "object": "testRunner", 
"arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -1144,7 +1154,8 @@ "filter": {} }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -1223,7 +1234,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -1244,7 +1255,8 @@ "object": "collection", "name": "estimatedDocumentCount", "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -1323,7 +1335,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -1348,7 +1360,8 @@ "filter": {} }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -1427,7 +1440,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -1451,7 +1464,8 @@ "filter": {} }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -1530,7 +1544,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -1554,7 +1568,8 @@ "filter": {} }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -1633,7 +1648,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -1654,7 +1669,8 @@ "object": "collection", "name": "listIndexes", "expectError": { - 
"isError": true + "isError": true, + "isClientError": false } } ], @@ -1733,7 +1749,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -1754,7 +1770,8 @@ "object": "collection", "name": "listIndexNames", "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -1833,7 +1850,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -1857,7 +1874,8 @@ "pipeline": [] }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -1936,7 +1954,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -1963,7 +1981,8 @@ } }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -2042,7 +2061,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -2071,7 +2090,8 @@ ] }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -2150,7 +2170,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -2174,7 +2194,8 @@ "filter": {} }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -2253,7 +2274,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": 
"alwaysOn", @@ -2277,7 +2298,8 @@ "filter": {} }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -2356,7 +2378,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -2383,7 +2405,8 @@ } }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -2462,7 +2485,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -2491,7 +2514,8 @@ } }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -2570,7 +2594,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -2599,7 +2623,8 @@ } }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -2678,7 +2703,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -2702,7 +2727,8 @@ "filter": {} }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -2781,7 +2807,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -2808,7 +2834,8 @@ } }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -2887,7 +2914,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", 
"mode": "alwaysOn", @@ -2916,7 +2943,8 @@ } }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -2995,7 +3023,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -3028,7 +3056,8 @@ ] }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -3107,7 +3136,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -3134,7 +3163,8 @@ "name": "x_11" }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -3213,7 +3243,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -3237,7 +3267,8 @@ "name": "x_11" }, "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], @@ -3316,7 +3347,7 @@ "name": "failPoint", "object": "testRunner", "arguments": { - "client": "failPointClient", + "client": "fail_point_client", "failPoint": { "configureFailPoint": "failCommand", "mode": "alwaysOn", @@ -3337,7 +3368,8 @@ "object": "collection", "name": "dropIndexes", "expectError": { - "isError": true + "isError": true, + "isClientError": false } } ], diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml index 3bd4582757..05da92d671 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml @@ -14,11 +14,11 @@ createEntities: client: id: &client client useMultipleMongoses: false - observeEvents: [ 'commandStartedEvent', 
'commandSucceededEvent', 'commandFailedEvent' ] + observeEvents: [commandStartedEvent, commandSucceededEvent, commandFailedEvent] - client: - id: &failPointClient failPointClient + id: &fail_point_client fail_point_client useMultipleMongoses: false - @@ -44,17 +44,18 @@ tests: - description: 'client.listDatabases retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [listDatabases] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -64,12 +65,13 @@ tests: filter: {} expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. - commandStartedEvent: commandName: listDatabases - commandFailedEvent: @@ -98,17 +100,18 @@ tests: - description: 'client.listDatabaseNames retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [listDatabases] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -116,12 +119,13 @@ tests: name: listDatabaseNames expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. 
- commandStartedEvent: commandName: listDatabases - commandFailedEvent: @@ -150,17 +154,18 @@ tests: - description: 'client.createChangeStream retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [aggregate] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -170,12 +175,13 @@ tests: pipeline: [] expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. - commandStartedEvent: commandName: aggregate - commandFailedEvent: @@ -206,17 +212,18 @@ tests: description: 'client.clientBulkWrite retries at most maxAttempts=5 times' runOnRequirements: - minServerVersion: '8.0' # client bulk write added to server in 8.0 + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [bulkWrite] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -229,12 +236,13 @@ tests: document: { _id: 8, x: 88 } expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. 
- commandStartedEvent: commandName: bulkWrite - commandFailedEvent: @@ -263,17 +271,18 @@ tests: - description: 'database.aggregate retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [aggregate] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -283,12 +292,13 @@ tests: pipeline: [ { $listLocalSessions: {} }, { $limit: 1 } ] expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. - commandStartedEvent: commandName: aggregate - commandFailedEvent: @@ -317,17 +327,18 @@ tests: - description: 'database.listCollections retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [listCollections] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -337,12 +348,13 @@ tests: filter: {} expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. 
- commandStartedEvent: commandName: listCollections - commandFailedEvent: @@ -371,17 +383,18 @@ tests: - description: 'database.listCollectionNames retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [listCollections] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -391,12 +404,13 @@ tests: filter: {} expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. - commandStartedEvent: commandName: listCollections - commandFailedEvent: @@ -425,17 +439,18 @@ tests: - description: 'database.runCommand retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [ping] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -446,12 +461,13 @@ tests: commandName: ping expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. 
- commandStartedEvent: commandName: ping - commandFailedEvent: @@ -480,17 +496,18 @@ tests: - description: 'database.createChangeStream retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [aggregate] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -500,12 +517,13 @@ tests: pipeline: [] expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. - commandStartedEvent: commandName: aggregate - commandFailedEvent: @@ -534,17 +552,18 @@ tests: - description: 'collection.aggregate retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [aggregate] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -554,12 +573,13 @@ tests: pipeline: [] expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. 
- commandStartedEvent: commandName: aggregate - commandFailedEvent: @@ -588,17 +608,18 @@ tests: - description: 'collection.countDocuments retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [aggregate] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -608,12 +629,13 @@ tests: filter: {} expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. - commandStartedEvent: commandName: aggregate - commandFailedEvent: @@ -642,17 +664,18 @@ tests: - description: 'collection.estimatedDocumentCount retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [count] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -660,12 +683,13 @@ tests: name: estimatedDocumentCount expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. 
- commandStartedEvent: commandName: count - commandFailedEvent: @@ -694,17 +718,18 @@ tests: - description: 'collection.distinct retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [distinct] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -715,12 +740,13 @@ tests: filter: {} expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. - commandStartedEvent: commandName: distinct - commandFailedEvent: @@ -749,17 +775,18 @@ tests: - description: 'collection.find retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [find] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -769,12 +796,13 @@ tests: filter: {} expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. 
- commandStartedEvent: commandName: find - commandFailedEvent: @@ -803,17 +831,18 @@ tests: - description: 'collection.findOne retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [find] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -823,12 +852,13 @@ tests: filter: {} expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. - commandStartedEvent: commandName: find - commandFailedEvent: @@ -857,17 +887,18 @@ tests: - description: 'collection.listIndexes retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [listIndexes] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -875,12 +906,13 @@ tests: name: listIndexes expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. 
- commandStartedEvent: commandName: listIndexes - commandFailedEvent: @@ -909,17 +941,18 @@ tests: - description: 'collection.listIndexNames retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [listIndexes] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -927,12 +960,13 @@ tests: name: listIndexNames expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. - commandStartedEvent: commandName: listIndexes - commandFailedEvent: @@ -961,17 +995,18 @@ tests: - description: 'collection.createChangeStream retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [aggregate] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -981,12 +1016,13 @@ tests: pipeline: [] expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. 
- commandStartedEvent: commandName: aggregate - commandFailedEvent: @@ -1015,17 +1051,18 @@ tests: - description: 'collection.insertOne retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [insert] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1035,12 +1072,13 @@ tests: document: { _id: 2, x: 22 } expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. - commandStartedEvent: commandName: insert - commandFailedEvent: @@ -1069,17 +1107,18 @@ tests: - description: 'collection.insertMany retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [insert] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1090,12 +1129,13 @@ tests: - { _id: 2, x: 22 } expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. 
- commandStartedEvent: commandName: insert - commandFailedEvent: @@ -1124,17 +1164,18 @@ tests: - description: 'collection.deleteOne retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [delete] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1144,12 +1185,13 @@ tests: filter: {} expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. - commandStartedEvent: commandName: delete - commandFailedEvent: @@ -1178,17 +1220,18 @@ tests: - description: 'collection.deleteMany retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [delete] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1198,12 +1241,13 @@ tests: filter: {} expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. 
- commandStartedEvent: commandName: delete - commandFailedEvent: @@ -1232,17 +1276,18 @@ tests: - description: 'collection.replaceOne retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [update] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1253,12 +1298,13 @@ tests: replacement: { x: 22 } expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. - commandStartedEvent: commandName: update - commandFailedEvent: @@ -1287,17 +1333,18 @@ tests: - description: 'collection.updateOne retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [update] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1308,12 +1355,13 @@ tests: update: { $set: { x: 22 } } expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. 
- commandStartedEvent: commandName: update - commandFailedEvent: @@ -1342,17 +1390,18 @@ tests: - description: 'collection.updateMany retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [update] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1363,12 +1412,13 @@ tests: update: { $set: { x: 22 } } expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. - commandStartedEvent: commandName: update - commandFailedEvent: @@ -1397,17 +1447,18 @@ tests: - description: 'collection.findOneAndDelete retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [findAndModify] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1417,12 +1468,13 @@ tests: filter: {} expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. 
- commandStartedEvent: commandName: findAndModify - commandFailedEvent: @@ -1451,17 +1503,18 @@ tests: - description: 'collection.findOneAndReplace retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [findAndModify] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1472,12 +1525,13 @@ tests: replacement: { x: 22 } expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. - commandStartedEvent: commandName: findAndModify - commandFailedEvent: @@ -1506,17 +1560,18 @@ tests: - description: 'collection.findOneAndUpdate retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [findAndModify] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1527,12 +1582,13 @@ tests: update: { $set: { x: 22 } } expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. 
- commandStartedEvent: commandName: findAndModify - commandFailedEvent: @@ -1561,17 +1617,18 @@ tests: - description: 'collection.bulkWrite retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [insert] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1583,12 +1640,13 @@ tests: document: { _id: 2, x: 22 } expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. - commandStartedEvent: commandName: insert - commandFailedEvent: @@ -1617,17 +1675,18 @@ tests: - description: 'collection.createIndex retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [createIndexes] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1638,12 +1697,13 @@ tests: name: "x_11" expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. 
- commandStartedEvent: commandName: createIndexes - commandFailedEvent: @@ -1672,17 +1732,18 @@ tests: - description: 'collection.dropIndex retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [dropIndexes] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1692,12 +1753,13 @@ tests: name: "x_11" expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. - commandStartedEvent: commandName: dropIndexes - commandFailedEvent: @@ -1726,17 +1788,18 @@ tests: - description: 'collection.dropIndexes retries at most maxAttempts=5 times' + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [dropIndexes] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -1744,12 +1807,13 @@ tests: name: dropIndexes expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. 
- commandStartedEvent: commandName: dropIndexes - commandFailedEvent: diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template index 4f2cfeee47..9efbdfff19 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template @@ -14,11 +14,11 @@ createEntities: client: id: &client client useMultipleMongoses: false - observeEvents: [ 'commandStartedEvent', 'commandSucceededEvent', 'commandFailedEvent' ] + observeEvents: [commandStartedEvent, commandSucceededEvent, commandFailedEvent] - client: - id: &failPointClient failPointClient + id: &fail_point_client fail_point_client useMultipleMongoses: false - @@ -48,17 +48,18 @@ tests: runOnRequirements: - minServerVersion: '8.0' # client bulk write added to server in 8.0 {%- endif %} + operations: - name: failPoint object: testRunner arguments: - client: *failPointClient + client: *fail_point_client failPoint: configureFailPoint: failCommand mode: alwaysOn data: failCommands: [{{operation.command_name}}] - errorLabels: ["RetryableError", "SystemOverloadedError"] + errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - @@ -72,12 +73,13 @@ tests: {%- endif %} expectError: isError: true + isClientError: false expectEvents: - - client: "client" + - client: *client events: - # we expect 6 pairs of command started and succeeded events: 1 initial - # attempt and 5 retries. + # we expect 6 pairs of command started and succeeded events: + # 1 initial attempt and 5 retries. 
- commandStartedEvent: commandName: {{operation.command_name}} - commandFailedEvent: From 034b85eaac791eef94288738e85b52bd1c96a1bb Mon Sep 17 00:00:00 2001 From: bailey Date: Mon, 15 Dec 2025 15:52:47 -0700 Subject: [PATCH 20/55] unified retry loop, handshake phrasing change --- .../client-backpressure.md | 243 +++--------------- source/mongodb-handshake/tests/README.md | 2 +- 2 files changed, 43 insertions(+), 202 deletions(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index 1c0ba0a999..79e3db5830 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -95,13 +95,12 @@ collection, getMore, and generic runCommand. The new command execution method ob token. - A non-SystemOverloaded error indicates that the server is healthy enough to handle requests. For the purposes of retry budget tracking, this counts as a success. -4. A retry attempt will only be permitted if the error includes the `RetryableError` label, the error has a +4. A retry attempt will only be permitted if the error is eligible for retryable reads or writes, the error has a `SystemOverloadedError` label, we have not reached `MAX_ATTEMPTS`, the CSOT deadline has not expired, and a token can be acquired from the token bucket. - The value of `MAX_ATTEMPTS` is 5 and non-configurable. - This intentionally changes the behavior of CSOT which otherwise would retry an unlimited number of times within the timeout to avoid retry storms. - - Note: Future work will add support for RetryableErrors to regular retryability logic (see the future work section). 5. If a retry attempt is to be attempted, a token will be consumed from the token bucket. 6. 
If the request is eligible for retry (as outlined in step 4), the client MUST apply exponential backoff according to the following formula: `delayMS = j * min(maxBackoff, baseBackoff * 2^i)` @@ -110,26 +109,35 @@ collection, getMore, and generic runCommand. The new command execution method ob - `baseBackoff` is constant 100ms. - `maxBackoff` is 10000ms. - This results in delays of 100ms, 200ms, 400ms, 800ms, and 1600ms before accounting for jitter. -7. If the request is eligible for retry (as outlined in step 4), the client MUST add the server's address to the list of - deprioritized server address for server selection. This behavior is the same as existing behavior for retryable - reads and writes. +7. If the request is eligible for retry (as outlined in step 4), the client MUST add the previously used server's + address to the list of deprioritized server addresses for server selection. -Note: drivers MUST share deprioritized servers between retries used for the exponential backoff loop and regular -retryable reads and writes. +#### Interaction with Existing Retry Behavior + +The retryability API defined in this specification is separate from the existing retryability behaviors defined in the +retryable reads and retryable writes specifications. Drivers MUST: + +- Only retryable errors with the `SystemOverloadedError` consume tokens from the token bucket before retrying. +- Only retryable errors with the `SystemOverloadedError` label apply backoff and jitter. #### Pseudocode The following pseudocode describes the overload retry policy: ```python -BASE_BACKOFF = 0.1 -MAX_BACKOFF = 10 +# Note: the values below have been scaled down by a factor of 1000 because +# Python's sleep API takes a duration in seconds, not milliseconds. 
+BASE_BACKOFF = 0.1 # 100ms +MAX_BACKOFF = 10 # 10s + RETRY_TOKEN_RETURN_RATE = 0.1 MAX_ATTEMPTS = 5 def execute_command_retryable(command, ...): deprioritized_servers = [] attempt = 0 + attempts = if is_csot then 1 else math.inf + while True: try: server = select_server(deprioritized_servers) @@ -142,206 +150,39 @@ def execute_command_retryable(command, ...): token_bucket.deposit(tokens) return res except PyMongoError as exc: + is_retryable = is_retryable_read() or is_retryable_write() or (exc.has_error_label("RetryableError") and exc.has_error_label("SystemOverloadedError")) + is_overload = exc.has_error_label("SystemOverloadedError") + # if a retry fails with a non-System overloaded error, deposit 1 token - if attempt > 0 and not exc.has_error_label("SystemOverloadedError"): - tokens += 1 + if attempt > 0 and not is_overload: + token_bucket.deposit(1) + + # Raise if the error is non-retryable. + if not is_retryable: + raise attempt += 1 + if is_overload: + attempts = MAX_ATTEMPTS - if attempt >= MAX_ATTEMPTS: + if attempt >= attempts: raise - # Raise if the error if non retryable. - if exc.has_error_label("RetryableError") and exc.has_error_label("SystemOverloadedError"): - raise + deprioritized_servers.append(server.address) - jitter = random.random() # Random float between [0.0, 1.0). - backoff = jitter * min(BASE_BACKOFF * (2 ** attempt), MAX_BACKOFF) - - # If the delay exceeds the deadline, bail early before consuming a token. - if _csot.get_timeout(): - if time.monotonic() + backoff > _csot.get_deadline(): - raise + if is_overload: + jitter = random.random() # Random float between [0.0, 1.0). + backoff = jitter * min(BASE_BACKOFF * (2 ** attempt), MAX_BACKOFF) + + # If the delay exceeds the deadline, bail early. 
+ if _csot.get_timeout(): + if time.monotonic() + backoff > _csot.get_deadline(): + raise - if not token_bucket.consume(1): - raise + if not token_bucket.consume(1): + raise - if backoff: time.sleep(backoff) - - deprioritized_servers.append(server) - continue -``` - -Some drivers might not have retryability implementations that allow easy separation of the existing retryable -reads/writes mechanisms from the exponential backoff and jitter retry algorithm. An example pseudocode is defined below -that demonstrates a combined retryable reads/writes implementation with the corresponding backpressure changes (adapted -from the Node driver's implementation): - -```typescript -// TODO: update pseudocode with updated implementation -async function tryOperation>( - operation: T, - { topology, timeoutContext, session, readPreference }: RetryOptions -): Promise { - const serverSelector = getServerSelectorForReadPreference(operation, readPreference); - - let server = await topology.selectServer(selector, { - session, - }); - - const hasReadAspect = operation.hasAspect(Aspect.READ_OPERATION); - const hasWriteAspect = operation.hasAspect(Aspect.WRITE_OPERATION); - const inTransaction = session?.inTransaction() ?? false; - - const willRetryRead = topology.s.options.retryReads && !inTransaction && operation.canRetryRead; - - const willRetryWrite = - topology.s.options.retryWrites && - !inTransaction && - supportsRetryableWrites(server) && - operation.canRetryWrite; - - const willRetry = - operation.hasAspect(Aspect.RETRYABLE) && - session != null && - ((hasReadAspect && willRetryRead) || (hasWriteAspect && willRetryWrite)); - - if (hasWriteAspect && willRetryWrite && session != null) { - operation.options.willRetryWrite = true; - session.incrementTransactionNumber(); - } - - // The maximum number of retry attempts using regular retryable reads/writes logic (not including - // SystemOverLoad error retries). - const maxNonOverloadRetryAttempts = willRetry - ? timeoutMS != null - ? 
Infinity - : 2 - : 1; - - let previousOperationError: MongoError | undefined; - let previousServer: ServerDescription | undefined; - - let nonOverloadRetryAttempt = 0; - let systemOverloadRetryAttempt = 0; - - const maxSystemOverloadRetryAttempts = 5; - const backoffDelayProvider = exponentialBackoffDelayProvider( - 10_000, // MAX_BACKOFF - 100, // base backoff - 2 // backoff rate - ); - - const RETRY_COST = 1; - - while (true) { - if (previousOperationError) { - if (previousOperationError.hasErrorLabel("SystemOverloadError")) { - systemOverloadRetryAttempt += 1; - - if ( - // if the SystemOverloadError is not retryable, throw. - !previousOperationError.hasErrorLabel("RetryableError") || - !( - // if retryable writes or reads are not configured, throw. - ( - (hasReadAspect && topology.s.options.retryReads) || - (hasWriteAspect && topology.s.options.retryWrites) - ) - ) - ) { - throw previousOperationError; - } - - // if we have exhausted overload retry attempts, throw. - if (systemOverloadRetryAttempt > maxSystemOverloadRetryAttempts) { - throw previousOperationError; - } - - const { value: delayMS } = backoffDelayProvider.next(); - - // if the delay would exhaust the CSOT timeout, short-circuit. - if (timeoutContext.csotEnabled() && delayMS > timeoutContext.remainingTimeMS) { - throw previousError; - } - - await setTimeout(delayMS); - - // attempt to consume a retry token, throw if we don't have budget. - if (!topology.tokenBucket.consume(RETRY_COST)) { - throw previousOperationError; - } - - server = await topology.selectServer(selector, { session }); - } else { - nonOverloadRetryAttempt++; - // we have no more retry attempts, throw. - if (nonOverloadRetryAttempt > maxNonOverloadRetryAttempts) { - throw previousOperationError; - } - - // Handle MMAPv1 not supporting retryable writes. 
- if (hasWriteAspect && previousOperationError.code === MMAPv1_RETRY_WRITES_ERROR_CODE) { - throw new MongoServerError({ - message: MMAPv1_RETRY_WRITES_ERROR_MESSAGE, - errmsg: MMAPv1_RETRY_WRITES_ERROR_MESSAGE, - originalError: previousOperationError - }); - } - - // handle non-retryable errors - if ( - (hasWriteAspect && !isRetryableWriteError(previousOperationError)) || - (hasReadAspect && !isRetryableReadError(previousOperationError)) - ) { - throw previousOperationError; - } - - server = await topology.selectServer(selector, { session }); - - // handle rare downgrade scenarios where some nodes don't support - // retryable writes but others do. - if (hasWriteAspect && !supportsRetryableWrites(server)) { - throw new MongoUnexpectedServerResponseError( - 'Selected server does not support retryable writes' - ); - } - } - } - - try { - try { - const result = await server.command(operation, timeoutContext); - const isRetry = nonOverloadRetryAttempt > 0 || systemOverloadRetryAttempt > 0; - topology.tokenBucket.deposit( - isRetry - ? // on successful retry, deposit the retry cost + the refresh rate. - TOKEN_REFRESH_RATE + RETRY_COST - : // otherwise, just deposit the refresh rate. - TOKEN_REFRESH_RATE - ); - return operation.handleOk(result); - } catch (error) { - return operation.handleError(error); - } - } catch (operationError) { - if (!operationError.hasErrorLabel("SystemOverloadError")) { - // if an operation fails with an error that does not contain the SystemOverloadError, deposit 1 token. - topology.tokenBucket.deposit(RETRY_COST); - } - - if ( - previousOperationError != null && - operationError.hasErrorLabel("NoWritesPerformed") - ) { - throw previousOperationError; - } - previousServer = server.description; - previousOperationError = operationError; - } - } -} ``` ### Token Bucket @@ -431,8 +272,8 @@ number of errors users see during spikes or burst workloads and help prevent ret However, older drivers do not have this benefit. 
Drivers MUST document that: - Users SHOULD upgrade to driver versions that officially support backpressure to avoid any impacts of server changes. -- Users who do not upgrade might see increased might need to update application error handling to handle higher error - rates of SystemOverloadedErrors. +- Users who do not upgrade might need to update application error handling to handle higher error rates of + SystemOverloadedErrors. ## Test Plan diff --git a/source/mongodb-handshake/tests/README.md b/source/mongodb-handshake/tests/README.md index 0ba58b713a..681b7e4907 100644 --- a/source/mongodb-handshake/tests/README.md +++ b/source/mongodb-handshake/tests/README.md @@ -499,4 +499,4 @@ These tests require a mechanism for observing handshake documents sent to the se 3. Assert that for every handshake document intercepted: - 1. the document has a field `backpressure` whose value is `true`. + 1. The document has a field `backpressure` whose value is `true`. From 779e1711bc67ae61a5ec9daa01d5fb20ee447042 Mon Sep 17 00:00:00 2001 From: bailey Date: Mon, 15 Dec 2025 16:15:26 -0700 Subject: [PATCH 21/55] Jeremy's last comments --- .../client-backpressure/tests/backpressure-retry-loop.json | 3 +++ .../client-backpressure/tests/backpressure-retry-loop.yml | 7 +++++-- .../tests/backpressure-retry-loop.yml.template | 5 ++++- .../tests/backpressure-retry-max-attempts.json | 3 +++ .../tests/backpressure-retry-max-attempts.yml | 5 ++++- .../tests/backpressure-retry-max-attempts.yml.template | 3 +++ source/etc/generate-backpressure-retryability-tests.py | 2 +- 7 files changed, 23 insertions(+), 5 deletions(-) diff --git a/source/client-backpressure/tests/backpressure-retry-loop.json b/source/client-backpressure/tests/backpressure-retry-loop.json index 79cfc4bac7..c4aab441a3 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.json +++ b/source/client-backpressure/tests/backpressure-retry-loop.json @@ -58,6 +58,9 @@ } } ], + "_yamlAnchors": { + 
"bulWriteInsertNamespace": "retryable-writes-tests.coll" + }, "tests": [ { "description": "client.listDatabases retries using operation loop", diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml b/source/client-backpressure/tests/backpressure-retry-loop.yml index a566949453..0112330fcf 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml @@ -37,7 +37,7 @@ createEntities: database: id: &database database client: *client - databaseName: &database_name retryable-writes-tests + databaseName: *database_name - collection: @@ -45,6 +45,9 @@ createEntities: database: *database collectionName: *collection_name +_yamlAnchors: + bulWriteInsertNamespace: &client_bulk_write_ns retryable-writes-tests.coll + tests: - @@ -219,7 +222,7 @@ tests: arguments: models: - insertOne: - namespace: retryable-writes-tests.coll + namespace: *client_bulk_write_ns document: { _id: 8, x: 88 } expectEvents: diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml.template b/source/client-backpressure/tests/backpressure-retry-loop.yml.template index ac47783e53..f83f462e8a 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml.template @@ -37,7 +37,7 @@ createEntities: database: id: &database database client: *client - databaseName: &database_name retryable-writes-tests + databaseName: *database_name - collection: @@ -45,6 +45,9 @@ createEntities: database: *database collectionName: *collection_name +_yamlAnchors: + bulWriteInsertNamespace: &client_bulk_write_ns retryable-writes-tests.coll + tests: {% for operation in operations %} - diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.json b/source/client-backpressure/tests/backpressure-retry-max-attempts.json index 0cd52e9a53..1de8cb38d4 100644 --- 
a/source/client-backpressure/tests/backpressure-retry-max-attempts.json +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.json @@ -44,6 +44,9 @@ } } ], + "_yamlAnchors": { + "bulkWriteInsertNamespace": "retryable-writes-tests.coll" + }, "initialData": [ { "collectionName": "coll", diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml index 05da92d671..3800b20a33 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml @@ -32,6 +32,9 @@ createEntities: database: *database collectionName: &collection_name coll +_yamlAnchors: + bulkWriteInsertNamespace: &client_bulk_write_ns retryable-writes-tests.coll + initialData: - collectionName: *collection_name @@ -232,7 +235,7 @@ tests: arguments: models: - insertOne: - namespace: retryable-writes-tests.coll + namespace: *client_bulk_write_ns document: { _id: 8, x: 88 } expectError: isError: true diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template index 9efbdfff19..3117d44b89 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template @@ -32,6 +32,9 @@ createEntities: database: *database collectionName: &collection_name coll +_yamlAnchors: + bulkWriteInsertNamespace: &client_bulk_write_ns retryable-writes-tests.coll + initialData: - collectionName: *collection_name diff --git a/source/etc/generate-backpressure-retryability-tests.py b/source/etc/generate-backpressure-retryability-tests.py index 305cfa585d..3d8d914e1f 100644 --- a/source/etc/generate-backpressure-retryability-tests.py +++ b/source/etc/generate-backpressure-retryability-tests.py @@ -8,7 +8,7 @@ 
CLIENT_BULK_WRITE_ARGUMENTS = '''models: - insertOne: - namespace: retryable-writes-tests.coll + namespace: *client_bulk_write_ns document: { _id: 8, x: 88 }''' CLIENT_OPERATIONS = [ From e5d4de69c2fa65c568750730a5babfc04610c6fd Mon Sep 17 00:00:00 2001 From: bailey Date: Tue, 16 Dec 2025 07:48:13 -0700 Subject: [PATCH 22/55] Other misc comments --- .../client-backpressure.md | 47 ++++++++++++++----- 1 file changed, 35 insertions(+), 12 deletions(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index 79e3db5830..a8eeb714bb 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -40,9 +40,6 @@ the connection and request rate limiters to prevent and mitigate overloading the An error is considered retryable if it includes the "RetryableError" label. This error label indicates that an operation is safely retryable regardless of the type of operation, its metadata, or any of its arguments. -Note that for the initial draft of the spec, only errors that have both the RetryableError label and the -SystemOverloadedError label are eligible for the retry backoff loop. - #### SystemOverloadedError label An error is considered overloaded if it includes the "SystemOverloadError" label. This error label indicates that the @@ -84,7 +81,7 @@ See [goodput](https://en.wikipedia.org/wiki/Goodput). #### Overload retry policy -This specification expands the driver's retry ability to all commands if the error indicates that is both an overload +This specification expands the driver's retry ability to all commands if the error indicates that it is both an overload error and that it is retryable, including those not currently considered retryable such as updateMany, create collection, getMore, and generic runCommand. The new command execution method obeys the following rules: @@ -95,9 +92,11 @@ collection, getMore, and generic runCommand. 
The new command execution method ob token. - A non-SystemOverloaded error indicates that the server is healthy enough to handle requests. For the purposes of retry budget tracking, this counts as a success. -4. A retry attempt will only be permitted if the error is eligible for retryable reads or writes, the error has a - `SystemOverloadedError` label, we have not reached `MAX_ATTEMPTS`, the CSOT deadline has not expired, and a token - can be acquired from the token bucket. +4. A retry attempt will only be permitted if: + 1. The error has both the `SystemOverloadedError` and the `RetryableError` label. + 2. We have not reached `MAX_ATTEMPTS`. + 3. (CSOT-only): `timeoutMS` has not expired. + 4. (`SystemOverloadedError` errors only) a token can be acquired from the token bucket. - The value of `MAX_ATTEMPTS` is 5 and non-configurable. - This intentionally changes the behavior of CSOT which otherwise would retry an unlimited number of times within the timeout to avoid retry storms. @@ -115,10 +114,16 @@ collection, getMore, and generic runCommand. The new command execution method ob #### Interaction with Existing Retry Behavior The retryability API defined in this specification is separate from the existing retryability behaviors defined in the -retryable reads and retryable writes specifications. Drivers MUST: +retryable reads and retryable writes specifications. Drivers MUST ensure: - Only retryable errors with the `SystemOverloadedError` consume tokens from the token bucket before retrying. - Only retryable errors with the `SystemOverloadedError` label apply backoff and jitter. +- All retryable errors apply backoff if they also contain a `SystemOverloadedError` label. This includes: + - Errors defined as retryable in the retryable reads specification. + - Errors defined as retryable in the retryable writes specification. + - Errors with the `RetryableError` label. 
+- Any retryable error is retried at most MAX_ATTEMPTS (default=5) times, if any attempts has failed with a + `SystemOverloadedError`. #### Pseudocode @@ -136,7 +141,7 @@ MAX_ATTEMPTS = 5 def execute_command_retryable(command, ...): deprioritized_servers = [] attempt = 0 - attempts = if is_csot then 1 else math.inf + attempts = if is_csot then math.inf else 1 while True: try: @@ -150,7 +155,7 @@ def execute_command_retryable(command, ...): token_bucket.deposit(tokens) return res except PyMongoError as exc: - is_retryable = is_retryable_read() or is_retryable_write() or (exc.has_error_label("RetryableError") and exc.has_error_label("SystemOverloadedError")) + is_retryable = is_retryable_write() or is_retryable_read() or (exc.has_error_label("RetryableError") and exc.has_error_label("SystemOverloadedError")) is_overload = exc.has_error_label("SystemOverloadedError") # if a retry fails with a non-System overloaded error, deposit 1 token @@ -165,14 +170,14 @@ def execute_command_retryable(command, ...): if is_overload: attempts = MAX_ATTEMPTS - if attempt >= attempts: + if attempt > attempts: raise deprioritized_servers.append(server.address) if is_overload: jitter = random.random() # Random float between [0.0, 1.0). - backoff = jitter * min(BASE_BACKOFF * (2 ** attempt), MAX_BACKOFF) + backoff = jitter * min(BASE_BACKOFF * (2 ** attempt - 1), MAX_BACKOFF) # If the delay exceeds the deadline, bail early. if _csot.get_timeout(): @@ -309,6 +314,24 @@ The Node and Python drivers will provide the reference implementations. See The client backpressure retry loop is primarily concerned with spreading out retries to avoid retry storms. The exact sleep duration is not critical to the intended behavior, so long as we sleep at least as long as we say we will. +### Why override existing maximum number of retry attempt defaults for retryable reads and writes if a `SystemOverloadedError` is received? 
+ +Load-shedded errors indicate that the request was rejected by the server to minimize load, not that the operation failed +for logical reasons. So, when determining the number of retries an operation should attempt: + +- Any load-shedded errors should be retried to give them a real attempt at success +- If the command ultimately would have failed if it had not been load shed by the server, returning an actionable error + message is preferable to a generic SystemOverloadedError. + +The maximum retry attempt logic in this specification balances legacy retryability behavior with load-shedding behavior: + +- Relying on either 1 or infinite timeouts (depending on CSOT) preserves existing retry behavior. +- Adjusting the maximum number of retry attempts to 5 if a `SystemOverloadedError` error is returned from the server + gives requests more opportunities to succeed and helps reduce application errors. +- An alternative approach would be to retry once if we don't receive a SystemOverloadedError, in which case we'd retry 5 + times. The approach chosen allows for additional retries in scenarios where a non-`SystemOverloadedError` fails on a + retry with a `SystemOverloadedError`. + ## Changelog - 2025-XX-XX: Initial version. 
From 1cd95fc2955cd08d8fcf25f20c2d730aee86aca8 Mon Sep 17 00:00:00 2001 From: Steven Silvester Date: Wed, 17 Dec 2025 06:41:30 -0600 Subject: [PATCH 23/55] update transaction spec and add unified tests --- .../unified/backpressure-retryable-reads.json | 418 +++++++++++++++++ .../unified/backpressure-retryable-reads.yml | 235 ++++++++++ .../backpressure-retryable-writes.json | 436 ++++++++++++++++++ .../unified/backpressure-retryable-writes.yml | 244 ++++++++++ source/transactions/transactions.md | 21 +- 5 files changed, 1351 insertions(+), 3 deletions(-) create mode 100644 source/transactions/tests/unified/backpressure-retryable-reads.json create mode 100644 source/transactions/tests/unified/backpressure-retryable-reads.yml create mode 100644 source/transactions/tests/unified/backpressure-retryable-writes.json create mode 100644 source/transactions/tests/unified/backpressure-retryable-writes.yml diff --git a/source/transactions/tests/unified/backpressure-retryable-reads.json b/source/transactions/tests/unified/backpressure-retryable-reads.json new file mode 100644 index 0000000000..69388cf887 --- /dev/null +++ b/source/transactions/tests/unified/backpressure-retryable-reads.json @@ -0,0 +1,418 @@ +{ + "description": "backpressure-retryable-reads", + "schemaVersion": "1.3", + "runOnRequirements": [ + { + "minServerVersion": "4.1.8", + "topologies": [ + "replicaset", + "sharded", + "load-balanced" + ] + } + ], + "createEntities": [ + { + "client": { + "id": "client0", + "useMultipleMongoses": false, + "observeEvents": [ + "commandStartedEvent" + ] + } + }, + { + "database": { + "id": "database0", + "client": "client0", + "databaseName": "transaction-tests" + } + }, + { + "collection": { + "id": "collection0", + "database": "database0", + "collectionName": "test" + } + }, + { + "session": { + "id": "session0", + "client": "client0" + } + } + ], + "initialData": [ + { + "collectionName": "test", + "databaseName": "transaction-tests", + "documents": [] + } + ], + 
"tests": [ + { + "description": "reads are retried if backpressure labels are added", + "operations": [ + { + "object": "session0", + "name": "startTransaction" + }, + { + "object": "collection0", + "name": "insertOne", + "arguments": { + "session": "session0", + "document": { + "_id": 1 + } + }, + "expectResult": { + "$$unsetOrMatches": { + "insertedId": { + "$$unsetOrMatches": 1 + } + } + } + }, + { + "object": "testRunner", + "name": "failPoint", + "arguments": { + "client": "client0", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 1 + }, + "data": { + "failCommands": [ + "find" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 112 + } + } + } + }, + { + "object": "collection0", + "name": "find", + "arguments": { + "filter": {}, + "session": "session0" + } + }, + { + "object": "session0", + "name": "commitTransaction" + } + ], + "expectEvents": [ + { + "client": "client0", + "events": [ + { + "commandStartedEvent": { + "command": { + "insert": "test", + "documents": [ + { + "_id": 1 + } + ], + "ordered": true, + "readConcern": { + "$$exists": false + }, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "startTransaction": true, + "autocommit": false, + "writeConcern": { + "$$exists": false + } + }, + "commandName": "insert", + "databaseName": "transaction-tests" + } + }, + { + "commandStartedEvent": { + "command": { + "find": "test", + "readConcern": { + "$$exists": false + }, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "autocommit": false, + "writeConcern": { + "$$exists": false + } + }, + "commandName": "find", + "databaseName": "transaction-tests" + } + }, + { + "commandStartedEvent": { + "command": { + "find": "test", + "readConcern": { + "$$exists": false + }, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "autocommit": false, + "writeConcern": { + 
"$$exists": false + } + }, + "commandName": "find", + "databaseName": "transaction-tests" + } + }, + { + "commandStartedEvent": { + "command": { + "abortTransaction": { + "$$exists": false + }, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "startTransaction": { + "$$exists": false + }, + "autocommit": false, + "writeConcern": { + "$$exists": false + } + }, + "commandName": "commitTransaction", + "databaseName": "admin" + } + } + ] + } + ], + "outcome": [ + { + "collectionName": "test", + "databaseName": "transaction-tests", + "documents": [] + } + ] + }, + { + "description": "reads are retried maxAttempts=5 times if backpressure labels are added", + "operations": [ + { + "object": "session0", + "name": "startTransaction" + }, + { + "object": "collection0", + "name": "insertOne", + "arguments": { + "session": "session0", + "document": { + "_id": 1 + } + }, + "expectResult": { + "$$unsetOrMatches": { + "insertedId": { + "$$unsetOrMatches": 1 + } + } + } + }, + { + "object": "testRunner", + "name": "failPoint", + "arguments": { + "client": "client0", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": "alwaysOn", + "data": { + "failCommands": [ + "find" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 112 + } + } + } + }, + { + "object": "collection0", + "name": "find", + "arguments": { + "filter": {}, + "session": "session0" + }, + "expectError": { + "isError": true + } + }, + { + "object": "session0", + "name": "abortTransaction" + } + ], + "expectEvents": [ + { + "client": "client0", + "events": [ + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": 
"find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "abortTransaction" + } + } + ] + } + ], + "outcome": [ + { + "collectionName": "test", + "databaseName": "transaction-tests", + "documents": [] + } + ] + }, + { + "description": "retry fails if backpressure labels are added to the first operation in a transaction", + "operations": [ + { + "object": "session0", + "name": "startTransaction" + }, + { + "object": "testRunner", + "name": "failPoint", + "arguments": { + "client": "client0", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 1 + }, + "data": { + "failCommands": [ + "find" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 112 + } + } + } + }, + { + "object": "collection0", + "name": "find", + "arguments": { + "filter": {}, + "session": "session0" + }, + "expectError": { + "isError": true + } + }, + { + "object": "session0", + "name": "abortTransaction" + } + ], + "expectEvents": [ + { + "client": "client0", + "events": [ + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "abortTransaction" + } + } + ] + } + ], + "outcome": [ + { + "collectionName": "test", + "databaseName": "transaction-tests", + "documents": [] + } + ] + } + ] +} diff --git a/source/transactions/tests/unified/backpressure-retryable-reads.yml b/source/transactions/tests/unified/backpressure-retryable-reads.yml new file mode 100644 index 0000000000..79cf65d642 --- /dev/null +++ b/source/transactions/tests/unified/backpressure-retryable-reads.yml @@ -0,0 +1,235 @@ +description: backpressure-retryable-reads +schemaVersion: "1.3" +runOnRequirements: + - minServerVersion: 4.1.8 + topologies: + - replicaset + - sharded + - load-balanced +createEntities: + - client: + id: &client0 client0 + useMultipleMongoses: false + 
observeEvents: + - commandStartedEvent + - database: + id: &database0 database0 + client: *client0 + databaseName: &databaseName transaction-tests + - collection: + id: &collection0 collection0 + database: *database0 + collectionName: &collectionName test + - session: + id: &session0 session0 + client: *client0 +initialData: + - collectionName: *collectionName + databaseName: *databaseName + documents: [] +tests: + - description: reads are retried if backpressure labels are added + operations: + - object: *session0 + name: startTransaction + - object: *collection0 + name: insertOne + arguments: + session: *session0 + document: + _id: 1 + expectResult: + $$unsetOrMatches: + insertedId: + $$unsetOrMatches: 1 + - object: testRunner + name: failPoint + arguments: + client: *client0 + failPoint: + configureFailPoint: failCommand + mode: + times: 1 + data: + failCommands: + - find + errorLabels: + - RetryableError + - SystemOverloadedError + errorCode: 112 + - object: *collection0 + name: find + arguments: + filter: {} + session: *session0 + - object: *session0 + name: commitTransaction + expectEvents: + - client: *client0 + events: + - commandStartedEvent: + command: + insert: test + documents: + - _id: 1 + ordered: true + readConcern: + $$exists: false + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + startTransaction: true + autocommit: false + writeConcern: + $$exists: false + commandName: insert + databaseName: *databaseName + - commandStartedEvent: + command: + find: test + readConcern: + $$exists: false + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + autocommit: false + writeConcern: + $$exists: false + commandName: find + databaseName: *databaseName + - commandStartedEvent: + command: + find: test + readConcern: + $$exists: false + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + autocommit: false + writeConcern: + $$exists: false + commandName: find + databaseName: *databaseName + - commandStartedEvent: + 
command: + abortTransaction: + $$exists: false + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + startTransaction: + $$exists: false + autocommit: false + writeConcern: + $$exists: false + commandName: commitTransaction + databaseName: admin + outcome: + - collectionName: *collectionName + databaseName: *databaseName + documents: [] + - description: reads are retried maxAttempts=5 times if backpressure labels are added + operations: + - object: *session0 + name: startTransaction + - object: *collection0 + name: insertOne + arguments: + session: *session0 + document: + _id: 1 + expectResult: + $$unsetOrMatches: + insertedId: + $$unsetOrMatches: 1 + - object: testRunner + name: failPoint + arguments: + client: *client0 + failPoint: + configureFailPoint: failCommand + mode: alwaysOn + data: + failCommands: + - find + errorLabels: + - RetryableError + - SystemOverloadedError + errorCode: 112 + - object: *collection0 + name: find + arguments: + filter: {} + session: *session0 + expectError: + isError: true + - object: *session0 + name: abortTransaction + expectEvents: + - client: *client0 + events: + - commandStartedEvent: + commandName: insert + - commandStartedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandStartedEvent: + commandName: abortTransaction + outcome: + - collectionName: *collectionName + databaseName: *databaseName + documents: [] + - description: retry fails if backpressure labels are added to the first operation in a transaction + operations: + - object: *session0 + name: startTransaction + - object: testRunner + name: failPoint + arguments: + client: *client0 + failPoint: + configureFailPoint: failCommand + mode: + times: 1 + data: + failCommands: + - find + errorLabels: + - RetryableError + - SystemOverloadedError + 
errorCode: 112 + - object: *collection0 + name: find + arguments: + filter: {} + session: *session0 + expectError: + isError: true + - object: *session0 + name: abortTransaction + expectEvents: + - client: *client0 + events: + - commandStartedEvent: + commandName: find + - commandStartedEvent: + commandName: find + - commandStartedEvent: + commandName: abortTransaction + outcome: + - collectionName: *collectionName + databaseName: *databaseName + documents: [] diff --git a/source/transactions/tests/unified/backpressure-retryable-writes.json b/source/transactions/tests/unified/backpressure-retryable-writes.json new file mode 100644 index 0000000000..9525d07e9c --- /dev/null +++ b/source/transactions/tests/unified/backpressure-retryable-writes.json @@ -0,0 +1,436 @@ +{ + "description": "backpressure-retryable-writes", + "schemaVersion": "1.3", + "runOnRequirements": [ + { + "minServerVersion": "4.1.8", + "topologies": [ + "replicaset", + "sharded", + "load-balanced" + ] + } + ], + "createEntities": [ + { + "client": { + "id": "client0", + "useMultipleMongoses": false, + "observeEvents": [ + "commandStartedEvent" + ] + } + }, + { + "database": { + "id": "database0", + "client": "client0", + "databaseName": "transaction-tests" + } + }, + { + "collection": { + "id": "collection0", + "database": "database0", + "collectionName": "test" + } + }, + { + "session": { + "id": "session0", + "client": "client0" + } + } + ], + "initialData": [ + { + "collectionName": "test", + "databaseName": "transaction-tests", + "documents": [] + } + ], + "tests": [ + { + "description": "writes are retried if backpressure labels are added", + "operations": [ + { + "object": "session0", + "name": "startTransaction" + }, + { + "object": "collection0", + "name": "insertOne", + "arguments": { + "session": "session0", + "document": { + "_id": 1 + } + }, + "expectResult": { + "$$unsetOrMatches": { + "insertedId": { + "$$unsetOrMatches": 1 + } + } + } + }, + { + "object": "testRunner", + "name": 
"failPoint", + "arguments": { + "client": "client0", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 1 + }, + "data": { + "failCommands": [ + "insert" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 112 + } + } + } + }, + { + "object": "collection0", + "name": "insertOne", + "arguments": { + "session": "session0", + "document": { + "_id": 2 + } + } + }, + { + "object": "session0", + "name": "commitTransaction" + } + ], + "expectEvents": [ + { + "client": "client0", + "events": [ + { + "commandStartedEvent": { + "command": { + "insert": "test", + "documents": [ + { + "_id": 1 + } + ], + "ordered": true, + "readConcern": { + "$$exists": false + }, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "startTransaction": true, + "autocommit": false, + "writeConcern": { + "$$exists": false + } + }, + "commandName": "insert", + "databaseName": "transaction-tests" + } + }, + { + "commandStartedEvent": { + "command": { + "insert": "test", + "documents": [ + { + "_id": 2 + } + ], + "ordered": true, + "readConcern": { + "$$exists": false + }, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "autocommit": false, + "writeConcern": { + "$$exists": false + } + }, + "commandName": "insert", + "databaseName": "transaction-tests" + } + }, + { + "commandStartedEvent": { + "command": { + "insert": "test", + "documents": [ + { + "_id": 2 + } + ], + "ordered": true, + "readConcern": { + "$$exists": false + }, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "autocommit": false, + "writeConcern": { + "$$exists": false + } + }, + "commandName": "insert", + "databaseName": "transaction-tests" + } + }, + { + "commandStartedEvent": { + "command": { + "abortTransaction": { + "$$exists": false + }, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + 
"startTransaction": { + "$$exists": false + }, + "autocommit": false, + "writeConcern": { + "$$exists": false + } + }, + "commandName": "commitTransaction", + "databaseName": "admin" + } + } + ] + } + ], + "outcome": [ + { + "collectionName": "test", + "databaseName": "transaction-tests", + "documents": [] + } + ] + }, + { + "description": "writes are retried maxAttempts=5 times if backpressure labels are added", + "operations": [ + { + "object": "session0", + "name": "startTransaction" + }, + { + "object": "collection0", + "name": "insertOne", + "arguments": { + "session": "session0", + "document": { + "_id": 1 + } + }, + "expectResult": { + "$$unsetOrMatches": { + "insertedId": { + "$$unsetOrMatches": 1 + } + } + } + }, + { + "object": "testRunner", + "name": "failPoint", + "arguments": { + "client": "client0", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": "alwaysOn", + "data": { + "failCommands": [ + "insert" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 112 + } + } + } + }, + { + "object": "collection0", + "name": "insertOne", + "arguments": { + "session": "session0", + "document": { + "_id": 2 + } + }, + "expectError": { + "isError": true + } + }, + { + "object": "session0", + "name": "abortTransaction" + } + ], + "expectEvents": [ + { + "client": "client0", + "events": [ + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "abortTransaction" + } + } + ] + } + ], + "outcome": [ + { + "collectionName": "test", + "databaseName": "transaction-tests", + 
"documents": [] + } + ] + }, + { + "description": "retry fails if backpressure labels are added to the first operation in a transaction", + "operations": [ + { + "object": "session0", + "name": "startTransaction" + }, + { + "object": "testRunner", + "name": "failPoint", + "arguments": { + "client": "client0", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 1 + }, + "data": { + "failCommands": [ + "insert" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 112 + } + } + } + }, + { + "object": "collection0", + "name": "insertOne", + "arguments": { + "session": "session0", + "document": { + "_id": 2 + } + }, + "expectError": { + "isError": true + } + }, + { + "object": "session0", + "name": "abortTransaction" + } + ], + "expectEvents": [ + { + "client": "client0", + "events": [ + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "insert" + } + }, + { + "commandStartedEvent": { + "commandName": "abortTransaction" + } + } + ] + } + ], + "outcome": [ + { + "collectionName": "test", + "databaseName": "transaction-tests", + "documents": [] + } + ] + } + ] +} diff --git a/source/transactions/tests/unified/backpressure-retryable-writes.yml b/source/transactions/tests/unified/backpressure-retryable-writes.yml new file mode 100644 index 0000000000..b3f68712d7 --- /dev/null +++ b/source/transactions/tests/unified/backpressure-retryable-writes.yml @@ -0,0 +1,244 @@ +description: backpressure-retryable-writes +schemaVersion: "1.3" +runOnRequirements: + - minServerVersion: 4.1.8 + topologies: + - replicaset + - sharded + - load-balanced +createEntities: + - client: + id: &client0 client0 + useMultipleMongoses: false + observeEvents: + - commandStartedEvent + - database: + id: &database0 database0 + client: *client0 + databaseName: &databaseName transaction-tests + - collection: + id: &collection0 collection0 + database: *database0 + collectionName: 
&collectionName test + - session: + id: &session0 session0 + client: *client0 +initialData: + - collectionName: *collectionName + databaseName: *databaseName + documents: [] +tests: + - description: writes are retried if backpressure labels are added + operations: + - object: *session0 + name: startTransaction + - object: *collection0 + name: insertOne + arguments: + session: *session0 + document: + _id: 1 + expectResult: + $$unsetOrMatches: + insertedId: + $$unsetOrMatches: 1 + - object: testRunner + name: failPoint + arguments: + client: *client0 + failPoint: + configureFailPoint: failCommand + mode: + times: 1 + data: + failCommands: + - insert + errorLabels: + - RetryableError + - SystemOverloadedError + errorCode: 112 + - object: *collection0 + name: insertOne + arguments: + session: *session0 + document: + _id: 2 + - object: *session0 + name: commitTransaction + expectEvents: + - client: *client0 + events: + - commandStartedEvent: + command: + insert: test + documents: + - _id: 1 + ordered: true + readConcern: + $$exists: false + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + startTransaction: true + autocommit: false + writeConcern: + $$exists: false + commandName: insert + databaseName: *databaseName + - commandStartedEvent: + command: + insert: test + documents: + - _id: 2 + ordered: true + readConcern: + $$exists: false + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + autocommit: false + writeConcern: + $$exists: false + commandName: insert + databaseName: *databaseName + - commandStartedEvent: + command: + insert: test + documents: + - _id: 2 + ordered: true + readConcern: + $$exists: false + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + autocommit: false + writeConcern: + $$exists: false + commandName: insert + databaseName: *databaseName + - commandStartedEvent: + command: + abortTransaction: + $$exists: false + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + 
startTransaction: + $$exists: false + autocommit: false + writeConcern: + $$exists: false + commandName: commitTransaction + databaseName: admin + outcome: + - collectionName: *collectionName + databaseName: *databaseName + documents: [] + - description: writes are retried maxAttempts=5 times if backpressure labels are added + operations: + - object: *session0 + name: startTransaction + - object: *collection0 + name: insertOne + arguments: + session: *session0 + document: + _id: 1 + expectResult: + $$unsetOrMatches: + insertedId: + $$unsetOrMatches: 1 + - object: testRunner + name: failPoint + arguments: + client: *client0 + failPoint: + configureFailPoint: failCommand + mode: alwaysOn + data: + failCommands: + - insert + errorLabels: + - RetryableError + - SystemOverloadedError + errorCode: 112 + - object: *collection0 + name: insertOne + arguments: + session: *session0 + document: + _id: 2 + expectError: + isError: true + - object: *session0 + name: abortTransaction + expectEvents: + - client: *client0 + events: + - commandStartedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandStartedEvent: + commandName: abortTransaction + outcome: + - collectionName: *collectionName + databaseName: *databaseName + documents: [] + - description: retry fails if backpressure labels are added to the first operation in a transaction + operations: + - object: *session0 + name: startTransaction + - object: testRunner + name: failPoint + arguments: + client: *client0 + failPoint: + configureFailPoint: failCommand + mode: + times: 1 + data: + failCommands: + - insert + errorLabels: + - RetryableError + - SystemOverloadedError + errorCode: 112 + - object: *collection0 + name: insertOne + arguments: + session: *session0 
+ document: + _id: 2 + expectError: + isError: true + - object: *session0 + name: abortTransaction + expectEvents: + - client: *client0 + events: + - commandStartedEvent: + commandName: insert + - commandStartedEvent: + commandName: insert + - commandStartedEvent: + commandName: abortTransaction + outcome: + - collectionName: *collectionName + databaseName: *databaseName + documents: [] diff --git a/source/transactions/transactions.md b/source/transactions/transactions.md index 1e270cbed0..9c812058db 100644 --- a/source/transactions/transactions.md +++ b/source/transactions/transactions.md @@ -48,6 +48,10 @@ including (but not limited to) creating, updating, or deleting databases, collec An error considered retryable by the [Retryable Writes Specification](../retryable-writes/retryable-writes.md). +#### Backpressure Error + +An error considered retryable by the [Client Backpressure Specification](../client-backpressure/client-backpressure.md). + #### Command Error A server response with ok:0. A server response with ok:1 and writeConcernError or writeErrors is not considered a @@ -555,9 +559,10 @@ a transaction. In MongoDB 4.0 the only supported retryable write commands within a transaction are commitTransaction and abortTransaction. Therefore drivers MUST NOT retry write commands within transactions even when retryWrites has been -enabled on the MongoClient. In addition, drivers MUST NOT add the RetryableWriteError label to any error that occurs -during a write command within a transaction (excepting commitTransation and abortTransaction), even when retryWrites has -been enabled on the MongoClient. +enabled on the MongoClient, unless the command has backpressure error labels applied. 
In addition, drivers MUST NOT add +the RetryableWriteError label to any error that occurs during a write command within a transaction (excepting +commitTransaction and abortTransaction), even when retryWrites has been enabled on the MongoClient, unless the command +has backpressure error labels applied. Drivers MUST retry the commitTransaction and abortTransaction commands even when retryWrites has been disabled on the MongoClient. commitTransaction and abortTransaction are retryable write commands and MUST be retried according to the @@ -569,6 +574,16 @@ incremented at the start and then stays constant, even for retryable operations the commitTransaction and abortTransaction commands within a transaction drivers MUST use the same `txnNumber` used for all preceding commands in the transaction. +### **Interaction with Client Backpressure** + +All commands in a transaction are subject to the +[Client Backpressure Specification](../client-backpressure/client-backpressure.md), and MUST be retried accordingly when +the appropriate error labels are added by the server. This includes the `startTransaction`, `abortTransaction`, +`commitTransaction` commands as well as any read or write commands attempted during the transaction. + +In the case that the first command in a transaction has backpressure applied, it will eventually fail because the server +will not have started a transaction.
+ ### **Server Commands** #### commitTransaction From 6912e4552ae7ff3fbe39bd927a33d3e7497cc73a Mon Sep 17 00:00:00 2001 From: Steven Silvester Date: Wed, 17 Dec 2025 10:18:10 -0600 Subject: [PATCH 24/55] update transaction logic and add more tests --- .../unified/backpressure-retryable-abort.json | 357 +++++++++++++++++ .../unified/backpressure-retryable-abort.yml | 213 ++++++++++ .../backpressure-retryable-commit.json | 379 ++++++++++++++++++ .../unified/backpressure-retryable-commit.yml | 222 ++++++++++ .../unified/backpressure-retryable-reads.json | 78 +--- .../unified/backpressure-retryable-reads.yml | 79 +--- .../backpressure-retryable-writes.json | 7 +- .../unified/backpressure-retryable-writes.yml | 46 ++- source/transactions/transactions.md | 9 +- 9 files changed, 1226 insertions(+), 164 deletions(-) create mode 100644 source/transactions/tests/unified/backpressure-retryable-abort.json create mode 100644 source/transactions/tests/unified/backpressure-retryable-abort.yml create mode 100644 source/transactions/tests/unified/backpressure-retryable-commit.json create mode 100644 source/transactions/tests/unified/backpressure-retryable-commit.yml diff --git a/source/transactions/tests/unified/backpressure-retryable-abort.json b/source/transactions/tests/unified/backpressure-retryable-abort.json new file mode 100644 index 0000000000..7dde45f76e --- /dev/null +++ b/source/transactions/tests/unified/backpressure-retryable-abort.json @@ -0,0 +1,357 @@ +{ + "description": "retryable-abort", + "schemaVersion": "1.3", + "runOnRequirements": [ + { + "minServerVersion": "4.4", + "topologies": [ + "replicaset", + "sharded", + "load-balanced" + ] + } + ], + "createEntities": [ + { + "client": { + "id": "client0", + "useMultipleMongoses": false, + "observeEvents": [ + "commandStartedEvent" + ] + } + }, + { + "database": { + "id": "database0", + "client": "client0", + "databaseName": "transaction-tests" + } + }, + { + "collection": { + "id": "collection0", + "database": 
"database0", + "collectionName": "test" + } + }, + { + "session": { + "id": "session0", + "client": "client0" + } + } + ], + "initialData": [ + { + "collectionName": "test", + "databaseName": "transaction-tests", + "documents": [] + } + ], + "tests": [ + { + "description": "abortTransaction retries if backpressure labels are added", + "operations": [ + { + "object": "testRunner", + "name": "failPoint", + "arguments": { + "client": "client0", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 2 + }, + "data": { + "failCommands": [ + "abortTransaction" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 112 + } + } + } + }, + { + "object": "session0", + "name": "startTransaction" + }, + { + "object": "collection0", + "name": "insertOne", + "arguments": { + "session": "session0", + "document": { + "_id": 1 + } + }, + "expectResult": { + "$$unsetOrMatches": { + "insertedId": { + "$$unsetOrMatches": 1 + } + } + } + }, + { + "object": "session0", + "name": "abortTransaction" + } + ], + "expectEvents": [ + { + "client": "client0", + "events": [ + { + "commandStartedEvent": { + "command": { + "insert": "test", + "documents": [ + { + "_id": 1 + } + ], + "ordered": true, + "readConcern": { + "$$exists": false + }, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "startTransaction": true, + "autocommit": false, + "writeConcern": { + "$$exists": false + } + }, + "commandName": "insert", + "databaseName": "transaction-tests" + } + }, + { + "commandStartedEvent": { + "command": { + "abortTransaction": 1, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "startTransaction": { + "$$exists": false + }, + "autocommit": false, + "writeConcern": { + "$$exists": false + } + }, + "commandName": "abortTransaction", + "databaseName": "admin" + } + }, + { + "commandStartedEvent": { + "command": { + "abortTransaction": 1, + "lsid": { + 
"$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "startTransaction": { + "$$exists": false + }, + "autocommit": false, + "writeConcern": { + "$$exists": false + } + }, + "commandName": "abortTransaction", + "databaseName": "admin" + } + }, + { + "commandStartedEvent": { + "command": { + "abortTransaction": 1, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "startTransaction": { + "$$exists": false + }, + "autocommit": false, + "writeConcern": { + "$$exists": false + } + }, + "commandName": "abortTransaction", + "databaseName": "admin" + } + } + ] + } + ], + "outcome": [ + { + "collectionName": "test", + "databaseName": "transaction-tests", + "documents": [] + } + ] + }, + { + "description": "abortTransaction is retried maxAttempts=5 times if backpressure labels are added", + "operations": [ + { + "object": "testRunner", + "name": "failPoint", + "arguments": { + "client": "client0", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": "alwaysOn", + "data": { + "failCommands": [ + "abortTransaction" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 112 + } + } + } + }, + { + "object": "session0", + "name": "startTransaction" + }, + { + "object": "collection0", + "name": "insertOne", + "arguments": { + "session": "session0", + "document": { + "_id": 1 + } + }, + "expectResult": { + "$$unsetOrMatches": { + "insertedId": { + "$$unsetOrMatches": 1 + } + } + } + }, + { + "object": "session0", + "name": "abortTransaction" + } + ], + "expectEvents": [ + { + "client": "client0", + "events": [ + { + "commandStartedEvent": { + "command": { + "insert": "test", + "documents": [ + { + "_id": 1 + } + ], + "ordered": true, + "readConcern": { + "$$exists": false + }, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "startTransaction": true, + "autocommit": false, + "writeConcern": { + "$$exists": false + } + }, + 
"commandName": "insert", + "databaseName": "transaction-tests" + } + }, + { + "commandStartedEvent": { + "command": { + "abortTransaction": 1, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "startTransaction": { + "$$exists": false + }, + "autocommit": false, + "writeConcern": { + "$$exists": false + } + }, + "commandName": "abortTransaction", + "databaseName": "admin" + } + }, + { + "commandStartedEvent": { + "commandName": "abortTransaction" + } + }, + { + "commandStartedEvent": { + "commandName": "abortTransaction" + } + }, + { + "commandStartedEvent": { + "commandName": "abortTransaction" + } + }, + { + "commandStartedEvent": { + "commandName": "abortTransaction" + } + }, + { + "commandStartedEvent": { + "commandName": "abortTransaction" + } + } + ] + } + ], + "outcome": [ + { + "collectionName": "test", + "databaseName": "transaction-tests", + "documents": [] + } + ] + } + ] +} diff --git a/source/transactions/tests/unified/backpressure-retryable-abort.yml b/source/transactions/tests/unified/backpressure-retryable-abort.yml new file mode 100644 index 0000000000..35c6c23a90 --- /dev/null +++ b/source/transactions/tests/unified/backpressure-retryable-abort.yml @@ -0,0 +1,213 @@ +description: retryable-abort +schemaVersion: "1.3" +runOnRequirements: + - minServerVersion: "4.4" + topologies: + - replicaset + - sharded + - load-balanced +createEntities: + - + client: + id: &client0 client0 + useMultipleMongoses: false + observeEvents: + - commandStartedEvent + - + database: + id: &database0 database0 + client: *client0 + databaseName: &database_name transaction-tests + - + collection: + id: &collection0 collection0 + database: *database0 + collectionName: &collection_name test + - + session: + id: &session0 session0 + client: *client0 + +initialData: + - + collectionName: *collection_name + databaseName: *database_name + documents: [] +tests: + - description: abortTransaction retries if backpressure labels are added + 
operations: + - object: testRunner + name: failPoint + arguments: + client: *client0 + failPoint: + configureFailPoint: failCommand + mode: + times: 2 + data: + failCommands: + - abortTransaction + errorLabels: + - RetryableError + - SystemOverloadedError + errorCode: 112 + - object: *session0 + name: startTransaction + - object: *collection0 + name: insertOne + arguments: + session: *session0 + document: + _id: 1 + expectResult: + $$unsetOrMatches: + insertedId: + $$unsetOrMatches: 1 + - object: *session0 + name: abortTransaction + expectEvents: + - client: *client0 + events: + - commandStartedEvent: + command: + insert: test + documents: + - _id: 1 + ordered: true + readConcern: + $$exists: false + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + startTransaction: true + autocommit: false + writeConcern: + $$exists: false + commandName: insert + databaseName: *database_name + - commandStartedEvent: + command: + abortTransaction: 1 + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + startTransaction: + $$exists: false + autocommit: false + writeConcern: + $$exists: false + commandName: abortTransaction + databaseName: admin + - commandStartedEvent: + command: + abortTransaction: 1 + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + startTransaction: + $$exists: false + autocommit: false + writeConcern: + $$exists: false + commandName: abortTransaction + databaseName: admin + - commandStartedEvent: + command: + abortTransaction: 1 + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + startTransaction: + $$exists: false + autocommit: false + writeConcern: + $$exists: false + commandName: abortTransaction + databaseName: admin + outcome: + - collectionName: *collection_name + databaseName: *database_name + documents: [] + - description: abortTransaction is retried maxAttempts=5 times if backpressure labels are added + operations: + - object: testRunner + name: failPoint + arguments: + client: *client0 
+ failPoint: + configureFailPoint: failCommand + mode: alwaysOn + data: + failCommands: + - abortTransaction + errorLabels: + - RetryableError + - SystemOverloadedError + errorCode: 112 + - object: *session0 + name: startTransaction + - object: *collection0 + name: insertOne + arguments: + session: *session0 + document: + _id: 1 + expectResult: + $$unsetOrMatches: + insertedId: + $$unsetOrMatches: 1 + - object: *session0 + name: abortTransaction + expectEvents: + - client: *client0 + events: + - commandStartedEvent: + command: + insert: test + documents: + - _id: 1 + ordered: true + readConcern: + $$exists: false + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + startTransaction: true + autocommit: false + writeConcern: + $$exists: false + commandName: insert + databaseName: *database_name + - commandStartedEvent: + command: + abortTransaction: 1 + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + startTransaction: + $$exists: false + autocommit: false + writeConcern: + $$exists: false + commandName: abortTransaction + databaseName: admin + - commandStartedEvent: + commandName: abortTransaction + - commandStartedEvent: + commandName: abortTransaction + - commandStartedEvent: + commandName: abortTransaction + - commandStartedEvent: + commandName: abortTransaction + - commandStartedEvent: + commandName: abortTransaction + outcome: + - collectionName: *collection_name + databaseName: *database_name + documents: [] diff --git a/source/transactions/tests/unified/backpressure-retryable-commit.json b/source/transactions/tests/unified/backpressure-retryable-commit.json new file mode 100644 index 0000000000..9146578564 --- /dev/null +++ b/source/transactions/tests/unified/backpressure-retryable-commit.json @@ -0,0 +1,379 @@ +{ + "description": "retryable-commit", + "schemaVersion": "1.4", + "runOnRequirements": [ + { + "minServerVersion": "4.4", + "topologies": [ + "sharded", + "replicaset", + "load-balanced" + ] + } + ], + 
"createEntities": [ + { + "client": { + "id": "client0", + "useMultipleMongoses": false, + "observeEvents": [ + "commandStartedEvent" + ] + } + }, + { + "database": { + "id": "database0", + "client": "client0", + "databaseName": "transaction-tests" + } + }, + { + "collection": { + "id": "collection0", + "database": "database0", + "collectionName": "test" + } + }, + { + "session": { + "id": "session0", + "client": "client0" + } + } + ], + "initialData": [ + { + "collectionName": "test", + "databaseName": "transaction-tests", + "documents": [] + } + ], + "tests": [ + { + "description": "commitTransaction retries if backpressure labels are added", + "runOnRequirements": [ + { + "serverless": "forbid" + } + ], + "operations": [ + { + "object": "testRunner", + "name": "failPoint", + "arguments": { + "client": "client0", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 2 + }, + "data": { + "failCommands": [ + "commitTransaction" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 112 + } + } + } + }, + { + "object": "session0", + "name": "startTransaction" + }, + { + "object": "collection0", + "name": "insertOne", + "arguments": { + "session": "session0", + "document": { + "_id": 1 + } + }, + "expectResult": { + "$$unsetOrMatches": { + "insertedId": { + "$$unsetOrMatches": 1 + } + } + } + }, + { + "object": "session0", + "name": "commitTransaction" + } + ], + "expectEvents": [ + { + "client": "client0", + "events": [ + { + "commandStartedEvent": { + "command": { + "insert": "test", + "documents": [ + { + "_id": 1 + } + ], + "ordered": true, + "readConcern": { + "$$exists": false + }, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "startTransaction": true, + "autocommit": false, + "writeConcern": { + "$$exists": false + } + }, + "commandName": "insert", + "databaseName": "transaction-tests" + } + }, + { + "commandStartedEvent": { + "command": { + 
"commitTransaction": 1, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "startTransaction": { + "$$exists": false + }, + "autocommit": false, + "writeConcern": { + "$$exists": false + } + }, + "commandName": "commitTransaction", + "databaseName": "admin" + } + }, + { + "commandStartedEvent": { + "command": { + "commitTransaction": 1, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "startTransaction": { + "$$exists": false + }, + "autocommit": false, + "writeConcern": { + "w": "majority", + "wtimeout": 10000 + } + }, + "commandName": "commitTransaction", + "databaseName": "admin" + } + }, + { + "commandStartedEvent": { + "command": { + "commitTransaction": 1, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "startTransaction": { + "$$exists": false + }, + "autocommit": false, + "writeConcern": { + "w": "majority", + "wtimeout": 10000 + } + }, + "commandName": "commitTransaction", + "databaseName": "admin" + } + } + ] + } + ], + "outcome": [ + { + "collectionName": "test", + "databaseName": "transaction-tests", + "documents": [ + { + "_id": 1 + } + ] + } + ] + }, + { + "description": "commitTransaction is retried maxAttempts=5 times if backpressure labels are added", + "runOnRequirements": [ + { + "serverless": "forbid" + } + ], + "operations": [ + { + "object": "testRunner", + "name": "failPoint", + "arguments": { + "client": "client0", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 5 + }, + "data": { + "failCommands": [ + "commitTransaction" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 112 + } + } + } + }, + { + "object": "session0", + "name": "startTransaction" + }, + { + "object": "collection0", + "name": "insertOne", + "arguments": { + "session": "session0", + "document": { + "_id": 1 + } + }, + "expectResult": { + "$$unsetOrMatches": { + "insertedId": { + 
"$$unsetOrMatches": 1 + } + } + } + }, + { + "object": "session0", + "name": "commitTransaction" + } + ], + "expectEvents": [ + { + "client": "client0", + "events": [ + { + "commandStartedEvent": { + "command": { + "insert": "test", + "documents": [ + { + "_id": 1 + } + ], + "ordered": true, + "readConcern": { + "$$exists": false + }, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "startTransaction": true, + "autocommit": false, + "writeConcern": { + "$$exists": false + } + }, + "commandName": "insert", + "databaseName": "transaction-tests" + } + }, + { + "commandStartedEvent": { + "command": { + "commitTransaction": 1, + "lsid": { + "$$sessionLsid": "session0" + }, + "txnNumber": { + "$numberLong": "1" + }, + "startTransaction": { + "$$exists": false + }, + "autocommit": false, + "writeConcern": { + "$$exists": false + } + }, + "commandName": "commitTransaction", + "databaseName": "admin" + } + }, + { + "commandStartedEvent": { + "commandName": "commitTransaction" + } + }, + { + "commandStartedEvent": { + "commandName": "commitTransaction" + } + }, + { + "commandStartedEvent": { + "commandName": "commitTransaction" + } + }, + { + "commandStartedEvent": { + "commandName": "commitTransaction" + } + }, + { + "commandStartedEvent": { + "commandName": "commitTransaction" + } + } + ] + } + ], + "outcome": [ + { + "collectionName": "test", + "databaseName": "transaction-tests", + "documents": [ + { + "_id": 1 + } + ] + } + ] + } + ] +} diff --git a/source/transactions/tests/unified/backpressure-retryable-commit.yml b/source/transactions/tests/unified/backpressure-retryable-commit.yml new file mode 100644 index 0000000000..99a0a5cb4a --- /dev/null +++ b/source/transactions/tests/unified/backpressure-retryable-commit.yml @@ -0,0 +1,222 @@ +description: retryable-commit +schemaVersion: "1.4" +runOnRequirements: + - minServerVersion: "4.4" + topologies: + - sharded + - replicaset + - load-balanced +createEntities: + - + client: + 
id: &client0 client0 + useMultipleMongoses: false + observeEvents: + - commandStartedEvent + - + database: + id: &database0 database0 + client: *client0 + databaseName: &database_name transaction-tests + - + collection: + id: &collection0 collection0 + database: *database0 + collectionName: &collection_name test + - + session: + id: &session0 session0 + client: *client0 + +initialData: + - + collectionName: *collection_name + databaseName: *database_name + documents: [] +tests: + - description: commitTransaction retries if backpressure labels are added + runOnRequirements: + - serverless: forbid + operations: + - object: testRunner + name: failPoint + arguments: + client: *client0 + failPoint: + configureFailPoint: failCommand + mode: + times: 2 + data: + failCommands: + - commitTransaction + errorLabels: + - RetryableError + - SystemOverloadedError + errorCode: 112 + - object: *session0 + name: startTransaction + - object: *collection0 + name: insertOne + arguments: + session: *session0 + document: + _id: 1 + expectResult: + $$unsetOrMatches: + insertedId: + $$unsetOrMatches: 1 + - object: *session0 + name: commitTransaction + expectEvents: + - client: *client0 + events: + - commandStartedEvent: + command: + insert: test + documents: + - _id: 1 + ordered: true + readConcern: + $$exists: false + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + startTransaction: true + autocommit: false + writeConcern: + $$exists: false + commandName: insert + databaseName: *database_name + - commandStartedEvent: + command: + commitTransaction: 1 + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + startTransaction: + $$exists: false + autocommit: false + writeConcern: + $$exists: false + commandName: commitTransaction + databaseName: admin + - commandStartedEvent: + command: + commitTransaction: 1 + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + startTransaction: + $$exists: false + autocommit: false + writeConcern: + w: 
majority + wtimeout: 10000 + commandName: commitTransaction + databaseName: admin + - commandStartedEvent: + command: + commitTransaction: 1 + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + startTransaction: + $$exists: false + autocommit: false + writeConcern: + w: majority + wtimeout: 10000 + commandName: commitTransaction + databaseName: admin + outcome: + - collectionName: *collection_name + databaseName: *database_name + documents: + - _id: 1 + - description: commitTransaction is retried maxAttempts=5 times if backpressure labels are added + runOnRequirements: + - serverless: forbid + operations: + - object: testRunner + name: failPoint + arguments: + client: *client0 + failPoint: + configureFailPoint: failCommand + mode: + times: 5 + data: + failCommands: + - commitTransaction + errorLabels: + - RetryableError + - SystemOverloadedError + errorCode: 112 + - object: *session0 + name: startTransaction + - object: *collection0 + name: insertOne + arguments: + session: *session0 + document: + _id: 1 + expectResult: + $$unsetOrMatches: + insertedId: + $$unsetOrMatches: 1 + - object: *session0 + name: commitTransaction + expectEvents: + - client: *client0 + events: + - commandStartedEvent: + command: + insert: test + documents: + - _id: 1 + ordered: true + readConcern: + $$exists: false + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + startTransaction: true + autocommit: false + writeConcern: + $$exists: false + commandName: insert + databaseName: *database_name + - commandStartedEvent: + command: + commitTransaction: 1 + lsid: + $$sessionLsid: *session0 + txnNumber: + $numberLong: "1" + startTransaction: + $$exists: false + autocommit: false + writeConcern: + $$exists: false + commandName: commitTransaction + databaseName: admin + - commandStartedEvent: + commandName: commitTransaction + - commandStartedEvent: + commandName: commitTransaction + - commandStartedEvent: + commandName: commitTransaction + - commandStartedEvent: + 
commandName: commitTransaction + - commandStartedEvent: + commandName: commitTransaction + outcome: + - collectionName: *collection_name + databaseName: *database_name + documents: + - _id: 1 diff --git a/source/transactions/tests/unified/backpressure-retryable-reads.json b/source/transactions/tests/unified/backpressure-retryable-reads.json index 69388cf887..337cd6de00 100644 --- a/source/transactions/tests/unified/backpressure-retryable-reads.json +++ b/source/transactions/tests/unified/backpressure-retryable-reads.json @@ -3,7 +3,7 @@ "schemaVersion": "1.3", "runOnRequirements": [ { - "minServerVersion": "4.1.8", + "minServerVersion": "4.4", "topologies": [ "replicaset", "sharded", @@ -337,82 +337,6 @@ "documents": [] } ] - }, - { - "description": "retry fails if backpressure labels are added to the first operation in a transaction", - "operations": [ - { - "object": "session0", - "name": "startTransaction" - }, - { - "object": "testRunner", - "name": "failPoint", - "arguments": { - "client": "client0", - "failPoint": { - "configureFailPoint": "failCommand", - "mode": { - "times": 1 - }, - "data": { - "failCommands": [ - "find" - ], - "errorLabels": [ - "RetryableError", - "SystemOverloadedError" - ], - "errorCode": 112 - } - } - } - }, - { - "object": "collection0", - "name": "find", - "arguments": { - "filter": {}, - "session": "session0" - }, - "expectError": { - "isError": true - } - }, - { - "object": "session0", - "name": "abortTransaction" - } - ], - "expectEvents": [ - { - "client": "client0", - "events": [ - { - "commandStartedEvent": { - "commandName": "find" - } - }, - { - "commandStartedEvent": { - "commandName": "find" - } - }, - { - "commandStartedEvent": { - "commandName": "abortTransaction" - } - } - ] - } - ], - "outcome": [ - { - "collectionName": "test", - "databaseName": "transaction-tests", - "documents": [] - } - ] } ] } diff --git a/source/transactions/tests/unified/backpressure-retryable-reads.yml 
b/source/transactions/tests/unified/backpressure-retryable-reads.yml index 79cf65d642..9e3e44d1ed 100644 --- a/source/transactions/tests/unified/backpressure-retryable-reads.yml +++ b/source/transactions/tests/unified/backpressure-retryable-reads.yml @@ -1,31 +1,37 @@ description: backpressure-retryable-reads schemaVersion: "1.3" runOnRequirements: - - minServerVersion: 4.1.8 + - minServerVersion: "4.4" topologies: - replicaset - sharded - load-balanced createEntities: - - client: + - + client: id: &client0 client0 useMultipleMongoses: false observeEvents: - commandStartedEvent - - database: + - + database: id: &database0 database0 client: *client0 - databaseName: &databaseName transaction-tests - - collection: + databaseName: &database_name transaction-tests + - + collection: id: &collection0 collection0 database: *database0 - collectionName: &collectionName test - - session: + collectionName: &collection_name test + - + session: id: &session0 session0 client: *client0 + initialData: - - collectionName: *collectionName - databaseName: *databaseName + - + collectionName: *collection_name + databaseName: *database_name documents: [] tests: - description: reads are retried if backpressure labels are added @@ -84,7 +90,7 @@ tests: writeConcern: $$exists: false commandName: insert - databaseName: *databaseName + databaseName: *database_name - commandStartedEvent: command: find: test @@ -98,7 +104,7 @@ tests: writeConcern: $$exists: false commandName: find - databaseName: *databaseName + databaseName: *database_name - commandStartedEvent: command: find: test @@ -112,7 +118,7 @@ tests: writeConcern: $$exists: false commandName: find - databaseName: *databaseName + databaseName: *database_name - commandStartedEvent: command: abortTransaction: @@ -129,8 +135,8 @@ tests: commandName: commitTransaction databaseName: admin outcome: - - collectionName: *collectionName - databaseName: *databaseName + - collectionName: *collection_name + databaseName: *database_name documents: 
[] - description: reads are retried maxAttempts=5 times if backpressure labels are added operations: @@ -189,47 +195,6 @@ tests: - commandStartedEvent: commandName: abortTransaction outcome: - - collectionName: *collectionName - databaseName: *databaseName - documents: [] - - description: retry fails if backpressure labels are added to the first operation in a transaction - operations: - - object: *session0 - name: startTransaction - - object: testRunner - name: failPoint - arguments: - client: *client0 - failPoint: - configureFailPoint: failCommand - mode: - times: 1 - data: - failCommands: - - find - errorLabels: - - RetryableError - - SystemOverloadedError - errorCode: 112 - - object: *collection0 - name: find - arguments: - filter: {} - session: *session0 - expectError: - isError: true - - object: *session0 - name: abortTransaction - expectEvents: - - client: *client0 - events: - - commandStartedEvent: - commandName: find - - commandStartedEvent: - commandName: find - - commandStartedEvent: - commandName: abortTransaction - outcome: - - collectionName: *collectionName - databaseName: *databaseName + - collectionName: *collection_name + databaseName: *database_name documents: [] diff --git a/source/transactions/tests/unified/backpressure-retryable-writes.json b/source/transactions/tests/unified/backpressure-retryable-writes.json index 9525d07e9c..628dbd44ea 100644 --- a/source/transactions/tests/unified/backpressure-retryable-writes.json +++ b/source/transactions/tests/unified/backpressure-retryable-writes.json @@ -3,7 +3,7 @@ "schemaVersion": "1.3", "runOnRequirements": [ { - "minServerVersion": "4.1.8", + "minServerVersion": "4.4", "topologies": [ "replicaset", "sharded", @@ -355,7 +355,7 @@ ] }, { - "description": "retry fails if backpressure labels are added to the first operation in a transaction", + "description": "retry succeeds if backpressure labels are added to the first operation in a transaction", "operations": [ { "object": "session0", @@ -392,9 
+392,6 @@ "document": { "_id": 2 } - }, - "expectError": { - "isError": true } }, { diff --git a/source/transactions/tests/unified/backpressure-retryable-writes.yml b/source/transactions/tests/unified/backpressure-retryable-writes.yml index b3f68712d7..5a463f8b48 100644 --- a/source/transactions/tests/unified/backpressure-retryable-writes.yml +++ b/source/transactions/tests/unified/backpressure-retryable-writes.yml @@ -1,31 +1,37 @@ description: backpressure-retryable-writes schemaVersion: "1.3" runOnRequirements: - - minServerVersion: 4.1.8 + - minServerVersion: "4.4" topologies: - replicaset - sharded - load-balanced createEntities: - - client: + - + client: id: &client0 client0 useMultipleMongoses: false observeEvents: - commandStartedEvent - - database: + - + database: id: &database0 database0 client: *client0 - databaseName: &databaseName transaction-tests - - collection: + databaseName: &database_name transaction-tests + - + collection: id: &collection0 collection0 database: *database0 - collectionName: &collectionName test - - session: + collectionName: &collection_name test + - + session: id: &session0 session0 client: *client0 + initialData: - - collectionName: *collectionName - databaseName: *databaseName + - + collectionName: *collection_name + databaseName: *database_name documents: [] tests: - description: writes are retried if backpressure labels are added @@ -85,7 +91,7 @@ tests: writeConcern: $$exists: false commandName: insert - databaseName: *databaseName + databaseName: *database_name - commandStartedEvent: command: insert: test @@ -102,7 +108,7 @@ tests: writeConcern: $$exists: false commandName: insert - databaseName: *databaseName + databaseName: *database_name - commandStartedEvent: command: insert: test @@ -119,7 +125,7 @@ tests: writeConcern: $$exists: false commandName: insert - databaseName: *databaseName + databaseName: *database_name - commandStartedEvent: command: abortTransaction: @@ -136,8 +142,8 @@ tests: commandName: 
commitTransaction databaseName: admin outcome: - - collectionName: *collectionName - databaseName: *databaseName + - collectionName: *collection_name + databaseName: *database_name documents: [] - description: writes are retried maxAttempts=5 times if backpressure labels are added operations: @@ -197,10 +203,10 @@ tests: - commandStartedEvent: commandName: abortTransaction outcome: - - collectionName: *collectionName - databaseName: *databaseName + - collectionName: *collection_name + databaseName: *database_name documents: [] - - description: retry fails if backpressure labels are added to the first operation in a transaction + - description: retry succeeds if backpressure labels are added to the first operation in a transaction operations: - object: *session0 name: startTransaction @@ -225,8 +231,6 @@ tests: session: *session0 document: _id: 2 - expectError: - isError: true - object: *session0 name: abortTransaction expectEvents: @@ -239,6 +243,6 @@ tests: - commandStartedEvent: commandName: abortTransaction outcome: - - collectionName: *collectionName - databaseName: *databaseName + - collectionName: *collection_name + databaseName: *database_name documents: [] diff --git a/source/transactions/transactions.md b/source/transactions/transactions.md index 9c812058db..af61b51dcf 100644 --- a/source/transactions/transactions.md +++ b/source/transactions/transactions.md @@ -578,11 +578,12 @@ all preceding commands in the transaction. All commands in a transaction are subject to the [Client Backpressure Specification](../client-backpressure/client-backpressure.md), and MUST be retried accordingly when -the appropriate error labels are added by the server. This includes the `startTransaction`, `abortTransaction`, -`commitTransaction` commands as well as any read or write commands attempted during the transaction. +the appropriate error labels are added by the server. 
This includes the initial command with `startTransaction` set, the +`abortTransaction` and `commitTransaction` commands, as well as any read or write commands attempted during the +transaction. -In the case that the first command in a transaction has backpressure applied, it will eventually fail because the server -will not have started a transaction. +If a command fails with backpressure labels and it has `startTransaction` field set to `true`, the retried command MUST +also set `startTransaction` to `true`. ### **Server Commands** From 88e60671e6f38c319a4e505865829f666289fd8f Mon Sep 17 00:00:00 2001 From: Steven Silvester Date: Wed, 17 Dec 2025 20:34:09 -0600 Subject: [PATCH 25/55] verify commitTransaction fails after 5 backoff attempts --- .../unified/backpressure-retryable-commit.json | 15 ++++++--------- .../unified/backpressure-retryable-commit.yml | 8 ++++---- 2 files changed, 10 insertions(+), 13 deletions(-) diff --git a/source/transactions/tests/unified/backpressure-retryable-commit.json b/source/transactions/tests/unified/backpressure-retryable-commit.json index 9146578564..1da37364f1 100644 --- a/source/transactions/tests/unified/backpressure-retryable-commit.json +++ b/source/transactions/tests/unified/backpressure-retryable-commit.json @@ -238,9 +238,7 @@ "client": "client0", "failPoint": { "configureFailPoint": "failCommand", - "mode": { - "times": 5 - }, + "mode": "alwaysOn", "data": { "failCommands": [ "commitTransaction" @@ -277,7 +275,10 @@ }, { "object": "session0", - "name": "commitTransaction" + "name": "commitTransaction", + "expectError": { + "isError": true + } } ], "expectEvents": [ @@ -367,11 +368,7 @@ { "collectionName": "test", "databaseName": "transaction-tests", - "documents": [ - { - "_id": 1 - } - ] + "documents": [] } ] } diff --git a/source/transactions/tests/unified/backpressure-retryable-commit.yml b/source/transactions/tests/unified/backpressure-retryable-commit.yml index 99a0a5cb4a..d0b0f8df14 100644 --- 
a/source/transactions/tests/unified/backpressure-retryable-commit.yml +++ b/source/transactions/tests/unified/backpressure-retryable-commit.yml @@ -147,8 +147,7 @@ tests: client: *client0 failPoint: configureFailPoint: failCommand - mode: - times: 5 + mode: alwaysOn data: failCommands: - commitTransaction @@ -170,6 +169,8 @@ tests: $$unsetOrMatches: 1 - object: *session0 name: commitTransaction + expectError: + isError: true expectEvents: - client: *client0 events: @@ -218,5 +219,4 @@ tests: outcome: - collectionName: *collection_name databaseName: *database_name - documents: - - _id: 1 + documents: [] From 2019678ae676710af18239b47dc4d53a6b6d721e Mon Sep 17 00:00:00 2001 From: Steven Silvester Date: Thu, 18 Dec 2025 06:26:41 -0600 Subject: [PATCH 26/55] clean up transactions spec --- source/transactions/transactions.md | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/source/transactions/transactions.md b/source/transactions/transactions.md index af61b51dcf..d4514e9ffb 100644 --- a/source/transactions/transactions.md +++ b/source/transactions/transactions.md @@ -559,10 +559,11 @@ a transaction. In MongoDB 4.0 the only supported retryable write commands within a transaction are commitTransaction and abortTransaction. Therefore drivers MUST NOT retry write commands within transactions even when retryWrites has been -enabled on the MongoClient, unless the command has backpressure error labels applied. In addition, drivers MUST NOT add -the RetryableWriteError label to any error that occurs during a write command within a transaction (excepting -commitTransation and abortTransaction), even when retryWrites has been enabled on the MongoClient, unless the command -has backpressure error labels applied. +enabled on the MongoClient, unless the server response has backpressure error labels applied. 
+ +In addition, drivers MUST NOT add the RetryableWriteError label to any error that occurs during a write command within a +transaction (excepting commitTransaction and abortTransaction), even when retryWrites has been enabled on the +MongoClient, unless the server response has backpressure error labels applied. Drivers MUST retry the commitTransaction and abortTransaction commands even when retryWrites has been disabled on the MongoClient. commitTransaction and abortTransaction are retryable write commands and MUST be retried according to the @@ -578,12 +579,15 @@ all preceding commands in the transaction. All commands in a transaction are subject to the [Client Backpressure Specification](../client-backpressure/client-backpressure.md), and MUST be retried accordingly when -the appropriate error labels are added by the server. This includes the initial command with `startTransaction` set, the -`abortTransaction` and `commitTransaction` commands, as well as any read or write commands attempted during the +the appropriate error labels are added by the server. This includes the initial command with `startTransaction:true`, +the `abortTransaction` and `commitTransaction` commands, as well as any read or write commands attempted during the transaction. -If a command fails with backpressure labels and it has `startTransaction` field set to `true`, the retried command MUST -also set `startTransaction` to `true`. +If a command fails with backpressure labels and it includes `startTransaction:true`, the retried command MUST also +include `startTransaction:true`. + +If a command still fails after `MAX_ATTEMPTS` backpressure retries, it MUST NOT be retried again. This includes the +`commitTransaction` command.
### **Server Commands** From d3ce32b398ee3ab11e18d3134effdd67b856d240 Mon Sep 17 00:00:00 2001 From: Steven Silvester Date: Thu, 18 Dec 2025 09:43:39 -0600 Subject: [PATCH 27/55] update test names --- .../tests/unified/backpressure-retryable-abort.json | 2 +- .../transactions/tests/unified/backpressure-retryable-abort.yml | 2 +- .../tests/unified/backpressure-retryable-commit.json | 2 +- .../tests/unified/backpressure-retryable-commit.yml | 2 +- 4 files changed, 4 insertions(+), 4 deletions(-) diff --git a/source/transactions/tests/unified/backpressure-retryable-abort.json b/source/transactions/tests/unified/backpressure-retryable-abort.json index 7dde45f76e..53fc9c6f09 100644 --- a/source/transactions/tests/unified/backpressure-retryable-abort.json +++ b/source/transactions/tests/unified/backpressure-retryable-abort.json @@ -1,5 +1,5 @@ { - "description": "retryable-abort", + "description": "backpressure-retryable-abort", "schemaVersion": "1.3", "runOnRequirements": [ { diff --git a/source/transactions/tests/unified/backpressure-retryable-abort.yml b/source/transactions/tests/unified/backpressure-retryable-abort.yml index 35c6c23a90..85532e1f60 100644 --- a/source/transactions/tests/unified/backpressure-retryable-abort.yml +++ b/source/transactions/tests/unified/backpressure-retryable-abort.yml @@ -1,4 +1,4 @@ -description: retryable-abort +description: backpressure-retryable-abort schemaVersion: "1.3" runOnRequirements: - minServerVersion: "4.4" diff --git a/source/transactions/tests/unified/backpressure-retryable-commit.json b/source/transactions/tests/unified/backpressure-retryable-commit.json index 1da37364f1..b813c42938 100644 --- a/source/transactions/tests/unified/backpressure-retryable-commit.json +++ b/source/transactions/tests/unified/backpressure-retryable-commit.json @@ -1,5 +1,5 @@ { - "description": "retryable-commit", + "description": "backpressure-retryable-commit", "schemaVersion": "1.4", "runOnRequirements": [ { diff --git 
a/source/transactions/tests/unified/backpressure-retryable-commit.yml b/source/transactions/tests/unified/backpressure-retryable-commit.yml index d0b0f8df14..c8e4a20c2d 100644 --- a/source/transactions/tests/unified/backpressure-retryable-commit.yml +++ b/source/transactions/tests/unified/backpressure-retryable-commit.yml @@ -1,4 +1,4 @@ -description: retryable-commit +description: backpressure-retryable-commit schemaVersion: "1.4" runOnRequirements: - minServerVersion: "4.4" From de7e862e1867bd064161d966e05fdb3413422514 Mon Sep 17 00:00:00 2001 From: Steven Silvester Date: Thu, 18 Dec 2025 09:58:32 -0600 Subject: [PATCH 28/55] address writeconcern on retries --- .../tests/unified/backpressure-retryable-commit.json | 6 ++---- .../tests/unified/backpressure-retryable-commit.yml | 6 ++---- source/transactions/transactions.md | 3 ++- 3 files changed, 6 insertions(+), 9 deletions(-) diff --git a/source/transactions/tests/unified/backpressure-retryable-commit.json b/source/transactions/tests/unified/backpressure-retryable-commit.json index b813c42938..ae873561a9 100644 --- a/source/transactions/tests/unified/backpressure-retryable-commit.json +++ b/source/transactions/tests/unified/backpressure-retryable-commit.json @@ -177,8 +177,7 @@ }, "autocommit": false, "writeConcern": { - "w": "majority", - "wtimeout": 10000 + "$$exists": false } }, "commandName": "commitTransaction", @@ -200,8 +199,7 @@ }, "autocommit": false, "writeConcern": { - "w": "majority", - "wtimeout": 10000 + "$$exists": false } }, "commandName": "commitTransaction", diff --git a/source/transactions/tests/unified/backpressure-retryable-commit.yml b/source/transactions/tests/unified/backpressure-retryable-commit.yml index c8e4a20c2d..8099e1c1eb 100644 --- a/source/transactions/tests/unified/backpressure-retryable-commit.yml +++ b/source/transactions/tests/unified/backpressure-retryable-commit.yml @@ -113,8 +113,7 @@ tests: $$exists: false autocommit: false writeConcern: - w: majority - wtimeout: 10000 + 
$$exists: false commandName: commitTransaction databaseName: admin - commandStartedEvent: @@ -128,8 +127,7 @@ tests: $$exists: false autocommit: false writeConcern: - w: majority - wtimeout: 10000 + $$exists: false commandName: commitTransaction databaseName: admin outcome: diff --git a/source/transactions/transactions.md b/source/transactions/transactions.md index d4514e9ffb..4d8cc197fd 100644 --- a/source/transactions/transactions.md +++ b/source/transactions/transactions.md @@ -1062,7 +1062,8 @@ transaction. ### Majority write concern is used when retrying commitTransaction Drivers should apply a majority write concern when retrying commitTransaction to guard against a transaction being -applied twice. +applied twice. Note that this does not apply when retrying commitTransaction after a backpressure retry, since we are +sure that the transaction has not been applied. Consider the following scenario: From ab85a5bf30439cac0d91d6830b884a7b31873d56 Mon Sep 17 00:00:00 2001 From: bailey Date: Thu, 18 Dec 2025 08:02:03 -0700 Subject: [PATCH 29/55] add retryable get more tests --- .../tests/getMore-retried.json | 286 ++++++++++++++++++ .../tests/getMore-retried.yml | 147 +++++++++ 2 files changed, 433 insertions(+) create mode 100644 source/client-backpressure/tests/getMore-retried.json create mode 100644 source/client-backpressure/tests/getMore-retried.yml diff --git a/source/client-backpressure/tests/getMore-retried.json b/source/client-backpressure/tests/getMore-retried.json new file mode 100644 index 0000000000..f60ad4187d --- /dev/null +++ b/source/client-backpressure/tests/getMore-retried.json @@ -0,0 +1,286 @@ +{ + "description": "getMore-retries-backpressure", + "schemaVersion": "1.3", + "createEntities": [ + { + "client": { + "id": "client0", + "observeEvents": [ + "commandStartedEvent", + "commandFailedEvent", + "commandSucceededEvent" + ] + } + }, + { + "client": { + "id": "failPointClient" + } + }, + { + "database": { + "id": "db", + "client": "client0", + 
"databaseName": "default" + } + }, + { + "collection": { + "id": "coll", + "database": "db", + "collectionName": "default" + } + } + ], + "initialData": [ + { + "databaseName": "default", + "collectionName": "default", + "documents": [ + { + "a": 1 + }, + { + "a": 2 + }, + { + "a": 3 + } + ] + } + ], + "tests": [ + { + "description": "getMores are retried", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": { + "times": 3 + }, + "data": { + "failCommands": [ + "getMore" + ], + "errorLabels": [ + "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "name": "find", + "arguments": { + "batchSize": 2, + "filter": {}, + "sort": { + "a": 1 + } + }, + "object": "coll", + "expectResult": [ + { + "a": 1 + }, + { + "a": 2 + }, + { + "a": 3 + } + ] + } + ], + "expectEvents": [ + { + "client": "client0", + "events": [ + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandSucceededEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "getMore" + } + }, + { + "commandFailedEvent": { + "commandName": "getMore" + } + }, + { + "commandStartedEvent": { + "commandName": "getMore" + } + }, + { + "commandFailedEvent": { + "commandName": "getMore" + } + }, + { + "commandStartedEvent": { + "commandName": "getMore" + } + }, + { + "commandFailedEvent": { + "commandName": "getMore" + } + }, + { + "commandStartedEvent": { + "commandName": "getMore" + } + }, + { + "commandSucceededEvent": { + "commandName": "getMore" + } + } + ] + } + ] + }, + { + "description": "getMores are retried maxAttempts=5 times", + "operations": [ + { + "name": "failPoint", + "object": "testRunner", + "arguments": { + "client": "failPointClient", + "failPoint": { + "configureFailPoint": "failCommand", + "mode": "alwaysOn", + "data": { + "failCommands": [ + "getMore" + ], + "errorLabels": [ 
+ "RetryableError", + "SystemOverloadedError" + ], + "errorCode": 2 + } + } + } + }, + { + "name": "find", + "arguments": { + "batchSize": 2, + "filter": {} + }, + "object": "coll", + "expectError": { + "isError": true, + "isClientError": false + } + } + ], + "expectEvents": [ + { + "client": "client0", + "events": [ + { + "commandStartedEvent": { + "commandName": "find" + } + }, + { + "commandSucceededEvent": { + "commandName": "find" + } + }, + { + "commandStartedEvent": { + "commandName": "getMore" + } + }, + { + "commandFailedEvent": { + "commandName": "getMore" + } + }, + { + "commandStartedEvent": { + "commandName": "getMore" + } + }, + { + "commandFailedEvent": { + "commandName": "getMore" + } + }, + { + "commandStartedEvent": { + "commandName": "getMore" + } + }, + { + "commandFailedEvent": { + "commandName": "getMore" + } + }, + { + "commandStartedEvent": { + "commandName": "getMore" + } + }, + { + "commandFailedEvent": { + "commandName": "getMore" + } + }, + { + "commandStartedEvent": { + "commandName": "getMore" + } + }, + { + "commandFailedEvent": { + "commandName": "getMore" + } + }, + { + "commandStartedEvent": { + "commandName": "getMore" + } + }, + { + "commandFailedEvent": { + "commandName": "getMore" + } + }, + { + "commandStartedEvent": { + "commandName": "killCursors" + } + }, + { + "commandSucceededEvent": { + "commandName": "killCursors" + } + } + ] + } + ] + } + ] +} diff --git a/source/client-backpressure/tests/getMore-retried.yml b/source/client-backpressure/tests/getMore-retried.yml new file mode 100644 index 0000000000..aaef033500 --- /dev/null +++ b/source/client-backpressure/tests/getMore-retried.yml @@ -0,0 +1,147 @@ +description: getMore-retries-backpressure +schemaVersion: "1.3" + +createEntities: + - client: + id: &client client0 + observeEvents: + - commandStartedEvent + - commandFailedEvent + - commandSucceededEvent + - client: + id: &failPointClient failPointClient + - database: + id: db + client: *client + databaseName: &dbName 
default + - collection: + id: &collection coll + database: db + collectionName: &collectionName default +initialData: + - databaseName: *dbName + collectionName: *collectionName + documents: + - { a: 1 } + - { a: 2 } + - { a: 3 } + +tests: + - description: "getMores are retried" + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: { times: 3 } + data: + failCommands: [getMore] + errorLabels: [RetryableError, SystemOverloadedError] + errorCode: 2 + + - name: find + arguments: + # batch size of 2 with 3 docs in the collection ensures exactly one find + one getMore exhaust the cursor + batchSize: 2 + filter: {} + # ensure stable ordering of result documents + sort: { a: 1 } + object: *collection + expectResult: + - { a: 1 } + - { a: 2 } + - { a: 3 } + expectEvents: + - client: *client + events: + - commandStartedEvent: + commandName: find + - commandSucceededEvent: + commandName: find + # first attempt + - commandStartedEvent: + commandName: getMore + - commandFailedEvent: + commandName: getMore + # second attempt + - commandStartedEvent: + commandName: getMore + - commandFailedEvent: + commandName: getMore + # third attempt + - commandStartedEvent: + commandName: getMore + - commandFailedEvent: + commandName: getMore + # success + - commandStartedEvent: + commandName: getMore + - commandSucceededEvent: + commandName: getMore + + - description: "getMores are retried maxAttempts=5 times" + operations: + - name: failPoint + object: testRunner + arguments: + client: *failPointClient + failPoint: + configureFailPoint: failCommand + mode: alwaysOn + data: + failCommands: [getMore] + errorLabels: [RetryableError, SystemOverloadedError] + errorCode: 2 + + - name: find + arguments: + batchSize: 2 + filter: {} + object: *collection + expectError: + isError: true + isClientError: false + + expectEvents: + - client: *client + events: + - commandStartedEvent: + commandName: find + - 
commandSucceededEvent: + commandName: find + # first attempt + - commandStartedEvent: + commandName: getMore + - commandFailedEvent: + commandName: getMore + # second attempt + - commandStartedEvent: + commandName: getMore + - commandFailedEvent: + commandName: getMore + # third attempt + - commandStartedEvent: + commandName: getMore + - commandFailedEvent: + commandName: getMore + # fourth attempt + - commandStartedEvent: + commandName: getMore + - commandFailedEvent: + commandName: getMore + # fifth attempt + - commandStartedEvent: + commandName: getMore + - commandFailedEvent: + commandName: getMore + # final attempt + - commandStartedEvent: + commandName: getMore + - commandFailedEvent: + commandName: getMore + - commandStartedEvent: + commandName: killCursors + - commandSucceededEvent: + commandName: killCursors \ No newline at end of file From eb10ddbda64cc8e01f34d40747df0fa27a999946 Mon Sep 17 00:00:00 2001 From: bailey Date: Thu, 18 Dec 2025 12:12:46 -0700 Subject: [PATCH 30/55] transaction test fixes --- .../unified/backpressure-retryable-reads.json | 14 -------------- .../tests/unified/backpressure-retryable-reads.yml | 8 -------- .../unified/backpressure-retryable-writes.json | 9 ++++++++- .../unified/backpressure-retryable-writes.yml | 4 +++- 4 files changed, 11 insertions(+), 24 deletions(-) diff --git a/source/transactions/tests/unified/backpressure-retryable-reads.json b/source/transactions/tests/unified/backpressure-retryable-reads.json index 337cd6de00..731762830e 100644 --- a/source/transactions/tests/unified/backpressure-retryable-reads.json +++ b/source/transactions/tests/unified/backpressure-retryable-reads.json @@ -213,13 +213,6 @@ } ] } - ], - "outcome": [ - { - "collectionName": "test", - "databaseName": "transaction-tests", - "documents": [] - } ] }, { @@ -329,13 +322,6 @@ } ] } - ], - "outcome": [ - { - "collectionName": "test", - "databaseName": "transaction-tests", - "documents": [] - } ] } ] diff --git 
a/source/transactions/tests/unified/backpressure-retryable-reads.yml b/source/transactions/tests/unified/backpressure-retryable-reads.yml index 9e3e44d1ed..18bbdaadbf 100644 --- a/source/transactions/tests/unified/backpressure-retryable-reads.yml +++ b/source/transactions/tests/unified/backpressure-retryable-reads.yml @@ -134,10 +134,6 @@ tests: $$exists: false commandName: commitTransaction databaseName: admin - outcome: - - collectionName: *collection_name - databaseName: *database_name - documents: [] - description: reads are retried maxAttempts=5 times if backpressure labels are added operations: - object: *session0 @@ -194,7 +190,3 @@ tests: commandName: find - commandStartedEvent: commandName: abortTransaction - outcome: - - collectionName: *collection_name - databaseName: *database_name - documents: [] diff --git a/source/transactions/tests/unified/backpressure-retryable-writes.json b/source/transactions/tests/unified/backpressure-retryable-writes.json index 628dbd44ea..0817e03f2f 100644 --- a/source/transactions/tests/unified/backpressure-retryable-writes.json +++ b/source/transactions/tests/unified/backpressure-retryable-writes.json @@ -232,7 +232,14 @@ { "collectionName": "test", "databaseName": "transaction-tests", - "documents": [] + "documents": [ + { + "_id": 1 + }, + { + "_id": 2 + } + ] } ] }, diff --git a/source/transactions/tests/unified/backpressure-retryable-writes.yml b/source/transactions/tests/unified/backpressure-retryable-writes.yml index 5a463f8b48..630c9d9694 100644 --- a/source/transactions/tests/unified/backpressure-retryable-writes.yml +++ b/source/transactions/tests/unified/backpressure-retryable-writes.yml @@ -144,7 +144,9 @@ tests: outcome: - collectionName: *collection_name databaseName: *database_name - documents: [] + documents: + - { _id: 1 } + - { _id: 2 } - description: writes are retried maxAttempts=5 times if backpressure labels are added operations: - object: *session0 From 2b9069760997b98bf9c35cabb406ebf1132a44d3 Mon Sep 
17 00:00:00 2001 From: bailey Date: Thu, 18 Dec 2025 13:52:23 -0700 Subject: [PATCH 31/55] deduplicate ids --- source/client-backpressure/tests/backpressure-retry-loop.json | 4 ++-- source/client-backpressure/tests/backpressure-retry-loop.yml | 2 +- .../tests/backpressure-retry-loop.yml.template | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/source/client-backpressure/tests/backpressure-retry-loop.json b/source/client-backpressure/tests/backpressure-retry-loop.json index c4aab441a3..f9253774b3 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.json +++ b/source/client-backpressure/tests/backpressure-retry-loop.json @@ -31,7 +31,7 @@ }, { "database": { - "id": "database", + "id": "internal_db", "client": "internal_client", "databaseName": "retryable-writes-tests" } @@ -39,7 +39,7 @@ { "collection": { "id": "retryable-writes-tests", - "database": "database", + "database": "internal_db", "collectionName": "coll" } }, diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml b/source/client-backpressure/tests/backpressure-retry-loop.yml index 0112330fcf..8e1aa7f00e 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml @@ -23,7 +23,7 @@ createEntities: - database: - id: &internal_db database + id: &internal_db internal_db client: *internal_client databaseName: &database_name retryable-writes-tests diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml.template b/source/client-backpressure/tests/backpressure-retry-loop.yml.template index f83f462e8a..44e0bdf99a 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml.template @@ -23,7 +23,7 @@ createEntities: - database: - id: &internal_db database + id: &internal_db internal_db client: *internal_client databaseName: &database_name retryable-writes-tests From 
0abf37351a9c198be47550db5b08f983c09637bb Mon Sep 17 00:00:00 2001
From: Steven Silvester
Date: Thu, 18 Dec 2025 15:33:47 -0600
Subject: [PATCH 32/55] update transaction writeconcern logic and add changelog entry

---
 source/transactions/transactions.md | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/source/transactions/transactions.md b/source/transactions/transactions.md
index 4d8cc197fd..a375181cc5 100644
--- a/source/transactions/transactions.md
+++ b/source/transactions/transactions.md
@@ -1061,9 +1061,11 @@ transaction.
 
 ### Majority write concern is used when retrying commitTransaction
 
-Drivers should apply a majority write concern when retrying commitTransaction to guard against a transaction being
-applied twice. Note that this does not apply when retrying commitTransaction after a backpressure retry, since we are
-sure that the transaction has not been applied.
+Drivers SHOULD apply a majority write concern when retrying commitTransaction to guard against a transaction being
+applied twice.
+
+Drivers SHOULD NOT modify the write concern on commit transaction commands when retrying a backpressure error, since we
+are sure that the transaction has not been applied.
 
 Consider the following scenario:
 
@@ -1108,6 +1110,8 @@ objective of avoiding duplicate commits.
 
 ## **Changelog**
 
+- 2025-12-18: Specify the handling of client backpressure.
+
 - 2024-11-01: Clarify collection options inside txn.
 
 - 2024-11-01: Specify that ClientSession must be unpinned when ended.
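
Illustration of the transactions.md change in the patch above: a minimal Python sketch of how a driver might choose the write concern for a commitTransaction retry under the two rules in that hunk. It is not code from any driver. The helper name `write_concern_for_commit_retry`, the plain-dict representation of a write concern, and the use of the `SystemOverloadedError` label (borrowed from the failCommand test files in this series) to recognize a backpressure error are all assumptions made for the example.

```python
from typing import Optional, Set

# Assumption: a backpressure error is recognized by this error label,
# as used in the failCommand test files of this patch series.
OVERLOAD_LABEL = "SystemOverloadedError"


def write_concern_for_commit_retry(
    original: Optional[dict], error_labels: Set[str]
) -> Optional[dict]:
    """Hypothetical helper: write concern for a commitTransaction retry."""
    if OVERLOAD_LABEL in error_labels:
        # Backpressure retry: the transaction has not been applied,
        # so the write concern is left unmodified.
        return original
    # Any other retry: the first attempt may have been applied, so use
    # w: "majority" and keep the remaining options as they were.
    upgraded = dict(original or {})
    upgraded["w"] = "majority"
    return upgraded


# Usage: an ordinary retry is upgraded, a backpressure retry is not.
print(write_concern_for_commit_retry({"w": 1, "j": True}, {"RetryableWriteError"}))
print(write_concern_for_commit_retry({"w": 1}, {"RetryableError", OVERLOAD_LABEL}))
```

Copying the original dict before setting `w` keeps the caller's write concern untouched, so a later backpressure retry of the same commit can still send it as first written.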
From d07d49ec674358f07360787a57654e45fc61f749 Mon Sep 17 00:00:00 2001 From: bailey Date: Thu, 18 Dec 2025 15:50:19 -0700 Subject: [PATCH 33/55] last few comments --- .../client-backpressure.md | 2 +- source/client-backpressure/tests/README.md | 6 +- .../tests/backpressure-retry-loop.json | 231 +----------------- .../tests/backpressure-retry-loop.yml | 227 +++-------------- .../backpressure-retry-loop.yml.template | 11 +- 5 files changed, 50 insertions(+), 427 deletions(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index a8eeb714bb..0fbba54e9f 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -100,7 +100,7 @@ collection, getMore, and generic runCommand. The new command execution method ob - The value of `MAX_ATTEMPTS` is 5 and non-configurable. - This intentionally changes the behavior of CSOT which otherwise would retry an unlimited number of times within the timeout to avoid retry storms. -5. If a retry attempt is to be attempted, a token will be consumed from the token bucket. +5. A retry attempt consumes 1 token from the token bucket. 6. If the request is eligible for retry (as outlined in step 4), the client MUST apply exponential backoff according to the following formula: `delayMS = j * min(maxBackoff, baseBackoff * 2^i)` - `i` is the retry attempt number (starting with 0 for the first retry). diff --git a/source/client-backpressure/tests/README.md b/source/client-backpressure/tests/README.md index b4b03085e6..9f49b1ed20 100644 --- a/source/client-backpressure/tests/README.md +++ b/source/client-backpressure/tests/README.md @@ -35,7 +35,7 @@ Drivers should test that retries do not occur immediately when a SystemOverloade } ``` - 3. Execute the document `{ a: 1 }`. Expect that the command errors. Measure the duration of the command execution. + 3. Insert the document `{ a: 1 }`. Expect that the command errors. 
Measure the duration of the command execution. ```javascript const start = performance.now(); @@ -55,7 +55,3 @@ Drivers should test that retries do not occur immediately when a SystemOverloade ``` The sum of 5 backoffs is 3.1 seconds. There is a 1-second window to account for potential variance between the two runs. - -## Changelog - -- 2025-XX-XX: Initial version. diff --git a/source/client-backpressure/tests/backpressure-retry-loop.json b/source/client-backpressure/tests/backpressure-retry-loop.json index f9253774b3..2542344b38 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.json +++ b/source/client-backpressure/tests/backpressure-retry-loop.json @@ -58,6 +58,13 @@ } } ], + "initialData": [ + { + "collectionName": "coll", + "databaseName": "retryable-writes-tests", + "documents": [] + } + ], "_yamlAnchors": { "bulWriteInsertNamespace": "retryable-writes-tests.coll" }, @@ -65,13 +72,6 @@ { "description": "client.listDatabases retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -154,13 +154,6 @@ { "description": "client.listDatabaseNames retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -240,13 +233,6 @@ { "description": "client.createChangeStream retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -334,13 +320,6 @@ } ], "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -433,13 +412,6 @@ { "description": "database.aggregate retries using operation loop", "operations": [ - { 
- "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -529,13 +501,6 @@ { "description": "database.listCollections retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -618,13 +583,6 @@ { "description": "database.listCollectionNames retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -707,13 +665,6 @@ { "description": "database.runCommand retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -799,13 +750,6 @@ { "description": "database.createChangeStream retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -888,13 +832,6 @@ { "description": "collection.aggregate retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -977,13 +914,6 @@ { "description": "collection.countDocuments retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -1066,13 +996,6 @@ { "description": "collection.estimatedDocumentCount retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": 
"testRunner", @@ -1152,13 +1075,6 @@ { "description": "collection.distinct retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -1242,13 +1158,6 @@ { "description": "collection.find retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -1331,13 +1240,6 @@ { "description": "collection.findOne retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -1420,13 +1322,6 @@ { "description": "collection.listIndexes retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -1506,13 +1401,6 @@ { "description": "collection.listIndexNames retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -1592,13 +1480,6 @@ { "description": "collection.createChangeStream retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -1681,13 +1562,6 @@ { "description": "collection.insertOne retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -1773,13 +1647,6 @@ { "description": "collection.insertMany retries using operation loop", "operations": [ - { - "object": 
"retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -1867,13 +1734,6 @@ { "description": "collection.deleteOne retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -1956,13 +1816,6 @@ { "description": "collection.deleteMany retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -2045,13 +1898,6 @@ { "description": "collection.replaceOne retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -2137,13 +1983,6 @@ { "description": "collection.updateOne retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -2231,13 +2070,6 @@ { "description": "collection.updateMany retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -2325,13 +2157,6 @@ { "description": "collection.findOneAndDelete retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -2414,13 +2239,6 @@ { "description": "collection.findOneAndReplace retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ 
-2506,13 +2324,6 @@ { "description": "collection.findOneAndUpdate retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -2600,13 +2411,6 @@ { "description": "collection.bulkWrite retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -2698,13 +2502,6 @@ { "description": "collection.createIndex retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", @@ -2790,13 +2587,6 @@ { "description": "collection.dropIndex retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "object": "retryable-writes-tests", "name": "createIndex", @@ -2889,13 +2679,6 @@ { "description": "collection.dropIndexes retries using operation loop", "operations": [ - { - "object": "retryable-writes-tests", - "name": "deleteMany", - "arguments": { - "filter": {} - } - }, { "name": "failPoint", "object": "testRunner", diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml b/source/client-backpressure/tests/backpressure-retry-loop.yml index 8e1aa7f00e..6a3033989b 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml @@ -45,6 +45,11 @@ createEntities: database: *database collectionName: *collection_name +initialData: +- collectionName: *collection_name + databaseName: *database_name + documents: [] + _yamlAnchors: bulWriteInsertNamespace: &client_bulk_write_ns retryable-writes-tests.coll @@ -52,12 +57,7 @@ tests: - description: 'client.listDatabases retries using 
operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -100,12 +100,7 @@ tests: - description: 'client.listDatabaseNames retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -146,12 +141,7 @@ tests: - description: 'client.createChangeStream retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -196,12 +186,7 @@ tests: description: 'client.clientBulkWrite retries using operation loop' runOnRequirements: - minServerVersion: '8.0' # client bulk write added to server in 8.0 - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -247,12 +232,7 @@ tests: - description: 'database.aggregate retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -295,12 +275,7 @@ tests: - description: 'database.listCollections retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -343,12 +318,7 @@ tests: - description: 'database.listCollectionNames retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -391,12 +361,7 @@ tests: - description: 'database.runCommand retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -440,12 +405,7 @@ tests: - description: 'database.createChangeStream retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + 
operations: - name: failPoint @@ -488,12 +448,7 @@ tests: - description: 'collection.aggregate retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -536,12 +491,7 @@ tests: - description: 'collection.countDocuments retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -584,12 +534,7 @@ tests: - description: 'collection.estimatedDocumentCount retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -630,12 +575,7 @@ tests: - description: 'collection.distinct retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -679,12 +619,7 @@ tests: - description: 'collection.find retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -727,12 +662,7 @@ tests: - description: 'collection.findOne retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -775,12 +705,7 @@ tests: - description: 'collection.listIndexes retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -821,12 +746,7 @@ tests: - description: 'collection.listIndexNames retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -867,12 +787,7 @@ tests: - description: 'collection.createChangeStream retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: 
- filter: {} + operations: - name: failPoint @@ -915,12 +830,7 @@ tests: - description: 'collection.insertOne retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -963,12 +873,7 @@ tests: - description: 'collection.insertMany retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -1012,12 +917,7 @@ tests: - description: 'collection.deleteOne retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -1060,12 +960,7 @@ tests: - description: 'collection.deleteMany retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -1108,12 +1003,7 @@ tests: - description: 'collection.replaceOne retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -1157,12 +1047,7 @@ tests: - description: 'collection.updateOne retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -1206,12 +1091,7 @@ tests: - description: 'collection.updateMany retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -1255,12 +1135,7 @@ tests: - description: 'collection.findOneAndDelete retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -1303,12 +1178,7 @@ tests: - description: 'collection.findOneAndReplace retries using operation loop' - operations: - - - object: *internal_collection - name: 
deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -1352,12 +1222,7 @@ tests: - description: 'collection.findOneAndUpdate retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -1401,12 +1266,7 @@ tests: - description: 'collection.bulkWrite retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -1451,12 +1311,7 @@ tests: - description: 'collection.createIndex retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint @@ -1501,11 +1356,6 @@ tests: - description: 'collection.dropIndex retries using operation loop' operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} - object: *internal_collection name: createIndex @@ -1554,12 +1404,7 @@ tests: - description: 'collection.dropIndexes retries using operation loop' - operations: - - - object: *internal_collection - name: deleteMany - arguments: - filter: {} + operations: - name: failPoint diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml.template b/source/client-backpressure/tests/backpressure-retry-loop.yml.template index 44e0bdf99a..df1afe95cf 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml.template @@ -45,6 +45,11 @@ createEntities: database: *database collectionName: *collection_name +initialData: +- collectionName: *collection_name + databaseName: *database_name + documents: [] + _yamlAnchors: bulWriteInsertNamespace: &client_bulk_write_ns retryable-writes-tests.coll @@ -57,12 +62,6 @@ tests: - minServerVersion: '8.0' # client bulk write added to server in 8.0 {%- endif %} operations: - - - object: *internal_collection - name: 
deleteMany - arguments: - filter: {} - {%- if operation.operation_name == "dropIndex" %} - object: *internal_collection From 747c18cf1ad6ebc3a6aaee9900e289c38498440d Mon Sep 17 00:00:00 2001 From: Steven Silvester Date: Fri, 19 Dec 2025 06:21:00 -0600 Subject: [PATCH 34/55] add runOnRequirement for getMore tests --- source/client-backpressure/tests/getMore-retried.json | 5 +++++ source/client-backpressure/tests/getMore-retried.yml | 4 +++- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/source/client-backpressure/tests/getMore-retried.json b/source/client-backpressure/tests/getMore-retried.json index f60ad4187d..70eff84612 100644 --- a/source/client-backpressure/tests/getMore-retried.json +++ b/source/client-backpressure/tests/getMore-retried.json @@ -1,6 +1,11 @@ { "description": "getMore-retries-backpressure", "schemaVersion": "1.3", + "runOnRequirements": [ + { + "minServerVersion": "4.4" + } + ], "createEntities": [ { "client": { diff --git a/source/client-backpressure/tests/getMore-retried.yml b/source/client-backpressure/tests/getMore-retried.yml index aaef033500..3a5d180aa5 100644 --- a/source/client-backpressure/tests/getMore-retried.yml +++ b/source/client-backpressure/tests/getMore-retried.yml @@ -1,6 +1,8 @@ description: getMore-retries-backpressure schemaVersion: "1.3" - +runOnRequirements: + - + minServerVersion: '4.4' # failCommand createEntities: - client: id: &client client0 From be27bc081fde81deeb4c3f971bd929e790ea988a Mon Sep 17 00:00:00 2001 From: bailey Date: Fri, 19 Dec 2025 12:00:59 -0700 Subject: [PATCH 35/55] updated formula --- source/client-backpressure/client-backpressure.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index 0fbba54e9f..c854e9cf90 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -102,8 +102,8 @@ collection, 
getMore, and generic runCommand. The new command execution method ob timeout to avoid retry storms. 5. A retry attempt consumes 1 token from the token bucket. 6. If the request is eligible for retry (as outlined in step 4), the client MUST apply exponential backoff according to - the following formula: `delayMS = j * min(maxBackoff, baseBackoff * 2^i)` - - `i` is the retry attempt number (starting with 0 for the first retry). + the following formula: `delayMS = j * min(maxBackoff, baseBackoff * 2^(i - 1))` + - `i` is the retry attempt number (starting with 1 for the first retry). - `j` is a random jitter value between 0 and 1. - `baseBackoff` is constant 100ms. - `maxBackoff` is 10000ms. From b674f13e32617a8a1296d5c5caf3f3248dcc0249 Mon Sep 17 00:00:00 2001 From: Steven Silvester Date: Mon, 22 Dec 2025 07:47:46 -0600 Subject: [PATCH 36/55] lint --- source/logging/logging.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/source/logging/logging.md b/source/logging/logging.md index 8971dbcde7..6f7c36c200 100644 --- a/source/logging/logging.md +++ b/source/logging/logging.md @@ -95,7 +95,7 @@ Drivers MUST support configuring where log messages should be output, including > - If the value is "stdout" (case-insensitive), log to stdout. > - If the value is "stderr" (case-insensitive), log to stderr. > - Else, if direct logging to files is supported, log to a file at the specified path. If the file already exists, it - > MUST be appended to. + > MUST be appended to. > > If the variable is not provided or is set to an invalid value (which could be invalid for any reason, e.g. 
the path > does not exist or is not writeable), the driver MUST log to stderr and the driver MAY attempt to warn the user about From 03065ad1738be7c5cc23ac8f690fd705d9480b69 Mon Sep 17 00:00:00 2001 From: Steven Silvester Date: Tue, 23 Dec 2025 08:55:39 -0600 Subject: [PATCH 37/55] address review --- source/client-backpressure/client-backpressure.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index c854e9cf90..730edc52e8 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -127,7 +127,10 @@ retryable reads and retryable writes specifications. Drivers MUST ensure: #### Pseudocode -The following pseudocode describes the overload retry policy: +The following pseudocode demonstrates the unified retry behavior, combining the overload retry policy defined in this +specification with the existing retry behaviors from [Retryable Reads](../retryable-reads/retryable-reads.md) and +[Retryable Writes](../retryable-reads/retryable-writes.md). For brevity, some error handling details such as the +handling of "NoWritesPerformed" are omitted. ```python # Note: the values below have been scaled down by a factor of 1000 because From 1b7f6df0a7b883b68af6e53e535c8c6e04f37bcc Mon Sep 17 00:00:00 2001 From: Steven Silvester Date: Tue, 23 Dec 2025 08:56:55 -0600 Subject: [PATCH 38/55] fix link --- source/client-backpressure/client-backpressure.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index 730edc52e8..bc5e95d190 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -129,7 +129,7 @@ retryable reads and retryable writes specifications. 
Drivers MUST ensure: The following pseudocode demonstrates the unified retry behavior, combining the overload retry policy defined in this specification with the existing retry behaviors from [Retryable Reads](../retryable-reads/retryable-reads.md) and -[Retryable Writes](../retryable-reads/retryable-writes.md). For brevity, some error handling details such as the +[Retryable Writes](../retryable-writes/retryable-writes.md). For brevity, some error handling details such as the handling of "NoWritesPerformed" are omitted. ```python From eac90fcd23b601814e6c20c3985f9d08c0d065c6 Mon Sep 17 00:00:00 2001 From: bailey Date: Mon, 5 Jan 2026 14:01:59 -0700 Subject: [PATCH 39/55] first round of comments addressed --- .../client-backpressure.md | 31 ++++++++++--------- 1 file changed, 16 insertions(+), 15 deletions(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index bc5e95d190..04317ef99c 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -94,12 +94,12 @@ collection, getMore, and generic runCommand. The new command execution method ob retry budget tracking, this counts as a success. 4. A retry attempt will only be permitted if: 1. The error has both the `SystemOverloadedError` and the `RetryableError` label. - 2. We have not reached `MAX_ATTEMPTS`. + 2. We have not reached `MAX_RETRIES`. + - The value of `MAX_RETRIES` is 5 and non-configurable. + - This intentionally changes the behavior of CSOT which otherwise would retry an unlimited number of times within + the timeout to avoid retry storms. 3. (CSOT-only): `timeoutMS` has not expired. 4. (`SystemOverloadedError` errors only) a token can be acquired from the token bucket. - - The value of `MAX_ATTEMPTS` is 5 and non-configurable. - - This intentionally changes the behavior of CSOT which otherwise would retry an unlimited number of times within the - timeout to avoid retry storms. 
5. A retry attempt consumes 1 token from the token bucket. 6. If the request is eligible for retry (as outlined in step 4), the client MUST apply exponential backoff according to the following formula: `delayMS = j * min(maxBackoff, baseBackoff * 2^(i - 1))` @@ -113,17 +113,18 @@ collection, getMore, and generic runCommand. The new command execution method ob #### Interaction with Existing Retry Behavior -The retryability API defined in this specification is separate from the existing retryability behaviors defined in the -retryable reads and retryable writes specifications. Drivers MUST ensure: +The retry policy in this specification is separate from the existing retryability policies defined in the +[retryable reads](../retryable-reads/retryable-reads.md) and [retryable writes](../retryable-writes/retryable-writes.md) +specifications. Drivers MUST ensure: - Only retryable errors with the `SystemOverloadedError` consume tokens from the token bucket before retrying. -- Only retryable errors with the `SystemOverloadedError` label apply backoff and jitter. +- Only retryable errors with the `SystemOverloadedError` label apply backoff. - All retryable errors apply backoff if they also contain a `SystemOverloadedError` label. This includes: - - Errors defined as retryable in the retryable reads specification. - - Errors defined as retryable in the retryable writes specification. + - Errors defined as retryable in the [retryable reads specification](../retryable-reads/retryable-reads.md). + - Errors defined as retryable in the [retryable writes specification](../retryable-writes/retryable-writes.md). - Errors with the `RetryableError` label. -- Any retryable error is retried at most MAX_ATTEMPTS (default=5) times, if any attempts has failed with a - `SystemOverloadedError`. 
+- Any command is retried at most MAX_ATTEMPTS (default=5) times, if any attempt has failed with a + `SystemOverloadedError`, regardless of which retry policy the current or future retry attempts are caused by. #### Pseudocode @@ -136,10 +137,10 @@ handling of "NoWritesPerformed" are omitted. # Note: the values below have been scaled down by a factor of 1000 because # Python's sleep API takes a duration in seconds, not milliseconds. BASE_BACKOFF = 0.1 # 100ms -MAX_BACKOFF = 10 # 10s +MAX_BACKOFF = 10 # 10000ms RETRY_TOKEN_RETURN_RATE = 0.1 -MAX_ATTEMPTS = 5 +MAX_RETRIES = 5 def execute_command_retryable(command, ...): deprioritized_servers = [] @@ -151,7 +152,7 @@ def execute_command_retryable(command, ...): server = select_server(deprioritized_servers) connection = server.getConnection() res = execute_command(connection, command) - # Return tokens to the bucket on success. + # Deposit tokens into the bucket on success. tokens = RETRY_TOKEN_RETURN_RATE if attempt > 0: tokens += 1 @@ -171,7 +172,7 @@ def execute_command_retryable(command, ...): attempt += 1 if is_overload: - attempts = MAX_ATTEMPTS + attempts = MAX_RETRIES if attempt > attempts: raise From 0533acc212d08e8c6d08ae6d77ad4a805943889b Mon Sep 17 00:00:00 2001 From: bailey Date: Mon, 5 Jan 2026 17:59:42 -0700 Subject: [PATCH 40/55] second round of comments addressed --- .../client-backpressure.md | 31 ++++++++++--------- source/crud/bulk-write.md | 10 ++++-- source/transactions/transactions.md | 24 +++++++------- 3 files changed, 36 insertions(+), 29 deletions(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index 04317ef99c..540ecb1235 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -26,8 +26,8 @@ connection spikes from overloading the system. 
#### Ingress Request Rate Limiter -A token bucket based system introduced in MongoDB 8.2 to admit an operation or reject it with a System Overload Error at -the front door of a mongod/s. It aims to prevent operations spikes from overloading the system. +A token bucket based system introduced in MongoDB 8.2 to admit an command or reject it with a System Overload Error at +the front door of a mongod/s. It aims to prevent command spikes from overloading the system. #### MongoTune @@ -37,8 +37,8 @@ the connection and request rate limiters to prevent and mitigate overloading the #### RetryableError label -An error is considered retryable if it includes the "RetryableError" label. This error label indicates that an operation -is safely retryable regardless of the type of operation, its metadata, or any of its arguments. +This error label indicates that an command is safely retryable regardless of the command type (read or write), its +metadata, or any of its arguments. #### SystemOverloadedError label @@ -99,11 +99,12 @@ collection, getMore, and generic runCommand. The new command execution method ob - This intentionally changes the behavior of CSOT which otherwise would retry an unlimited number of times within the timeout to avoid retry storms. 3. (CSOT-only): `timeoutMS` has not expired. - 4. (`SystemOverloadedError` errors only) a token can be acquired from the token bucket. + 4. A token can be acquired from the token bucket. 5. A retry attempt consumes 1 token from the token bucket. 6. If the request is eligible for retry (as outlined in step 4), the client MUST apply exponential backoff according to the following formula: `delayMS = j * min(maxBackoff, baseBackoff * 2^(i - 1))` - - `i` is the retry attempt number (starting with 1 for the first retry). + - `i` is the retry attempt number (starting with 1 for the first retry). Note that `i` includes retries for + non-overloaded errors. - `j` is a random jitter value between 0 and 1. - `baseBackoff` is constant 100ms. 
- `maxBackoff` is 10000ms. @@ -117,8 +118,8 @@ The retry policy in this specification is separate from the existing retryabilit [retryable reads](../retryable-reads/retryable-reads.md) and [retryable writes](../retryable-writes/retryable-writes.md) specifications. Drivers MUST ensure: -- Only retryable errors with the `SystemOverloadedError` consume tokens from the token bucket before retrying. -- Only retryable errors with the `SystemOverloadedError` label apply backoff. +- Only errors with the `SystemOverloadedError` consume tokens from the token bucket before retrying. +- Only errors with the `SystemOverloadedError` label apply backoff. - All retryable errors apply backoff if they also contain a `SystemOverloadedError` label. This includes: - Errors defined as retryable in the [retryable reads specification](../retryable-reads/retryable-reads.md). - Errors defined as retryable in the [retryable writes specification](../retryable-writes/retryable-writes.md). @@ -197,9 +198,9 @@ def execute_command_retryable(command, ...): ### Token Bucket The overload retry policy introduces a per-client token bucket to limit SystemOverloaded retry attempts. Although the -server rejects excess operations as quickly as possible, doing so costs CPU and creates extra contention on the -connection pool which can eventually negatively affect goodput. To reduce this risk, the token bucket will limit retry -attempts during a prolonged overload. +server rejects excess commands as quickly as possible, doing so costs CPU and creates extra contention on the connection +pool which can eventually negatively affect goodput. To reduce this risk, the token bucket will limit retry attempts +during a prolonged overload. The token bucket capacity is set to 1000 for consistency with the server. @@ -248,7 +249,7 @@ much larger time frame. Drivers are not required to work around this limitation. 
### Logging Retry Attempts [As with retryable writes](../retryable-writes/retryable-writes.md#logging-retry-attempts), drivers MAY choose to log -retry attempts for load shed operations. This specification does not define a format for such log messages. +retry attempts for load shed commands. This specification does not define a format for such log messages. ### Command Monitoring @@ -256,7 +257,7 @@ retry attempts for load shed operations. This specification does not define a fo [Command Logging and Monitoring](../command-logging-and-monitoring/command-logging-and-monitoring.md) specification, drivers MUST guarantee that each `CommandStartedEvent` has either a correlating `CommandSucceededEvent` or `CommandFailedEvent` and that every "command started" log message has either a correlating "command succeeded" log -message or "command failed" log message. If the first attempt of a retryable operation encounters a retryable error, +message or "command failed" log message. If the first attempt of a retryable command encounters a retryable error, drivers MUST fire a `CommandFailedEvent` and emit a "command failed" log message for the retryable error and fire a separate `CommandStartedEvent` and emit a separate "command started" log message when executing the subsequent retry attempt. Note that the second `CommandStartedEvent` and "command started" log message may have a different @@ -264,7 +265,7 @@ attempt. Note that the second `CommandStartedEvent` and "command started" log me ### Documentation -1. Drivers MUST document that all operations support retries on server overload. +1. Drivers MUST document that all commands support retries on server overload. 2. Driver release notes MUST make it clear to users that they may need to adjust custom retry logic to prevent an application from inadvertently retrying for too long (see [Backwards Compatibility](#backwards-compatibility) for details). 
@@ -320,7 +321,7 @@ sleep duration is not critical to the intended behavior, so long as we sleep at ### Why override existing maximum number of retry attempt defaults for retryable reads and writes if a `SystemOverloadedError` is received? -Load-shedded errors indicate that the request was rejected by the server to minimize load, not that the operation failed +Load-shedded errors indicate that the request was rejected by the server to minimize load, not that the command failed for logical reasons. So, when determining the number of retries an operation should attempt: - Any load-shedded errors should be retried to give them a real attempt at success diff --git a/source/crud/bulk-write.md b/source/crud/bulk-write.md index 574307e9ab..73deb2459e 100644 --- a/source/crud/bulk-write.md +++ b/source/crud/bulk-write.md @@ -835,7 +835,7 @@ Encountering a top-level error MUST halt execution of a bulk write for both orde means that drivers MUST NOT attempt to retrieve more responses from the cursor or execute any further `bulkWrite` batches and MUST immediately throw an exception. If the results cursor has not been exhausted on the server when a top-level error occurs, drivers MUST send the `killCursors` command to attempt to close it. The result returned from the -`killCursors` command MAY be ignored. +`killCursors` command MUST NOT be ignored. ### Write Concern Errors @@ -861,12 +861,16 @@ specification. ## Future Work -### Retry `bulkWrite` when `getMore` fails with a retryable error +### Retrying `getMore`s When a `getMore` fails with a retryable error when attempting to iterate the results cursor, drivers could retry the entire `bulkWrite` command to receive a fresh cursor and retry iteration. This work was omitted to minimize the scope of the initial implementation and testing of the new bulk write API, but may be revisited in the future. 
+Note that there is one exception to this behavior: when a command fails with an error that is eligible for retry under +the conditions defined in the [Client Backpressure](../client-backpressure/client-backpressure.md) specification, +drivers SHOULD retry the `getMore` following the rules outlined in the client backpressure specification. + ### Use document sequences for auto-encrypted bulk writes Auto-encryption does not currently support document sequences. This specification should be updated when @@ -947,6 +951,8 @@ error in this specific situation does not seem helpful enough to require size ch ## **Changelog** +- 2026-01-05: Specify that `killCursors`'s response cannot be ignored. + - 2025-09-09: Clarify that `rawData` is for internal use only. - 2025-08-13: Removed the requirement to error when QE is enabled. diff --git a/source/transactions/transactions.md b/source/transactions/transactions.md index a375181cc5..70922bd243 100644 --- a/source/transactions/transactions.md +++ b/source/transactions/transactions.md @@ -48,7 +48,7 @@ including (but not limited to) creating, updating, or deleting databases, collec An error considered retryable by the [Retryable Writes Specification](../retryable-writes/retryable-writes.md). -#### Backpressure Error +#### Retryable Backpressure Error An error considered retryable by the [Client Backpressure Specification](../client-backpressure/client-backpressure.md). @@ -559,11 +559,11 @@ a transaction. In MongoDB 4.0 the only supported retryable write commands within a transaction are commitTransaction and abortTransaction. Therefore drivers MUST NOT retry write commands within transactions even when retryWrites has been -enabled on the MongoClient, unless the server response has backpressure error labels applied. +enabled on the MongoClient, unless the server response is a retryable backpressure error. 
In addition, drivers MUST NOT add the RetryableWriteError label to any error that occurs during a write command within a transaction (excepting commitTransation and abortTransaction), even when retryWrites has been enabled on the -MongoClient, unless the server response has backpressure error labels applied. +MongoClient, unless the server response is a retryable backpressure error. Drivers MUST retry the commitTransaction and abortTransaction commands even when retryWrites has been disabled on the MongoClient. commitTransaction and abortTransaction are retryable write commands and MUST be retried according to the @@ -578,15 +578,14 @@ all preceding commands in the transaction. ### **Interaction with Client Backpressure** All commands in a transaction are subject to the -[Client Backpressure Specification](../client-backpressure/client-backpressure.md), and MUST be retried accordingly when -the appropriate error labels are added by the server. This includes the initial command with `startTransaction:true`, -the `abortTransaction` and `commitTransaction` commands, as well as any read or write commands attempted during the -transaction. +[Client Backpressure Specification](../client-backpressure/client-backpressure.md), and MUST be retried accordingly. +This includes the initial command with `startTransaction:true`, the `abortTransaction` and `commitTransaction` commands, +as well as any read or write commands attempted during the transaction. -If a command fails with backpressure labels and it includes `startTransaction:true`, the retried command MUST also -include `startTransaction:true`. +If a command fails with a retryable backpressure error and it includes `startTransaction:true`, the retried command MUST +also include `startTransaction:true`. 
-If a command fails backpressure retries `MAX_ATTEMPTS` times, it MUST not be retried again, including the +If a command fails backpressure retries `MAX_RETRIES` times, it MUST not be retried again, including the `commitTransaction` command. ### **Server Commands** @@ -1064,8 +1063,9 @@ transaction. Drivers SHOULD apply a majority write concern when retrying commitTransaction to guard against a transaction being applied twice. -Drivers SHOULD NOT modify the write concern on commit transaction commands when retrying a backpressure error, since we -are sure that the transaction has not been applied. +Drivers SHOULD NOT modify the write concern on commit transaction commands when retrying a retryable backpressure error. +A retryable backpressure error indicates no work was performed by the server, and the rationale outlined in this section +for using majority write concern on retries is therefore irrelevant. Consider the following scenario: From 629aec9f2626dbcd451893e4da2e5559b646f7f3 Mon Sep 17 00:00:00 2001 From: Bailey Pearson Date: Wed, 7 Jan 2026 13:54:15 -0700 Subject: [PATCH 41/55] Update source/client-backpressure/client-backpressure.md Co-authored-by: Ferdinando Papale <4850119+papafe@users.noreply.github.com> --- source/client-backpressure/client-backpressure.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index 540ecb1235..8fb38bb259 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -26,7 +26,7 @@ connection spikes from overloading the system. #### Ingress Request Rate Limiter -A token bucket based system introduced in MongoDB 8.2 to admit an command or reject it with a System Overload Error at +A token bucket based system introduced in MongoDB 8.2 to admit a command or reject it with a System Overload Error at the front door of a mongod/s. 
It aims to prevent command spikes from overloading the system.

 #### MongoTune

From 9e4ad7093283a6e8b3978955b72ade7eb205229e Mon Sep 17 00:00:00 2001
From: Bailey Pearson
Date: Wed, 7 Jan 2026 13:54:58 -0700
Subject: [PATCH 42/55] Update source/client-backpressure/client-backpressure.md

Co-authored-by: Ferdinando Papale <4850119+papafe@users.noreply.github.com>
---
 source/client-backpressure/client-backpressure.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md
index 8fb38bb259..1ec27afdeb 100644
--- a/source/client-backpressure/client-backpressure.md
+++ b/source/client-backpressure/client-backpressure.md
@@ -21,7 +21,7 @@ The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SH
#### Ingress Request Rate Limiter From d4d0b383f16be19053dedca5f93da615246852a2 Mon Sep 17 00:00:00 2001 From: bailey Date: Thu, 8 Jan 2026 10:10:27 -0700 Subject: [PATCH 43/55] jermery's comments - formatting of yml files and prose test description --- source/client-backpressure/tests/README.md | 12 +- .../tests/backpressure-retry-loop.json | 76 ++-- .../tests/backpressure-retry-loop.yml | 347 ++++++------------ .../backpressure-retry-loop.yml.template | 41 +-- .../backpressure-retry-max-attempts.json | 64 ++-- .../tests/backpressure-retry-max-attempts.yml | 275 ++++---------- ...ckpressure-retry-max-attempts.yml.template | 30 +- .../tests/getMore-retried.json | 4 +- .../tests/getMore-retried.yml | 8 +- 9 files changed, 282 insertions(+), 575 deletions(-) diff --git a/source/client-backpressure/tests/README.md b/source/client-backpressure/tests/README.md index 9f49b1ed20..5e9b611a99 100644 --- a/source/client-backpressure/tests/README.md +++ b/source/client-backpressure/tests/README.md @@ -50,8 +50,10 @@ Drivers should test that retries do not occur immediately when a SystemOverloade 5. Execute step 3 again. 6. Compare the two time between the two runs. - ```python - assertTrue(with_backoff_time - no_backoff_time >= 2.1) - ``` - The sum of 5 backoffs is 3.1 seconds. There is a 1-second window to account for potential variance between the two - runs. + + ```python + assertTrue(with_backoff_time - no_backoff_time >= 2.1) + ``` + + The sum of 5 backoffs is 3.1 seconds. There is a 1-second window to account for potential variance between the two + runs. 
diff --git a/source/client-backpressure/tests/backpressure-retry-loop.json b/source/client-backpressure/tests/backpressure-retry-loop.json index 2542344b38..8c01020ca5 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.json +++ b/source/client-backpressure/tests/backpressure-retry-loop.json @@ -96,8 +96,8 @@ } }, { - "object": "client", "name": "listDatabases", + "object": "client", "arguments": { "filter": {} } @@ -178,8 +178,8 @@ } }, { - "object": "client", - "name": "listDatabaseNames" + "name": "listDatabaseNames", + "object": "client" } ], "expectEvents": [ @@ -257,8 +257,8 @@ } }, { - "object": "client", "name": "createChangeStream", + "object": "client", "arguments": { "pipeline": [] } @@ -344,8 +344,8 @@ } }, { - "object": "client", "name": "clientBulkWrite", + "object": "client", "arguments": { "models": [ { @@ -436,8 +436,8 @@ } }, { - "object": "database", "name": "aggregate", + "object": "database", "arguments": { "pipeline": [ { @@ -525,8 +525,8 @@ } }, { - "object": "database", "name": "listCollections", + "object": "database", "arguments": { "filter": {} } @@ -607,8 +607,8 @@ } }, { - "object": "database", "name": "listCollectionNames", + "object": "database", "arguments": { "filter": {} } @@ -689,8 +689,8 @@ } }, { - "object": "database", "name": "runCommand", + "object": "database", "arguments": { "command": { "ping": 1 @@ -774,8 +774,8 @@ } }, { - "object": "database", "name": "createChangeStream", + "object": "database", "arguments": { "pipeline": [] } @@ -856,8 +856,8 @@ } }, { - "object": "collection", "name": "aggregate", + "object": "collection", "arguments": { "pipeline": [] } @@ -938,8 +938,8 @@ } }, { - "object": "collection", "name": "countDocuments", + "object": "collection", "arguments": { "filter": {} } @@ -1020,8 +1020,8 @@ } }, { - "object": "collection", - "name": "estimatedDocumentCount" + "name": "estimatedDocumentCount", + "object": "collection" } ], "expectEvents": [ @@ -1099,8 +1099,8 @@ } }, { - "object": 
"collection", "name": "distinct", + "object": "collection", "arguments": { "fieldName": "x", "filter": {} @@ -1182,8 +1182,8 @@ } }, { - "object": "collection", "name": "find", + "object": "collection", "arguments": { "filter": {} } @@ -1264,8 +1264,8 @@ } }, { - "object": "collection", "name": "findOne", + "object": "collection", "arguments": { "filter": {} } @@ -1346,8 +1346,8 @@ } }, { - "object": "collection", - "name": "listIndexes" + "name": "listIndexes", + "object": "collection" } ], "expectEvents": [ @@ -1425,8 +1425,8 @@ } }, { - "object": "collection", - "name": "listIndexNames" + "name": "listIndexNames", + "object": "collection" } ], "expectEvents": [ @@ -1504,8 +1504,8 @@ } }, { - "object": "collection", "name": "createChangeStream", + "object": "collection", "arguments": { "pipeline": [] } @@ -1586,8 +1586,8 @@ } }, { - "object": "collection", "name": "insertOne", + "object": "collection", "arguments": { "document": { "_id": 2, @@ -1671,8 +1671,8 @@ } }, { - "object": "collection", "name": "insertMany", + "object": "collection", "arguments": { "documents": [ { @@ -1758,8 +1758,8 @@ } }, { - "object": "collection", "name": "deleteOne", + "object": "collection", "arguments": { "filter": {} } @@ -1840,8 +1840,8 @@ } }, { - "object": "collection", "name": "deleteMany", + "object": "collection", "arguments": { "filter": {} } @@ -1922,8 +1922,8 @@ } }, { - "object": "collection", "name": "replaceOne", + "object": "collection", "arguments": { "filter": {}, "replacement": { @@ -2007,8 +2007,8 @@ } }, { - "object": "collection", "name": "updateOne", + "object": "collection", "arguments": { "filter": {}, "update": { @@ -2094,8 +2094,8 @@ } }, { - "object": "collection", "name": "updateMany", + "object": "collection", "arguments": { "filter": {}, "update": { @@ -2181,8 +2181,8 @@ } }, { - "object": "collection", "name": "findOneAndDelete", + "object": "collection", "arguments": { "filter": {} } @@ -2263,8 +2263,8 @@ } }, { - "object": "collection", "name": 
"findOneAndReplace", + "object": "collection", "arguments": { "filter": {}, "replacement": { @@ -2348,8 +2348,8 @@ } }, { - "object": "collection", "name": "findOneAndUpdate", + "object": "collection", "arguments": { "filter": {}, "update": { @@ -2435,8 +2435,8 @@ } }, { - "object": "collection", "name": "bulkWrite", + "object": "collection", "arguments": { "requests": [ { @@ -2526,8 +2526,8 @@ } }, { - "object": "collection", "name": "createIndex", + "object": "collection", "arguments": { "keys": { "x": 11 @@ -2588,8 +2588,8 @@ "description": "collection.dropIndex retries using operation loop", "operations": [ { - "object": "retryable-writes-tests", "name": "createIndex", + "object": "retryable-writes-tests", "arguments": { "keys": { "x": 11 @@ -2621,8 +2621,8 @@ } }, { - "object": "collection", "name": "dropIndex", + "object": "collection", "arguments": { "name": "x_11" } @@ -2703,8 +2703,8 @@ } }, { - "object": "collection", - "name": "dropIndexes" + "name": "dropIndexes", + "object": "collection" } ], "expectEvents": [ diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml b/source/client-backpressure/tests/backpressure-retry-loop.yml index 6a3033989b..eda57d4979 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml @@ -5,42 +5,35 @@ description: tests that operations respect overload backoff retry loop schemaVersion: '1.3' runOnRequirements: - - - minServerVersion: '4.4' # failCommand + - minServerVersion: '4.4' # failCommand topologies: [replicaset, sharded, load-balanced] createEntities: - - - client: + - client: id: &client client useMultipleMongoses: false observeEvents: [commandStartedEvent, commandSucceededEvent, commandFailedEvent] - - - client: + - client: id: &internal_client internal_client useMultipleMongoses: false - - - database: + - database: id: &internal_db internal_db client: *internal_client databaseName: &database_name 
retryable-writes-tests - - - collection: + - collection: id: &internal_collection retryable-writes-tests database: *internal_db collectionName: &collection_name coll - - - database: + - database: id: &database database client: *client databaseName: *database_name - - - collection: + - collection: id: &collection collection database: *database collectionName: *collection_name @@ -53,13 +46,9 @@ initialData: _yamlAnchors: bulWriteInsertNamespace: &client_bulk_write_ns retryable-writes-tests.coll -tests: - - - - description: 'client.listDatabases retries using operation loop' - operations: - - +tests: + - description: 'client.listDatabases retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -72,9 +61,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: listDatabases object: *client - name: listDatabases arguments: filter: {} @@ -98,11 +86,8 @@ tests: - commandSucceededEvent: commandName: listDatabases - - - description: 'client.listDatabaseNames retries using operation loop' - operations: - - + - description: 'client.listDatabaseNames retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -115,9 +100,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: listDatabaseNames object: *client - name: listDatabaseNames expectEvents: - client: *client @@ -139,11 +123,8 @@ tests: - commandSucceededEvent: commandName: listDatabases - - - description: 'client.createChangeStream retries using operation loop' - operations: - - + - description: 'client.createChangeStream retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -156,9 +137,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: createChangeStream object: *client - name: createChangeStream arguments: pipeline: [] @@ -182,13 +162,10 @@ tests: - commandSucceededEvent: commandName: 
aggregate - - - description: 'client.clientBulkWrite retries using operation loop' + - description: 'client.clientBulkWrite retries using operation loop' runOnRequirements: - minServerVersion: '8.0' # client bulk write added to server in 8.0 - operations: - - + operations: - name: failPoint object: testRunner arguments: @@ -201,9 +178,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: clientBulkWrite object: *client - name: clientBulkWrite arguments: models: - insertOne: @@ -230,11 +206,8 @@ tests: - commandSucceededEvent: commandName: bulkWrite - - - description: 'database.aggregate retries using operation loop' - operations: - - + - description: 'database.aggregate retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -247,9 +220,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: aggregate object: *database - name: aggregate arguments: pipeline: [ { $listLocalSessions: {} }, { $limit: 1 } ] @@ -273,11 +245,8 @@ tests: - commandSucceededEvent: commandName: aggregate - - - description: 'database.listCollections retries using operation loop' - operations: - - + - description: 'database.listCollections retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -290,9 +259,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: listCollections object: *database - name: listCollections arguments: filter: {} @@ -316,11 +284,8 @@ tests: - commandSucceededEvent: commandName: listCollections - - - description: 'database.listCollectionNames retries using operation loop' - operations: - - + - description: 'database.listCollectionNames retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -333,9 +298,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: listCollectionNames object: *database - name: 
listCollectionNames arguments: filter: {} @@ -359,11 +323,8 @@ tests: - commandSucceededEvent: commandName: listCollections - - - description: 'database.runCommand retries using operation loop' - operations: - - + - description: 'database.runCommand retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -376,9 +337,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: runCommand object: *database - name: runCommand arguments: command: { ping: 1 } commandName: ping @@ -403,11 +363,8 @@ tests: - commandSucceededEvent: commandName: ping - - - description: 'database.createChangeStream retries using operation loop' - operations: - - + - description: 'database.createChangeStream retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -420,9 +377,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: createChangeStream object: *database - name: createChangeStream arguments: pipeline: [] @@ -446,11 +402,8 @@ tests: - commandSucceededEvent: commandName: aggregate - - - description: 'collection.aggregate retries using operation loop' - operations: - - + - description: 'collection.aggregate retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -463,9 +416,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: aggregate object: *collection - name: aggregate arguments: pipeline: [] @@ -489,11 +441,8 @@ tests: - commandSucceededEvent: commandName: aggregate - - - description: 'collection.countDocuments retries using operation loop' - operations: - - + - description: 'collection.countDocuments retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -506,9 +455,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: countDocuments object: *collection - name: countDocuments 
arguments: filter: {} @@ -532,11 +480,8 @@ tests: - commandSucceededEvent: commandName: aggregate - - - description: 'collection.estimatedDocumentCount retries using operation loop' - operations: - - + - description: 'collection.estimatedDocumentCount retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -549,9 +494,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: estimatedDocumentCount object: *collection - name: estimatedDocumentCount expectEvents: - client: *client @@ -573,11 +517,8 @@ tests: - commandSucceededEvent: commandName: count - - - description: 'collection.distinct retries using operation loop' - operations: - - + - description: 'collection.distinct retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -590,9 +531,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: distinct object: *collection - name: distinct arguments: fieldName: x filter: {} @@ -617,11 +557,8 @@ tests: - commandSucceededEvent: commandName: distinct - - - description: 'collection.find retries using operation loop' - operations: - - + - description: 'collection.find retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -634,9 +571,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: find object: *collection - name: find arguments: filter: {} @@ -660,11 +596,8 @@ tests: - commandSucceededEvent: commandName: find - - - description: 'collection.findOne retries using operation loop' - operations: - - + - description: 'collection.findOne retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -677,9 +610,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: findOne object: *collection - name: findOne arguments: filter: {} @@ -703,11 +635,8 @@ tests: - commandSucceededEvent: 
commandName: find - - - description: 'collection.listIndexes retries using operation loop' - operations: - - + - description: 'collection.listIndexes retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -720,9 +649,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: listIndexes object: *collection - name: listIndexes expectEvents: - client: *client @@ -744,11 +672,8 @@ tests: - commandSucceededEvent: commandName: listIndexes - - - description: 'collection.listIndexNames retries using operation loop' - operations: - - + - description: 'collection.listIndexNames retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -761,9 +686,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: listIndexNames object: *collection - name: listIndexNames expectEvents: - client: *client @@ -785,11 +709,8 @@ tests: - commandSucceededEvent: commandName: listIndexes - - - description: 'collection.createChangeStream retries using operation loop' - operations: - - + - description: 'collection.createChangeStream retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -802,9 +723,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: createChangeStream object: *collection - name: createChangeStream arguments: pipeline: [] @@ -828,11 +748,8 @@ tests: - commandSucceededEvent: commandName: aggregate - - - description: 'collection.insertOne retries using operation loop' - operations: - - + - description: 'collection.insertOne retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -845,9 +762,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: insertOne object: *collection - name: insertOne arguments: document: { _id: 2, x: 22 } @@ -871,11 +787,8 @@ tests: - commandSucceededEvent: 
commandName: insert - - - description: 'collection.insertMany retries using operation loop' - operations: - - + - description: 'collection.insertMany retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -888,9 +801,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: insertMany object: *collection - name: insertMany arguments: documents: - { _id: 2, x: 22 } @@ -915,11 +827,8 @@ tests: - commandSucceededEvent: commandName: insert - - - description: 'collection.deleteOne retries using operation loop' - operations: - - + - description: 'collection.deleteOne retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -932,9 +841,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: deleteOne object: *collection - name: deleteOne arguments: filter: {} @@ -958,11 +866,8 @@ tests: - commandSucceededEvent: commandName: delete - - - description: 'collection.deleteMany retries using operation loop' - operations: - - + - description: 'collection.deleteMany retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -975,9 +880,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: deleteMany object: *collection - name: deleteMany arguments: filter: {} @@ -1001,11 +905,8 @@ tests: - commandSucceededEvent: commandName: delete - - - description: 'collection.replaceOne retries using operation loop' - operations: - - + - description: 'collection.replaceOne retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -1018,9 +919,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: replaceOne object: *collection - name: replaceOne arguments: filter: {} replacement: { x: 22 } @@ -1045,11 +945,8 @@ tests: - commandSucceededEvent: commandName: update - - - description: 'collection.updateOne 
retries using operation loop' - operations: - - + - description: 'collection.updateOne retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -1062,9 +959,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: updateOne object: *collection - name: updateOne arguments: filter: {} update: { $set: { x: 22 } } @@ -1089,11 +985,8 @@ tests: - commandSucceededEvent: commandName: update - - - description: 'collection.updateMany retries using operation loop' - operations: - - + - description: 'collection.updateMany retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -1106,9 +999,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: updateMany object: *collection - name: updateMany arguments: filter: {} update: { $set: { x: 22 } } @@ -1133,11 +1025,8 @@ tests: - commandSucceededEvent: commandName: update - - - description: 'collection.findOneAndDelete retries using operation loop' - operations: - - + - description: 'collection.findOneAndDelete retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -1150,9 +1039,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: findOneAndDelete object: *collection - name: findOneAndDelete arguments: filter: {} @@ -1176,11 +1064,8 @@ tests: - commandSucceededEvent: commandName: findAndModify - - - description: 'collection.findOneAndReplace retries using operation loop' - operations: - - + - description: 'collection.findOneAndReplace retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -1193,9 +1078,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: findOneAndReplace object: *collection - name: findOneAndReplace arguments: filter: {} replacement: { x: 22 } @@ -1220,11 +1104,8 @@ tests: - commandSucceededEvent: commandName: 
findAndModify - - - description: 'collection.findOneAndUpdate retries using operation loop' - operations: - - + - description: 'collection.findOneAndUpdate retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -1237,9 +1118,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: findOneAndUpdate object: *collection - name: findOneAndUpdate arguments: filter: {} update: { $set: { x: 22 } } @@ -1264,11 +1144,8 @@ tests: - commandSucceededEvent: commandName: findAndModify - - - description: 'collection.bulkWrite retries using operation loop' - operations: - - + - description: 'collection.bulkWrite retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -1281,9 +1158,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: bulkWrite object: *collection - name: bulkWrite arguments: requests: - insertOne: @@ -1309,11 +1185,8 @@ tests: - commandSucceededEvent: commandName: insert - - - description: 'collection.createIndex retries using operation loop' - operations: - - + - description: 'collection.createIndex retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -1326,9 +1199,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: createIndex object: *collection - name: createIndex arguments: keys: { x: 11 } name: "x_11" @@ -1353,17 +1225,13 @@ tests: - commandSucceededEvent: commandName: createIndexes - - - description: 'collection.dropIndex retries using operation loop' + - description: 'collection.dropIndex retries using operation loop' operations: - - + - name: createIndex object: *internal_collection - name: createIndex arguments: keys: { x: 11 } - name: "x_11" - - + name: "x_11" - name: failPoint object: testRunner arguments: @@ -1376,9 +1244,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: dropIndex 
object: *collection - name: dropIndex arguments: name: "x_11" @@ -1402,11 +1269,8 @@ tests: - commandSucceededEvent: commandName: dropIndexes - - - description: 'collection.dropIndexes retries using operation loop' - operations: - - + - description: 'collection.dropIndexes retries using operation loop' + operations: - name: failPoint object: testRunner arguments: @@ -1419,9 +1283,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: dropIndexes object: *collection - name: dropIndexes expectEvents: - client: *client diff --git a/source/client-backpressure/tests/backpressure-retry-loop.yml.template b/source/client-backpressure/tests/backpressure-retry-loop.yml.template index df1afe95cf..bd64827d4e 100644 --- a/source/client-backpressure/tests/backpressure-retry-loop.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-loop.yml.template @@ -5,42 +5,35 @@ description: tests that operations respect overload backoff retry loop schemaVersion: '1.3' runOnRequirements: - - - minServerVersion: '4.4' # failCommand + - minServerVersion: '4.4' # failCommand topologies: [replicaset, sharded, load-balanced] createEntities: - - - client: + - client: id: &client client useMultipleMongoses: false observeEvents: [commandStartedEvent, commandSucceededEvent, commandFailedEvent] - - - client: + - client: id: &internal_client internal_client useMultipleMongoses: false - - - database: + - database: id: &internal_db internal_db client: *internal_client databaseName: &database_name retryable-writes-tests - - - collection: + - collection: id: &internal_collection retryable-writes-tests database: *internal_db collectionName: &collection_name coll - - - database: + - database: id: &database database client: *client databaseName: *database_name - - - collection: + - collection: id: &collection collection database: *database collectionName: *collection_name @@ -53,25 +46,18 @@ initialData: _yamlAnchors: bulWriteInsertNamespace: 
&client_bulk_write_ns retryable-writes-tests.coll -tests: -{% for operation in operations %} - - - description: '{{operation.object}}.{{operation.operation_name}} retries using operation loop' - {%- if ((operation.operation_name == 'clientBulkWrite')) %} +tests: {% for operation in operations %} + - description: '{{operation.object}}.{{operation.operation_name}} retries using operation loop' {%- if ((operation.operation_name == 'clientBulkWrite')) %} runOnRequirements: - minServerVersion: '8.0' # client bulk write added to server in 8.0 {%- endif %} - operations: - {%- if operation.operation_name == "dropIndex" %} - - + operations: {%- if operation.operation_name == "dropIndex" %} + - name: createIndex object: *internal_collection - name: createIndex arguments: keys: { x: 11 } name: "x_11" - {%- endif %} - - + {%- endif %} - name: failPoint object: testRunner arguments: @@ -84,9 +70,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: {{operation.operation_name}} object: *{{operation.object}} - name: {{operation.operation_name}} {%- if operation.arguments|length > 0 %} arguments: {%- for arg in operation.arguments %} diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.json b/source/client-backpressure/tests/backpressure-retry-max-attempts.json index 1de8cb38d4..efde542621 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.json +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.json @@ -89,8 +89,8 @@ } }, { - "object": "client", "name": "listDatabases", + "object": "client", "arguments": { "filter": {} }, @@ -193,8 +193,8 @@ } }, { - "object": "client", "name": "listDatabaseNames", + "object": "client", "expectError": { "isError": true, "isClientError": false @@ -294,8 +294,8 @@ } }, { - "object": "client", "name": "createChangeStream", + "object": "client", "arguments": { "pipeline": [] }, @@ -403,8 +403,8 @@ } }, { - "object": "client", "name": 
"clientBulkWrite", + "object": "client", "arguments": { "models": [ { @@ -517,8 +517,8 @@ } }, { - "object": "database", "name": "aggregate", + "object": "database", "arguments": { "pipeline": [ { @@ -628,8 +628,8 @@ } }, { - "object": "database", "name": "listCollections", + "object": "database", "arguments": { "filter": {} }, @@ -732,8 +732,8 @@ } }, { - "object": "database", "name": "listCollectionNames", + "object": "database", "arguments": { "filter": {} }, @@ -836,8 +836,8 @@ } }, { - "object": "database", "name": "runCommand", + "object": "database", "arguments": { "command": { "ping": 1 @@ -943,8 +943,8 @@ } }, { - "object": "database", "name": "createChangeStream", + "object": "database", "arguments": { "pipeline": [] }, @@ -1047,8 +1047,8 @@ } }, { - "object": "collection", "name": "aggregate", + "object": "collection", "arguments": { "pipeline": [] }, @@ -1151,8 +1151,8 @@ } }, { - "object": "collection", "name": "countDocuments", + "object": "collection", "arguments": { "filter": {} }, @@ -1255,8 +1255,8 @@ } }, { - "object": "collection", "name": "estimatedDocumentCount", + "object": "collection", "expectError": { "isError": true, "isClientError": false @@ -1356,8 +1356,8 @@ } }, { - "object": "collection", "name": "distinct", + "object": "collection", "arguments": { "fieldName": "x", "filter": {} @@ -1461,8 +1461,8 @@ } }, { - "object": "collection", "name": "find", + "object": "collection", "arguments": { "filter": {} }, @@ -1565,8 +1565,8 @@ } }, { - "object": "collection", "name": "findOne", + "object": "collection", "arguments": { "filter": {} }, @@ -1669,8 +1669,8 @@ } }, { - "object": "collection", "name": "listIndexes", + "object": "collection", "expectError": { "isError": true, "isClientError": false @@ -1770,8 +1770,8 @@ } }, { - "object": "collection", "name": "listIndexNames", + "object": "collection", "expectError": { "isError": true, "isClientError": false @@ -1871,8 +1871,8 @@ } }, { - "object": "collection", "name": 
"createChangeStream", + "object": "collection", "arguments": { "pipeline": [] }, @@ -1975,8 +1975,8 @@ } }, { - "object": "collection", "name": "insertOne", + "object": "collection", "arguments": { "document": { "_id": 2, @@ -2082,8 +2082,8 @@ } }, { - "object": "collection", "name": "insertMany", + "object": "collection", "arguments": { "documents": [ { @@ -2191,8 +2191,8 @@ } }, { - "object": "collection", "name": "deleteOne", + "object": "collection", "arguments": { "filter": {} }, @@ -2295,8 +2295,8 @@ } }, { - "object": "collection", "name": "deleteMany", + "object": "collection", "arguments": { "filter": {} }, @@ -2399,8 +2399,8 @@ } }, { - "object": "collection", "name": "replaceOne", + "object": "collection", "arguments": { "filter": {}, "replacement": { @@ -2506,8 +2506,8 @@ } }, { - "object": "collection", "name": "updateOne", + "object": "collection", "arguments": { "filter": {}, "update": { @@ -2615,8 +2615,8 @@ } }, { - "object": "collection", "name": "updateMany", + "object": "collection", "arguments": { "filter": {}, "update": { @@ -2724,8 +2724,8 @@ } }, { - "object": "collection", "name": "findOneAndDelete", + "object": "collection", "arguments": { "filter": {} }, @@ -2828,8 +2828,8 @@ } }, { - "object": "collection", "name": "findOneAndReplace", + "object": "collection", "arguments": { "filter": {}, "replacement": { @@ -2935,8 +2935,8 @@ } }, { - "object": "collection", "name": "findOneAndUpdate", + "object": "collection", "arguments": { "filter": {}, "update": { @@ -3044,8 +3044,8 @@ } }, { - "object": "collection", "name": "bulkWrite", + "object": "collection", "arguments": { "requests": [ { @@ -3157,8 +3157,8 @@ } }, { - "object": "collection", "name": "createIndex", + "object": "collection", "arguments": { "keys": { "x": 11 @@ -3264,8 +3264,8 @@ } }, { - "object": "collection", "name": "dropIndex", + "object": "collection", "arguments": { "name": "x_11" }, @@ -3368,8 +3368,8 @@ } }, { - "object": "collection", "name": "dropIndexes", + 
"object": "collection", "expectError": { "isError": true, "isClientError": false diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml index 3800b20a33..39e4859368 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml @@ -10,24 +10,21 @@ runOnRequirements: topologies: [replicaset, sharded, load-balanced] createEntities: - - - client: + - client: id: &client client useMultipleMongoses: false observeEvents: [commandStartedEvent, commandSucceededEvent, commandFailedEvent] - - - client: + - client: id: &fail_point_client fail_point_client useMultipleMongoses: false - - - database: + - database: id: &database database client: *client databaseName: &database_name retryable-writes-tests - - - collection: + + - collection: id: &collection collection database: *database collectionName: &collection_name coll @@ -36,18 +33,14 @@ _yamlAnchors: bulkWriteInsertNamespace: &client_bulk_write_ns retryable-writes-tests.coll initialData: - - - collectionName: *collection_name + - collectionName: *collection_name databaseName: *database_name documents: - { _id: 1, x: 11 } - { _id: 2, x: 22 } -tests: - - - - description: 'client.listDatabases retries at most maxAttempts=5 times' - +tests: + - description: 'client.listDatabases retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -61,9 +54,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: listDatabases object: *client - name: listDatabases arguments: filter: {} expectError: @@ -100,10 +92,7 @@ tests: - commandFailedEvent: commandName: listDatabases - - - - description: 'client.listDatabaseNames retries at most maxAttempts=5 times' - + - description: 'client.listDatabaseNames retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ 
-117,9 +106,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: listDatabaseNames object: *client - name: listDatabaseNames expectError: isError: true isClientError: false @@ -154,10 +142,7 @@ tests: - commandFailedEvent: commandName: listDatabases - - - - description: 'client.createChangeStream retries at most maxAttempts=5 times' - + - description: 'client.createChangeStream retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -171,9 +156,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: createChangeStream object: *client - name: createChangeStream arguments: pipeline: [] expectError: @@ -210,12 +194,9 @@ tests: - commandFailedEvent: commandName: aggregate - - - - description: 'client.clientBulkWrite retries at most maxAttempts=5 times' + - description: 'client.clientBulkWrite retries at most maxAttempts=5 times' runOnRequirements: - minServerVersion: '8.0' # client bulk write added to server in 8.0 - operations: - name: failPoint object: testRunner @@ -229,9 +210,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: clientBulkWrite object: *client - name: clientBulkWrite arguments: models: - insertOne: @@ -271,10 +251,7 @@ tests: - commandFailedEvent: commandName: bulkWrite - - - - description: 'database.aggregate retries at most maxAttempts=5 times' - + - description: 'database.aggregate retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -288,9 +265,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: aggregate object: *database - name: aggregate arguments: pipeline: [ { $listLocalSessions: {} }, { $limit: 1 } ] expectError: @@ -327,10 +303,7 @@ tests: - commandFailedEvent: commandName: aggregate - - - - description: 'database.listCollections retries at most maxAttempts=5 times' - + - description: 'database.listCollections retries 
at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -344,9 +317,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: listCollections object: *database - name: listCollections arguments: filter: {} expectError: @@ -383,10 +355,7 @@ tests: - commandFailedEvent: commandName: listCollections - - - - description: 'database.listCollectionNames retries at most maxAttempts=5 times' - + - description: 'database.listCollectionNames retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -400,9 +369,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: listCollectionNames object: *database - name: listCollectionNames arguments: filter: {} expectError: @@ -439,10 +407,7 @@ tests: - commandFailedEvent: commandName: listCollections - - - - description: 'database.runCommand retries at most maxAttempts=5 times' - + - description: 'database.runCommand retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -456,9 +421,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: runCommand object: *database - name: runCommand arguments: command: { ping: 1 } commandName: ping @@ -496,10 +460,7 @@ tests: - commandFailedEvent: commandName: ping - - - - description: 'database.createChangeStream retries at most maxAttempts=5 times' - + - description: 'database.createChangeStream retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -513,9 +474,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: createChangeStream object: *database - name: createChangeStream arguments: pipeline: [] expectError: @@ -552,10 +512,7 @@ tests: - commandFailedEvent: commandName: aggregate - - - - description: 'collection.aggregate retries at most maxAttempts=5 times' - + - description: 'collection.aggregate retries at most maxAttempts=5 
times' operations: - name: failPoint object: testRunner @@ -569,9 +526,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: aggregate object: *collection - name: aggregate arguments: pipeline: [] expectError: @@ -608,10 +564,7 @@ tests: - commandFailedEvent: commandName: aggregate - - - - description: 'collection.countDocuments retries at most maxAttempts=5 times' - + - description: 'collection.countDocuments retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -625,9 +578,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: countDocuments object: *collection - name: countDocuments arguments: filter: {} expectError: @@ -664,10 +616,7 @@ tests: - commandFailedEvent: commandName: aggregate - - - - description: 'collection.estimatedDocumentCount retries at most maxAttempts=5 times' - + - description: 'collection.estimatedDocumentCount retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -681,9 +630,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: estimatedDocumentCount object: *collection - name: estimatedDocumentCount expectError: isError: true isClientError: false @@ -718,10 +666,7 @@ tests: - commandFailedEvent: commandName: count - - - - description: 'collection.distinct retries at most maxAttempts=5 times' - + - description: 'collection.distinct retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -735,9 +680,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: distinct object: *collection - name: distinct arguments: fieldName: x filter: {} @@ -775,10 +719,7 @@ tests: - commandFailedEvent: commandName: distinct - - - - description: 'collection.find retries at most maxAttempts=5 times' - + - description: 'collection.find retries at most maxAttempts=5 times' operations: - name: failPoint object: 
testRunner @@ -792,9 +733,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: find object: *collection - name: find arguments: filter: {} expectError: @@ -831,10 +771,7 @@ tests: - commandFailedEvent: commandName: find - - - - description: 'collection.findOne retries at most maxAttempts=5 times' - + - description: 'collection.findOne retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -848,9 +785,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: findOne object: *collection - name: findOne arguments: filter: {} expectError: @@ -887,10 +823,7 @@ tests: - commandFailedEvent: commandName: find - - - - description: 'collection.listIndexes retries at most maxAttempts=5 times' - + - description: 'collection.listIndexes retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -904,9 +837,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: listIndexes object: *collection - name: listIndexes expectError: isError: true isClientError: false @@ -941,10 +873,7 @@ tests: - commandFailedEvent: commandName: listIndexes - - - - description: 'collection.listIndexNames retries at most maxAttempts=5 times' - + - description: 'collection.listIndexNames retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -958,9 +887,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: listIndexNames object: *collection - name: listIndexNames expectError: isError: true isClientError: false @@ -995,10 +923,7 @@ tests: - commandFailedEvent: commandName: listIndexes - - - - description: 'collection.createChangeStream retries at most maxAttempts=5 times' - + - description: 'collection.createChangeStream retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -1012,9 +937,8 @@ tests: errorLabels: [RetryableError, 
SystemOverloadedError] errorCode: 2 - - + - name: createChangeStream object: *collection - name: createChangeStream arguments: pipeline: [] expectError: @@ -1051,10 +975,7 @@ tests: - commandFailedEvent: commandName: aggregate - - - - description: 'collection.insertOne retries at most maxAttempts=5 times' - + - description: 'collection.insertOne retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -1068,9 +989,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: insertOne object: *collection - name: insertOne arguments: document: { _id: 2, x: 22 } expectError: @@ -1107,10 +1027,7 @@ tests: - commandFailedEvent: commandName: insert - - - - description: 'collection.insertMany retries at most maxAttempts=5 times' - + - description: 'collection.insertMany retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -1124,9 +1041,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: insertMany object: *collection - name: insertMany arguments: documents: - { _id: 2, x: 22 } @@ -1164,10 +1080,7 @@ tests: - commandFailedEvent: commandName: insert - - - - description: 'collection.deleteOne retries at most maxAttempts=5 times' - + - description: 'collection.deleteOne retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -1181,9 +1094,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: deleteOne object: *collection - name: deleteOne arguments: filter: {} expectError: @@ -1220,10 +1132,7 @@ tests: - commandFailedEvent: commandName: delete - - - - description: 'collection.deleteMany retries at most maxAttempts=5 times' - + - description: 'collection.deleteMany retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -1237,9 +1146,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: deleteMany 
object: *collection - name: deleteMany arguments: filter: {} expectError: @@ -1276,10 +1184,7 @@ tests: - commandFailedEvent: commandName: delete - - - - description: 'collection.replaceOne retries at most maxAttempts=5 times' - + - description: 'collection.replaceOne retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -1293,9 +1198,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: replaceOne object: *collection - name: replaceOne arguments: filter: {} replacement: { x: 22 } @@ -1333,10 +1237,7 @@ tests: - commandFailedEvent: commandName: update - - - - description: 'collection.updateOne retries at most maxAttempts=5 times' - + - description: 'collection.updateOne retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -1350,9 +1251,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: updateOne object: *collection - name: updateOne arguments: filter: {} update: { $set: { x: 22 } } @@ -1390,10 +1290,7 @@ tests: - commandFailedEvent: commandName: update - - - - description: 'collection.updateMany retries at most maxAttempts=5 times' - + - description: 'collection.updateMany retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -1407,9 +1304,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: updateMany object: *collection - name: updateMany arguments: filter: {} update: { $set: { x: 22 } } @@ -1447,10 +1343,7 @@ tests: - commandFailedEvent: commandName: update - - - - description: 'collection.findOneAndDelete retries at most maxAttempts=5 times' - + - description: 'collection.findOneAndDelete retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -1464,9 +1357,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: findOneAndDelete object: *collection - name: findOneAndDelete 
arguments: filter: {} expectError: @@ -1503,10 +1395,7 @@ tests: - commandFailedEvent: commandName: findAndModify - - - - description: 'collection.findOneAndReplace retries at most maxAttempts=5 times' - + - description: 'collection.findOneAndReplace retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -1520,9 +1409,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: findOneAndReplace object: *collection - name: findOneAndReplace arguments: filter: {} replacement: { x: 22 } @@ -1560,10 +1448,7 @@ tests: - commandFailedEvent: commandName: findAndModify - - - - description: 'collection.findOneAndUpdate retries at most maxAttempts=5 times' - + - description: 'collection.findOneAndUpdate retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -1577,9 +1462,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: findOneAndUpdate object: *collection - name: findOneAndUpdate arguments: filter: {} update: { $set: { x: 22 } } @@ -1617,10 +1501,7 @@ tests: - commandFailedEvent: commandName: findAndModify - - - - description: 'collection.bulkWrite retries at most maxAttempts=5 times' - + - description: 'collection.bulkWrite retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -1634,9 +1515,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: bulkWrite object: *collection - name: bulkWrite arguments: requests: - insertOne: @@ -1675,10 +1555,7 @@ tests: - commandFailedEvent: commandName: insert - - - - description: 'collection.createIndex retries at most maxAttempts=5 times' - + - description: 'collection.createIndex retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -1692,9 +1569,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: createIndex object: *collection - name: createIndex 
arguments: keys: { x: 11 } name: "x_11" @@ -1732,10 +1608,7 @@ tests: - commandFailedEvent: commandName: createIndexes - - - - description: 'collection.dropIndex retries at most maxAttempts=5 times' - + - description: 'collection.dropIndex retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -1749,9 +1622,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: dropIndex object: *collection - name: dropIndex arguments: name: "x_11" expectError: @@ -1788,10 +1660,7 @@ tests: - commandFailedEvent: commandName: dropIndexes - - - - description: 'collection.dropIndexes retries at most maxAttempts=5 times' - + - description: 'collection.dropIndexes retries at most maxAttempts=5 times' operations: - name: failPoint object: testRunner @@ -1805,9 +1674,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - + - name: dropIndexes object: *collection - name: dropIndexes expectError: isError: true isClientError: false @@ -1841,4 +1709,3 @@ tests: commandName: dropIndexes - commandFailedEvent: commandName: dropIndexes - diff --git a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template index 3117d44b89..3a9cb27ab5 100644 --- a/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template +++ b/source/client-backpressure/tests/backpressure-retry-max-attempts.yml.template @@ -10,24 +10,21 @@ runOnRequirements: topologies: [replicaset, sharded, load-balanced] createEntities: - - - client: + - client: id: &client client useMultipleMongoses: false observeEvents: [commandStartedEvent, commandSucceededEvent, commandFailedEvent] - - - client: + - client: id: &fail_point_client fail_point_client useMultipleMongoses: false - - - database: + - database: id: &database database client: *client databaseName: &database_name retryable-writes-tests - - - collection: + + - 
collection: id: &collection collection database: *database collectionName: &collection_name coll @@ -36,22 +33,18 @@ _yamlAnchors: bulkWriteInsertNamespace: &client_bulk_write_ns retryable-writes-tests.coll initialData: - - - collectionName: *collection_name + - collectionName: *collection_name databaseName: *database_name documents: - { _id: 1, x: 11 } - { _id: 2, x: 22 } -tests: -{% for operation in operations %} - - - description: '{{operation.object}}.{{operation.operation_name}} retries at most maxAttempts=5 times' +tests: {% for operation in operations %} + - description: '{{operation.object}}.{{operation.operation_name}} retries at most maxAttempts=5 times' {%- if ((operation.operation_name == 'clientBulkWrite')) %} runOnRequirements: - minServerVersion: '8.0' # client bulk write added to server in 8.0 {%- endif %} - operations: - name: failPoint object: testRunner @@ -65,10 +58,8 @@ tests: errorLabels: [RetryableError, SystemOverloadedError] errorCode: 2 - - - object: *{{operation.object}} - name: {{operation.operation_name}} - {%- if operation.arguments|length > 0 %} + - name: {{operation.operation_name}} + object: *{{operation.object}} {%- if operation.arguments|length > 0 %} arguments: {%- for arg in operation.arguments %} {{arg}} @@ -107,5 +98,4 @@ tests: commandName: {{operation.command_name}} - commandFailedEvent: commandName: {{operation.command_name}} - {% endfor -%} diff --git a/source/client-backpressure/tests/getMore-retried.json b/source/client-backpressure/tests/getMore-retried.json index 70eff84612..fed6ab8cb9 100644 --- a/source/client-backpressure/tests/getMore-retried.json +++ b/source/client-backpressure/tests/getMore-retried.json @@ -1,5 +1,5 @@ { - "description": "getMore-retries-backpressure", + "description": "getMore-retried-backpressure", "schemaVersion": "1.3", "runOnRequirements": [ { @@ -83,6 +83,7 @@ }, { "name": "find", + "object": "coll", "arguments": { "batchSize": 2, "filter": {}, @@ -90,7 +91,6 @@ "a": 1 } }, - "object": 
"coll", "expectResult": [ { "a": 1 diff --git a/source/client-backpressure/tests/getMore-retried.yml b/source/client-backpressure/tests/getMore-retried.yml index 3a5d180aa5..45f5ad9deb 100644 --- a/source/client-backpressure/tests/getMore-retried.yml +++ b/source/client-backpressure/tests/getMore-retried.yml @@ -1,8 +1,7 @@ -description: getMore-retries-backpressure +description: getMore-retried-backpressure schemaVersion: "1.3" runOnRequirements: - - - minServerVersion: '4.4' # failCommand + - minServerVersion: '4.4' # failCommand createEntities: - client: id: &client client0 @@ -44,17 +43,18 @@ tests: errorCode: 2 - name: find + object: *collection arguments: # batch size of 2 with 3 docs in the collection ensures exactly one find + one getMore exhaust the cursor batchSize: 2 filter: {} # ensure stable ordering of result documents sort: { a: 1 } - object: *collection expectResult: - { a: 1 } - { a: 2 } - { a: 3 } + expectEvents: - client: *client events: From 530e727dd5cc0d0eb2606ea6db1cf144968597e7 Mon Sep 17 00:00:00 2001 From: bailey Date: Fri, 9 Jan 2026 15:04:24 -0700 Subject: [PATCH 44/55] Valentin's comments --- .../client-backpressure.md | 96 ++++++++++--------- source/transactions/transactions.md | 19 ++-- 2 files changed, 61 insertions(+), 54 deletions(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index 1ec27afdeb..7c531b8d08 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -37,12 +37,12 @@ the connection and request rate limiters to prevent and mitigate overloading the #### RetryableError label -This error label indicates that an command is safely retryable regardless of the command type (read or write), its +This error label indicates that a command is safely retryable regardless of the command type (read or write), its metadata, or any of its arguments. 
#### SystemOverloadedError label -An error is considered overloaded if it includes the "SystemOverloadError" label. This error label indicates that the +An error is considered overloaded if it includes the `SystemOverloadedError` label. This error label indicates that the server is overloaded. If this error label is present, drivers will backoff before attempting a retry. #### Overload Errors @@ -60,15 +60,8 @@ exceeds the ingress request rate limit: } ``` -When a new connection attempt exceeds the ingress connection rate limit, the server closes the TCP connection before TLS -handshake is complete. Drivers will observe this as a network error (e.g. "connection reset by peer" or "connection -closed"). - -When a new connection attempt is queued by the server for so long that the driver-side timeout expires, drivers will -observe this as a network timeout error. - -Note that there is no guarantee that all SystemOverloaded errors are retryable or that all RetryableErrors also have the -SystemOverloaded error label. +Note that there is no guarantee that all errors with the `SystemOverloadedError` label can be retried nor that all +errors with the `RetryableError` label also have the `SystemOverloadedError` error label. #### Goodput @@ -82,55 +75,65 @@ See [goodput](https://en.wikipedia.org/wiki/Goodput). #### Overload retry policy This specification expands the driver's retry ability to all commands if the error indicates that it is both an overload -error and that it is retryable, including those not currently considered retryable such as updateMany, create -collection, getMore, and generic runCommand. The new command execution method obeys the following rules: +error and that it is retryable, including those not eligible for retry under the read/write retry policies such as +updateMany, create collection, getMore, and generic runCommand. The new command execution method obeys the following +rules: 1. 
If the command succeeds on the first attempt, drivers MUST deposit `RETRY_TOKEN_RETURN_RATE` tokens. - The value is 0.1 and non-configurable. 2. If the command succeeds on a retry attempt, drivers MUST deposit `RETRY_TOKEN_RETURN_RATE`+1 tokens. 3. If a retry attempt fails with an error that does not include `SystemOverloadedError` label, drivers MUST deposit 1 token. - - A non-SystemOverloaded error indicates that the server is healthy enough to handle requests. For the purposes of - retry budget tracking, this counts as a success. + - An error without the `SystemOverloaded` error label indicates that the server is healthy enough to handle requests. + For the purposes of retry budget tracking, this counts as a success. 4. A retry attempt will only be permitted if: - 1. The error has both the `SystemOverloadedError` and the `RetryableError` label. + 1. The error has both the `SystemOverloadedError` and the `RetryableError` error labels. 2. We have not reached `MAX_RETRIES`. - The value of `MAX_RETRIES` is 5 and non-configurable. - This intentionally changes the behavior of CSOT which otherwise would retry an unlimited number of times within the timeout to avoid retry storms. - 3. (CSOT-only): `timeoutMS` has not expired. - 4. A token can be acquired from the token bucket. + 3. (CSOT-only): There is still time for a retry attempt according to the + [Client Side Operations Timeout](../client-side-operations-timeout/client-side-operations-timeout.md) + specification. + 4. A token can be consumed from the token bucket. 5. A retry attempt consumes 1 token from the token bucket. 6. If the request is eligible for retry (as outlined in step 4), the client MUST apply exponential backoff according to - the following formula: `delayMS = j * min(maxBackoff, baseBackoff * 2^(i - 1))` - - `i` is the retry attempt number (starting with 1 for the first retry). Note that `i` includes retries for - non-overloaded errors. - - `j` is a random jitter value between 0 and 1. 
- - `baseBackoff` is constant 100ms. - - `maxBackoff` is 10000ms. + the following formula: `backoff = j * min(maxBackoff, baseBackoff * 2^(i - 1))` + - `i` is the retry attempt number (starting with 1 for the first retry). Note that `i` includes retries for errors + without the `SystemOverloaded` error label. + - `jitter` is a random jitter value between 0 and 1. + - `BASE_BACKOFF` is constant 100ms. + - `MAX_BACKOFF` is 10000ms. - This results in delays of 100ms, 200ms, 400ms, 800ms, and 1600ms before accounting for jitter. 7. If the request is eligible for retry (as outlined in step 4), the client MUST add the previously used server's address to the list of deprioritized server addresses for server selection. +8. If the request is eligible for retry (as outlined in step 4) and is a retryable write: + 1. If the command is a part of a transaction, it MUST NOT be modified for retry, as outlined in the + [transactions](../transactions/transactions.md#interaction-with-retryable-writes). + 2. If the command is a not a part of a transaction, it MUST be modified for retry, as outlined + [retryable writes](../retryable-writes/retryable-writes.md) specifications. +9. If the request is not eligible for any retries, then the client MUST propagate the most recently encountered error + that does not have the NoWritesPerformed error label. -#### Interaction with Existing Retry Behavior +#### Interaction with Other Retry Policies -The retry policy in this specification is separate from the existing retryability policies defined in the +The retry policy in this specification is separate from the other retry policies defined in the [retryable reads](../retryable-reads/retryable-reads.md) and [retryable writes](../retryable-writes/retryable-writes.md) specifications. Drivers MUST ensure: -- Only errors with the `SystemOverloadedError` consume tokens from the token bucket before retrying. -- Only errors with the `SystemOverloadedError` label apply backoff. 
+- When a failed attempt is retried, backoff must be applied if and only if the attempt error has the + `SystemOverloadedError` label. - All retryable errors apply backoff if they also contain a `SystemOverloadedError` label. This includes: - Errors defined as retryable in the [retryable reads specification](../retryable-reads/retryable-reads.md). - Errors defined as retryable in the [retryable writes specification](../retryable-writes/retryable-writes.md). - Errors with the `RetryableError` label. -- Any command is retried at most MAX_ATTEMPTS (default=5) times, if any attempt has failed with a +- Any command may be retried at most MAX_RETRIES (default=5) times, if any attempt has failed with a `SystemOverloadedError`, regardless of which retry policy the current or future retry attempts are caused by. #### Pseudocode -The following pseudocode demonstrates the unified retry behavior, combining the overload retry policy defined in this -specification with the existing retry behaviors from [Retryable Reads](../retryable-reads/retryable-reads.md) and +The following pseudocode demonstrates the unified retry policy, combining the overload retry policy defined in this +specification with the existing retry policies from [Retryable Reads](../retryable-reads/retryable-reads.md) and [Retryable Writes](../retryable-writes/retryable-writes.md). For brevity, some error handling details such as the handling of "NoWritesPerformed" are omitted. 
@@ -163,7 +166,7 @@ def execute_command_retryable(command, ...): is_retryable = is_retryable_write() or is_retryable_read() or (exc.has_error_label("RetryableError") and exc.has_error_label("SystemOverloadedError")) is_overload = exc.has_error_label("SystemOverloadedError") - # if a retry fails with a non-System overloaded error, deposit 1 token + # if a retry fails with an error which is not an overload error, deposit 1 token if attempt > 0 and not is_overload: token_bucket.deposit(1) @@ -197,7 +200,7 @@ def execute_command_retryable(command, ...): ### Token Bucket -The overload retry policy introduces a per-client token bucket to limit SystemOverloaded retry attempts. Although the +The overload retry policy introduces a per-client token bucket to limit overload error retry attempts. Although the server rejects excess commands as quickly as possible, doing so costs CPU and creates extra contention on the connection pool which can eventually negatively affect goodput. To reduce this risk, the token bucket will limit retry attempts during a prolonged overload. @@ -282,8 +285,8 @@ number of errors users see during spikes or burst workloads and help prevent ret However, older drivers do not have this benefit. Drivers MUST document that: - Users SHOULD upgrade to driver versions that officially support backpressure to avoid any impacts of server changes. -- Users who do not upgrade might need to update application error handling to handle higher error rates of - SystemOverloadedErrors. +- Users who do not upgrade might need to update application error handling to handle higher error rates of overload + errors. ## Test Plan @@ -296,9 +299,9 @@ extreme load, however clients do not know how to handle the errors returned when As a result, such overload errors would currently either be propagated back to applications, increasing externally-visible command failure rates, or be retried immediately, increasing the load on already overburdened servers. 
To minimize these effects, this specification enables clients to retry requests that have been load shed in a -way that does not overburden already overloaded servers. This retry behavior allows for more aggressive and effective -load shedding policies to be deployed in the future. This will also help unify the currently-divergent retry behavior -between drivers and the server (mongos). +way that does not overburden already overloaded servers. This retry policy allows for more aggressive and effective load +shedding policies to be deployed in the future. This will also help unify the currently-divergent retry policy between +drivers and the server (mongos). ## Reference Implementation @@ -326,17 +329,18 @@ for logical reasons. So, when determining the number of retries an operation sho - Any load-shedded errors should be retried to give them a real attempt at success - If the command ultimately would have failed if it had not been load shed by the server, returning an actionable error - message is preferable to a generic SystemOverloadedError. + message is preferable to a generic overload error. The maximum retry attempt logic in this specification balances legacy retryability behavior with load-shedding behavior: -- Relying on either 1 or infinite timeouts (depending on CSOT) preserves existing retry behavior. -- Adjusting the maximum number of retry attempts to 5 if a `SystemOverloadedError` error is returned from the server - gives requests more opportunities to succeed and helps reduce application errors. -- An alternative approach would be to retry once if we don't receive a SystemOverloadedError, in which case we'd retry 5 - times. The approach chosen allows for additional retries in scenarios where a non-`SystemOverloadedError` fails on a - retry with a `SystemOverloadedError`. +- Relying on either 1 or infinite retries (depending on whether CSOT enabled or not) preserves existing retry behavior + when no overload errors are encountered. 
+- Adjusting the maximum number of retry attempts to 5 if an overload error is returned from the server gives requests + more opportunities to succeed and helps reduce application errors. +- An alternative approach would be to retry once if we don't receive an overload error, in which case we'd retry 5 + times. The approach chosen allows for additional retries in scenarios where a non-overload error fails on a retry + with an overload error. ## Changelog -- 2025-XX-XX: Initial version. +- 2026-01-09: Initial version. diff --git a/source/transactions/transactions.md b/source/transactions/transactions.md index 70922bd243..1ad0aad4b2 100644 --- a/source/transactions/transactions.md +++ b/source/transactions/transactions.md @@ -350,7 +350,8 @@ transaction, drivers MUST NOT run the commitTransaction command. commitTransaction is a retryable write command. Drivers MUST retry once after commitTransaction fails with a retryable error, including a handshake network error, according to the Retryable Writes Specification, regardless of whether -retryWrites is set on the MongoClient or not. +retryWrites is set on the MongoClient or not. If a commitTransaction fails with a retryable backpressure error, the +command MUST be retried as specified in [Interaction with Client Backpressure](#interaction-with-client-backpressure). When commitTransaction is retried, either by the driver's internal retry logic or explicitly by the user calling commitTransaction again, drivers MUST apply `w: majority` to the write concern of the commitTransaction command. If the @@ -390,10 +391,11 @@ abortTransaction is a retryable write command. Drivers MUST retry after abortTra according to the [Retryable Writes Specification](../retryable-writes/retryable-writes.md), including a handshake network error, regardless of whether retryWrites is set on the MongoClient or not. -If the operation times out or fails with a non-retryable error, drivers MUST ignore all errors from the abortTransaction -command. 
Errors from abortTransaction are meaningless to the application because they cannot do anything to recover from -the error. The transaction will ultimately be aborted by the server anyway either upon reaching an age limit or when the -application starts a new transaction on this session, see +If the operation times out or fails with a non-retryable error, drivers MUST ignore all errors except for retryable +backpressure errors (see [Interaction With Client Backpressure](#interaction-with-client-backpressure)) from the +abortTransaction command. Errors from abortTransaction are meaningless to the application because they cannot do +anything to recover from the error. The transaction will ultimately be aborted by the server anyway either upon reaching +an age limit or when the application starts a new transaction on this session, see [Drivers ignore all abortTransaction errors](#drivers-ignore-all-aborttransaction-errors). #### endSession changes @@ -585,8 +587,9 @@ as well as any read or write commands attempted during the transaction. If a command fails with a retryable backpressure error and it includes `startTransaction:true`, the retried command MUST also include `startTransaction:true`. -If a command fails backpressure retries `MAX_RETRIES` times, it MUST not be retried again, including the -`commitTransaction` command. +If a command fails backpressure retries `MAX_RETRIES` times (see +[Overload Retry Policy](../client-backpressure/client-backpressure.md#overload-retry-policy)), it MUST NOT be retried +again, including the `commitTransaction` command. ### **Server Commands** @@ -1110,7 +1113,7 @@ objective of avoiding duplicate commits. ## **Changelog** -- 2025-12-18: Specify the handling of client backpressure. +- 2026-01-09: Specify the handling of client backpressure. - 2024-11-01: Clarify collection options inside txn. 
From c31f486911eb32a86f290051f5909b4fc658ac84 Mon Sep 17 00:00:00 2001 From: bailey Date: Thu, 22 Jan 2026 16:06:44 -0700 Subject: [PATCH 45/55] Comments pt 2 --- .../client-backpressure.md | 133 ++++++++++-------- source/transactions/transactions.md | 41 +++--- 2 files changed, 98 insertions(+), 76 deletions(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index 7c531b8d08..fd54b3c33d 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -26,8 +26,10 @@ connection spikes from overloading the system. #### Ingress Request Rate Limiter -A token bucket based system introduced in MongoDB 8.2 to admit a command or reject it with a System Overload Error at -the front door of a mongod/s. It aims to prevent command spikes from overloading the system. +A token bucket based system introduced in MongoDB 8.2 to admit a command or reject it with an overload error at the +front door of a mongod/s. It aims to prevent command spikes from overloading the system. + +The ingress operation rate limiter only applies to commands sent on authenticated connections. #### MongoTune @@ -42,13 +44,15 @@ metadata, or any of its arguments. #### SystemOverloadedError label -An error is considered overloaded if it includes the `SystemOverloadedError` label. This error label indicates that the -server is overloaded. If this error label is present, drivers will backoff before attempting a retry. +An error is considered to be an overload error if it contains the `SystemOverloadedError` label. This error label +indicates that the server is overloaded. If this error label is present, drivers will backoff before attempting a retry. + +#### Retryable Overload Error -#### Overload Errors +An error which indicates that it is an overload error (contains the `SystemOverloadedError` label) and contains the +`RetryableError` label. 
-An overload error is any command or network error that occurs due to a server overload. For example, when a request -exceeds the ingress request rate limit: +For example, when a request exceeds the ingress request rate limit, the following error may be returned: ```js { @@ -60,8 +64,8 @@ exceeds the ingress request rate limit: } ``` -Note that there is no guarantee that all errors with the `SystemOverloadedError` label can be retried nor that all -errors with the `RetryableError` label also have the `SystemOverloadedError` error label. +Note that an error is not guaranteed to contain both the SystemOverloadedError and the RetryableError labels, if it has +one of them. #### Goodput @@ -74,20 +78,24 @@ See [goodput](https://en.wikipedia.org/wiki/Goodput). #### Overload retry policy -This specification expands the driver's retry ability to all commands if the error indicates that it is both an overload -error and that it is retryable, including those not eligible for retry under the read/write retry policies such as +This specification expands the driver's retry ability to all commands if the error indicates that it is a retryable +overload error, including those not eligible for retry under the +[read](../retryable-reads/retryable-reads.md)/[write](../retryable-writes/retryable-writes.md) retry policies such as updateMany, create collection, getMore, and generic runCommand. The new command execution method obeys the following rules: -1. If the command succeeds on the first attempt, drivers MUST deposit `RETRY_TOKEN_RETURN_RATE` tokens. +1. `attempt` is the execution attempt number (starting with 0). Note that `attempt` includes retries for errors that do + not contain the `SystemOverloadedError` error label (this might include attempts under other retry policies, see + [Interactions with Other Retry Policies](./client-backpressure.md#interaction-with-other-retry-policies)). +2. 
If the command succeeds on the first attempt, drivers MUST deposit `RETRY_TOKEN_RETURN_RATE` tokens. - The value is 0.1 and non-configurable. -2. If the command succeeds on a retry attempt, drivers MUST deposit `RETRY_TOKEN_RETURN_RATE`+1 tokens. -3. If a retry attempt fails with an error that does not include `SystemOverloadedError` label, drivers MUST deposit 1 +3. If the command succeeds on a retry attempt, drivers MUST deposit `RETRY_TOKEN_RETURN_RATE`+1 tokens. +4. If a retry attempt fails with an error that does not contain `SystemOverloadedError` label, drivers MUST deposit 1 token. - - An error without the `SystemOverloaded` error label indicates that the server is healthy enough to handle requests. - For the purposes of retry budget tracking, this counts as a success. -4. A retry attempt will only be permitted if: - 1. The error has both the `SystemOverloadedError` and the `RetryableError` error labels. + - An error that does not contain the `SystemOverloadedError` error label indicates that the server is healthy enough + to handle requests. For the purposes of retry budget tracking, this counts as a success. +5. A retry attempt will only be permitted if: + 1. The error is a retryable overload error. 2. We have not reached `MAX_RETRIES`. - The value of `MAX_RETRIES` is 5 and non-configurable. - This intentionally changes the behavior of CSOT which otherwise would retry an unlimited number of times within @@ -96,24 +104,39 @@ rules: [Client Side Operations Timeout](../client-side-operations-timeout/client-side-operations-timeout.md) specification. 4. A token can be consumed from the token bucket. -5. A retry attempt consumes 1 token from the token bucket. -6. If the request is eligible for retry (as outlined in step 4), the client MUST apply exponential backoff according to - the following formula: `backoff = j * min(maxBackoff, baseBackoff * 2^(i - 1))` - - `i` is the retry attempt number (starting with 1 for the first retry). 
Note that `i` includes retries for errors - without the `SystemOverloaded` error label. +6. A retry attempt consumes 1 token from the token bucket. +7. If the request is eligible for retry (as outlined in step 5), the client MUST apply exponential backoff according to + the following formula: `backoff = jitter * min(maxBackoff, baseBackoff * 2^(attempt - 1))` - `jitter` is a random jitter value between 0 and 1. - - `BASE_BACKOFF` is constant 100ms. - - `MAX_BACKOFF` is 10000ms. + - `baseBackoff` is constant 100ms. + - `maxBackoff` is 10000ms. - This results in delays of 100ms, 200ms, 400ms, 800ms, and 1600ms before accounting for jitter. -7. If the request is eligible for retry (as outlined in step 4), the client MUST add the previously used server's +8. If the request is eligible for retry (as outlined in step 5), the client MUST add the previously used server's address to the list of deprioritized server addresses for server selection. -8. If the request is eligible for retry (as outlined in step 4) and is a retryable write: - 1. If the command is a part of a transaction, it MUST NOT be modified for retry, as outlined in the - [transactions](../transactions/transactions.md#interaction-with-retryable-writes). - 2. If the command is a not a part of a transaction, it MUST be modified for retry, as outlined +9. If the request is eligible for retry (as outlined in step 5) and is a retryable write: + 1. If the command is a part of a transaction, the instructions for command modification on retry for commands in + transactions MUST be followed, as outlined in the + [transactions](../transactions/transactions.md#interaction-with-retryable-writes) specification. + 2. If the command is not a part of a transaction, the instructions for command modification on retry for + retryable writes MUST be followed, as outlined in the [retryable writes](../retryable-writes/retryable-writes.md) specifications. -9. 
If the request is not eligible for any retries, then the client MUST propagate the most recently encountered error - that does not have the NoWritesPerformed error label. +10. If the request is not eligible for any retries, then the client MUST propagate errors following the behaviors + described in the [retryable reads](../retryable-reads/retryable-reads.md) and + [retryable writes](../retryable-writes/retryable-writes.md). + +##### Relevant driver processes + +The retry policy defined above is only relevant for commands sent on authenticated connections, which + +- any user-facing API which wraps a server command (i.e., a CRUD command or runCommand) +- cursors and change streams (including getMores and killCursors) +- APIs which might perform multiple operations internally (such rewrapManyDataKey(), which performs a find() and a bulk + update) + +APIs not subject to the overload retry policy include commands executed on unauthenticated connections: + +- monitoring commands and round-trip time pingers +- commands executed during authentication (i.e., `saslStart`) #### Interaction with Other Retry Policies @@ -121,19 +144,17 @@ The retry policy in this specification is separate from the other retry policies [retryable reads](../retryable-reads/retryable-reads.md) and [retryable writes](../retryable-writes/retryable-writes.md) specifications. Drivers MUST ensure: -- When a failed attempt is retried, backoff must be applied if and only if the attempt error has the +- Only errors with the `SystemOverloadedError` consume tokens from the token bucket before retrying. +- When a failed attempt is retried, backoff must be applied if and only if the attempt error contains the `SystemOverloadedError` label. -- All retryable errors apply backoff if they also contain a `SystemOverloadedError` label. This includes: - - Errors defined as retryable in the [retryable reads specification](../retryable-reads/retryable-reads.md). 
- - Errors defined as retryable in the [retryable writes specification](../retryable-writes/retryable-writes.md). - - Errors with the `RetryableError` label. -- Any command may be retried at most MAX_RETRIES (default=5) times, if any attempt has failed with a - `SystemOverloadedError`, regardless of which retry policy the current or future retry attempts are caused by. +- If an overload error is encountered, a command may be retried at most `MAX_RETRIES` times, regardless of which retry + policy the current or future retry attempts are caused by. If a command under CSOT has already retried more than + `MAX_RETRIES` times before encountering a retryable overload error, the command must not be retried further. #### Pseudocode The following pseudocode demonstrates the unified retry policy, combining the overload retry policy defined in this -specification with the existing retry policies from [Retryable Reads](../retryable-reads/retryable-reads.md) and +specification with the retry policies from [Retryable Reads](../retryable-reads/retryable-reads.md) and [Retryable Writes](../retryable-writes/retryable-writes.md). For brevity, some error handling details such as the handling of "NoWritesPerformed" are omitted. 
@@ -163,8 +184,8 @@ def execute_command_retryable(command, ...): token_bucket.deposit(tokens) return res except PyMongoError as exc: - is_retryable = is_retryable_write() or is_retryable_read() or (exc.has_error_label("RetryableError") and exc.has_error_label("SystemOverloadedError")) - is_overload = exc.has_error_label("SystemOverloadedError") + is_retryable = is_retryable_write() or is_retryable_read() or (exc.contains_error_label("RetryableError") and exc.contains_error_label("SystemOverloadedError")) + is_overload = exc.contains_error_label("SystemOverloadedError") # if a retry fails with an error which is not an overload error, deposit 1 token if attempt > 0 and not is_overload: @@ -200,10 +221,10 @@ def execute_command_retryable(command, ...): ### Token Bucket -The overload retry policy introduces a per-client token bucket to limit overload error retry attempts. Although the -server rejects excess commands as quickly as possible, doing so costs CPU and creates extra contention on the connection -pool which can eventually negatively affect goodput. To reduce this risk, the token bucket will limit retry attempts -during a prolonged overload. +The overload retry policy introduces a per-client [token bucket](https://en.wikipedia.org/wiki/Token_bucket) to limit +overload error retry attempts. Although the server rejects excess commands as quickly as possible, doing so costs CPU +and creates extra contention on the connection pool which can eventually negatively affect goodput. To reduce this risk, +the token bucket will limit retry attempts during a prolonged overload. The token bucket capacity is set to 1000 for consistency with the server. @@ -260,11 +281,11 @@ retry attempts for load shed commands. 
This specification does not define a form [Command Logging and Monitoring](../command-logging-and-monitoring/command-logging-and-monitoring.md) specification, drivers MUST guarantee that each `CommandStartedEvent` has either a correlating `CommandSucceededEvent` or `CommandFailedEvent` and that every "command started" log message has either a correlating "command succeeded" log -message or "command failed" log message. If the first attempt of a retryable command encounters a retryable error, -drivers MUST fire a `CommandFailedEvent` and emit a "command failed" log message for the retryable error and fire a -separate `CommandStartedEvent` and emit a separate "command started" log message when executing the subsequent retry -attempt. Note that the second `CommandStartedEvent` and "command started" log message may have a different -`connectionId`, since a server is reselected for a retry attempt. +message or "command failed" log message. If an attempt of a retryable command encounters a retryable error, drivers MUST +fire a `CommandFailedEvent` and emit a "command failed" log message for the retryable error and fire a separate +`CommandStartedEvent` and emit a separate "command started" log message when executing the subsequent retry attempt. +Note that for retries, `CommandStartedEvent`s and "command started" log message may have different `connectionId`s, +since a server is reselected for a retry attempt. ### Documentation @@ -285,8 +306,7 @@ number of errors users see during spikes or burst workloads and help prevent ret However, older drivers do not have this benefit. Drivers MUST document that: - Users SHOULD upgrade to driver versions that officially support backpressure to avoid any impacts of server changes. -- Users who do not upgrade might need to update application error handling to handle higher error rates of overload - errors. +- Users who do not upgrade might need to update application error handling to handle higher rates of overload errors. 
## Test Plan @@ -333,8 +353,11 @@ for logical reasons. So, when determining the number of retries an operation sho The maximum retry attempt logic in this specification balances legacy retryability behavior with load-shedding behavior: -- Relying on either 1 or infinite retries (depending on whether CSOT enabled or not) preserves existing retry behavior - when no overload errors are encountered. +- Relying on either 1 or infinite retries (depending on whether CSOT enabled or not) preserves retry behaviors defined + in the [retryable reads](../retryable-reads/retryable-reads.md), + [retryable writes](../retryable-writes/retryable-writes.md) and + [CSOT](../client-side-operations-timeout/client-side-operations-timeout.md) specifications when no overload errors are + encountered. - Adjusting the maximum number of retry attempts to 5 if an overload error is returned from the server gives requests more opportunities to succeed and helps reduce application errors. - An alternative approach would be to retry once if we don't receive an overload error, in which case we'd retry 5 diff --git a/source/transactions/transactions.md b/source/transactions/transactions.md index 1ad0aad4b2..36389cc1e4 100644 --- a/source/transactions/transactions.md +++ b/source/transactions/transactions.md @@ -48,10 +48,6 @@ including (but not limited to) creating, updating, or deleting databases, collec An error considered retryable by the [Retryable Writes Specification](../retryable-writes/retryable-writes.md). -#### Retryable Backpressure Error - -An error considered retryable by the [Client Backpressure Specification](../client-backpressure/client-backpressure.md). - #### Command Error A server response with ok:0. A server response with ok:1 and writeConcernError or writeErrors is not considered a @@ -350,8 +346,9 @@ transaction, drivers MUST NOT run the commitTransaction command. commitTransaction is a retryable write command. 
Drivers MUST retry once after commitTransaction fails with a retryable error, including a handshake network error, according to the Retryable Writes Specification, regardless of whether -retryWrites is set on the MongoClient or not. If a commitTransaction fails with a retryable backpressure error, the -command MUST be retried as specified in [Interaction with Client Backpressure](#interaction-with-client-backpressure). +retryWrites is set on the MongoClient or not. If a commitTransaction fails with a +[retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error), the command MUST be +retried as specified in [Interaction with Client Backpressure](#interaction-with-client-backpressure). When commitTransaction is retried, either by the driver's internal retry logic or explicitly by the user calling commitTransaction again, drivers MUST apply `w: majority` to the write concern of the commitTransaction command. If the @@ -391,11 +388,10 @@ abortTransaction is a retryable write command. Drivers MUST retry after abortTra according to the [Retryable Writes Specification](../retryable-writes/retryable-writes.md), including a handshake network error, regardless of whether retryWrites is set on the MongoClient or not. -If the operation times out or fails with a non-retryable error, drivers MUST ignore all errors except for retryable -backpressure errors (see [Interaction With Client Backpressure](#interaction-with-client-backpressure)) from the -abortTransaction command. Errors from abortTransaction are meaningless to the application because they cannot do -anything to recover from the error. The transaction will ultimately be aborted by the server anyway either upon reaching -an age limit or when the application starts a new transaction on this session, see +If the operation times out or fails with a non-retryable error, drivers MUST NOT propagate errors from the +`abortTransaction` abortTransaction command. 
Errors from abortTransaction are meaningless to the application because +they cannot do anything to recover from the error. The transaction will ultimately be aborted by the server anyway +either upon reaching an age limit or when the application starts a new transaction on this session, see [Drivers ignore all abortTransaction errors](#drivers-ignore-all-aborttransaction-errors). #### endSession changes @@ -561,11 +557,13 @@ a transaction. In MongoDB 4.0 the only supported retryable write commands within a transaction are commitTransaction and abortTransaction. Therefore drivers MUST NOT retry write commands within transactions even when retryWrites has been -enabled on the MongoClient, unless the server response is a retryable backpressure error. +enabled on the MongoClient, unless the server response is a +[retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error). In addition, drivers MUST NOT add the RetryableWriteError label to any error that occurs during a write command within a transaction (excepting commitTransation and abortTransaction), even when retryWrites has been enabled on the -MongoClient, unless the server response is a retryable backpressure error. +MongoClient, unless the server response is a +[retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error). Drivers MUST retry the commitTransaction and abortTransaction commands even when retryWrites has been disabled on the MongoClient. commitTransaction and abortTransaction are retryable write commands and MUST be retried according to the @@ -584,8 +582,9 @@ All commands in a transaction are subject to the This includes the initial command with `startTransaction:true`, the `abortTransaction` and `commitTransaction` commands, as well as any read or write commands attempted during the transaction. 
-If a command fails with a retryable backpressure error and it includes `startTransaction:true`, the retried command MUST -also include `startTransaction:true`. +If executing the first command within a transaction fails with a +[retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error), and another attempt +is executed, the command executed in the retry attempt must be treated as the first command within a transaction. If a command fails backpressure retries `MAX_RETRIES` times (see [Overload Retry Policy](../client-backpressure/client-backpressure.md#overload-retry-policy)), it MUST NOT be retried @@ -1063,12 +1062,12 @@ transaction. ### Majority write concern is used when retrying commitTransaction -Drivers SHOULD apply a majority write concern when retrying commitTransaction to guard against a transaction being -applied twice. - -Drivers SHOULD NOT modify the write concern on commit transaction commands when retrying a retryable backpressure error. -A retryable backpressure error indicates no work was performed by the server, and the rationale outlined in this section -for using majority write concern on retries is therefore irrelevant. +When retrying commitTransaction, drivers SHOULD use a majority write concern to ensure the transaction is not applied +twice. However, drivers SHOULD NOT use a majority write concern when retrying after a +[retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error), unless a prior +commitTransaction attempt failed with another error type. Since a +[retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error) means the server did +not perform any work, the need for majority write concern on retries is not relevant. 
Consider the following scenario: From 12ab836fbd1d31f7305595d8a6fd10ee869f7657 Mon Sep 17 00:00:00 2001 From: bailey Date: Wed, 28 Jan 2026 11:03:32 -0700 Subject: [PATCH 46/55] comments pt 3 --- .../client-backpressure.md | 30 ++++++++++--------- source/transactions/transactions.md | 8 ++--- 2 files changed, 19 insertions(+), 19 deletions(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index fd54b3c33d..e08c1554b0 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -106,10 +106,10 @@ rules: 4. A token can be consumed from the token bucket. 6. A retry attempt consumes 1 token from the token bucket. 7. If the request is eligible for retry (as outlined in step 4), the client MUST apply exponential backoff according to - the following formula: `backoff = jitter * min(maxBackoff, baseBackoff * 2^(attempt - 1))` + the following formula: `backoff = jitter * min(MAX_BACKOFF, BASE_BACKOFF * 2^(attempt - 1))` - `jitter` is a random jitter value between 0 and 1. - - `baseBackoff` is constant 100ms. - - `maxBackoff` is 10000ms. + - `BASE_BACKOFF` is constant 100ms. + - `MAX_BACKOFF` is 10000ms. - This results in delays of 100ms, 200ms, 400ms, 800ms, and 1600ms before accounting for jitter. 8. If the request is eligible for retry (as outlined in step 4), the client MUST add the previously used server's address to the list of deprioritized server addresses for server selection. @@ -117,12 +117,13 @@ rules: 1. If the command is a part of a transaction, the instructions for command modification on retry for commands in transactions MUST be followed, as outlined in the [transactions](../transactions/transactions.md#interaction-with-retryable-writes) specification. - 2. 
If the command is a not a part of a transaction, the instructions for command modification on retry for commands - in for retryable writes MUST be followed, as outlined in the - [retryable writes](../retryable-writes/retryable-writes.md) specifications. + 2. If the command is not a part of a transaction, the instructions for command modification on retry for retryable + writes MUST be followed, as outlined in the [retryable writes](../retryable-writes/retryable-writes.md) + specification. 10. If the request is not eligible for any retries, then the client MUST propagate errors following the behaviors - described in the [retryable reads](../retryable-reads/retryable-reads.md) and - [retryable writes](../retryable-writes/retryable-writes.md). + described in the [retryable reads](../retryable-reads/retryable-reads.md), + [retryable writes](../retryable-writes/retryable-writes.md) and the + [transactions](../transactions/transactions.md) specifications. ##### Relevant driver processes @@ -144,12 +145,11 @@ The retry policy in this specification is separate from the other retry policies [retryable reads](../retryable-reads/retryable-reads.md) and [retryable writes](../retryable-writes/retryable-writes.md) specifications. Drivers MUST ensure: -- Only errors with the `SystemOverloadedError` consume tokens from the token bucket before retrying. +- Only errors that contain the `SystemOverloadedError` consume tokens from the token bucket before retrying. - When a failed attempt is retried, backoff must be applied if and only if the attempt error contains the `SystemOverloadedError` label. -- If an overload error is encountered, a command may be retried at most `MAX_RETRIES` times, regardless of which retry - policy the current or future retry attempts are caused by. If a command under CSOT has already retried more than - `MAX_RETRIES` times before encountering a retryable overload error, the command must not be retried further. 
+- If an overload error is encountered, the maximum number of retries for any retry policy becomes MAX_RETRIES. If CSOT + is enabled and a command has already retried more than MAX_RETRIES times, it MUST NOT be retried further. #### Pseudocode @@ -206,7 +206,7 @@ def execute_command_retryable(command, ...): if is_overload: jitter = random.random() # Random float between [0.0, 1.0). - backoff = jitter * min(BASE_BACKOFF * (2 ** attempt - 1), MAX_BACKOFF) + backoff = jitter * min(MAX_BACKOFF, BASE_BACKOFF * 2 ** (attempt - 1)) # If the delay exceeds the deadline, bail early. if _csot.get_timeout(): @@ -351,7 +351,9 @@ for logical reasons. So, when determining the number of retries an operation sho - If the command ultimately would have failed if it had not been load shed by the server, returning an actionable error message is preferable to a generic overload error. -The maximum retry attempt logic in this specification balances legacy retryability behavior with load-shedding behavior: +The maximum retry attempt logic in this specification balances retry policies described in the +[retryable reads](../retryable-reads/retryable-reads.md) and [retryable writes](../retryable-writes/retryable-writes.md) +specifications with load-shedding behavior: - Relying on either 1 or infinite retries (depending on whether CSOT enabled or not) preserves retry behaviors defined in the [retryable reads](../retryable-reads/retryable-reads.md), diff --git a/source/transactions/transactions.md b/source/transactions/transactions.md index 36389cc1e4..51d0b46b8a 100644 --- a/source/transactions/transactions.md +++ b/source/transactions/transactions.md @@ -386,7 +386,9 @@ transaction, drivers MUST NOT run the abortTransaction command. abortTransaction is a retryable write command. 
Drivers MUST retry after abortTransaction fails with a retryable error according to the [Retryable Writes Specification](../retryable-writes/retryable-writes.md), including a handshake -network error, regardless of whether retryWrites is set on the MongoClient or not. +network error, regardless of whether retryWrites is set on the MongoClient or not. If an abortTransaction fails with a +[retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error), the command MUST be +retried as specified in [Interaction with Client Backpressure](#interaction-with-client-backpressure). If the operation times out or fails with a non-retryable error, drivers MUST NOT propagate errors from the `abortTransaction` abortTransaction command. Errors from abortTransaction are meaningless to the application because @@ -586,10 +588,6 @@ If executing the first command within a transaction fails with a [retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error), and another attempt is executed, the command executed in the retry attempt must be treated as the first command within a transaction. -If a command fails backpressure retries `MAX_RETRIES` times (see -[Overload Retry Policy](../client-backpressure/client-backpressure.md#overload-retry-policy)), it MUST NOT be retried -again, including the `commitTransaction` command. 
- ### **Server Commands** #### commitTransaction From 1b32e003e560fffc37abd0da5c26c67ae69381b5 Mon Sep 17 00:00:00 2001 From: bailey Date: Thu, 29 Jan 2026 16:15:50 -0700 Subject: [PATCH 47/55] comments pt 4 --- .../client-backpressure.md | 30 ++++++++++--------- 1 file changed, 16 insertions(+), 14 deletions(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index e08c1554b0..d59ff02dfb 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -29,7 +29,7 @@ connection spikes from overloading the system. A token bucket based system introduced in MongoDB 8.2 to admit a command or reject it with an overload error at the front door of a mongod/s. It aims to prevent command spikes from overloading the system. -The ingress operation rate limiter only applies to commands sent on authenticated connections. +The ingress request rate limiter only applies to commands sent on authenticated connections. #### MongoTune @@ -64,8 +64,8 @@ For example, when a request exceeds the ingress request rate limit, the followin } ``` -Note that an error is not guaranteed to contain both the SystemOverloadedError and the RetryableError labels, if it has -one of them. +Note that an error is not guaranteed to contain both the `SystemOverloadedError` and the `RetryableError` labels, if it +contains one of them. #### Goodput @@ -84,14 +84,13 @@ overload error, including those not eligible for retry under the updateMany, create collection, getMore, and generic runCommand. The new command execution method obeys the following rules: -1. `attempt` is the execution attempt number (starting with 0). Note that `attempt` includes retries for errors that do - not contain the `SystemOverloadedError` error label (this might include attempts under other retry policies, see +1. `attempt` is the execution attempt number (starting with 0). 
Note that `attempt` includes retries for errors that + are not overload errors (this might include attempts under other retry policies, see [Interactions with Other Retry Policies](./client-backpressure.md#interaction-with-other-retry-policies)). 2. If the command succeeds on the first attempt, drivers MUST deposit `RETRY_TOKEN_RETURN_RATE` tokens. - The value is 0.1 and non-configurable. 3. If the command succeeds on a retry attempt, drivers MUST deposit `RETRY_TOKEN_RETURN_RATE`+1 tokens. -4. If a retry attempt fails with an error that does not contain `SystemOverloadedError` label, drivers MUST deposit 1 - token. +4. If a retry attempt fails with an error that is not an overload error, drivers MUST deposit 1 token. - An error that does not contain the `SystemOverloadedError` error label indicates that the server is healthy enough to handle requests. For the purposes of retry budget tracking, this counts as a success. 5. A retry attempt will only be permitted if: @@ -104,8 +103,10 @@ rules: [Client Side Operations Timeout](../client-side-operations-timeout/client-side-operations-timeout.md) specification. 4. A token can be consumed from the token bucket. + 5. The command is a write and [retryWrites](../retryable-writes/retryable-writes.md#retrywrites) is enabled or the + command is a read and [retryReads](../retryable-reads/retryable-reads.md#retryreads) is enabled. 6. A retry attempt consumes 1 token from the token bucket. -7. If the request is eligible for retry (as outlined in step 4), the client MUST apply exponential backoff according to +7. If the request is eligible for retry (as outlined in step 5), the client MUST apply exponential backoff according to the following formula: `backoff = jitter * min(MAX_BACKOFF, BASE_BACKOFF * 2^(attempt - 1))` - `jitter` is a random jitter value between 0 and 1. - `BASE_BACKOFF` is constant 100ms. 
@@ -134,7 +135,7 @@ The retry policy defined above is only relevant for commands sent on authenticat - APIs which might perform multiple operations internally (such rewrapManyDataKey(), which performs a find() and a bulk update) -APIs not subject to the overload retry policy include commands executed on unauthenticated connections: +Driver processes not subject to the overload retry policy include commands executed on unauthenticated connections: - monitoring commands and round-trip time pingers - commands executed during authentication (i.e., `saslStart`) @@ -145,11 +146,12 @@ The retry policy in this specification is separate from the other retry policies [retryable reads](../retryable-reads/retryable-reads.md) and [retryable writes](../retryable-writes/retryable-writes.md) specifications. Drivers MUST ensure: -- Only errors that contain the `SystemOverloadedError` consume tokens from the token bucket before retrying. -- When a failed attempt is retried, backoff must be applied if and only if the attempt error contains the - `SystemOverloadedError` label. -- If an overload error is encountered, the maximum number of retries for any retry policy becomes MAX_RETRIES. If CSOT - is enabled and a command has already retried more than MAX_RETRIES times, it MUST NOT be retried further. +- Only overload errors consume tokens from the token bucket before retrying. +- When a failed attempt is retried, backoff must be applied if and only if the error is an overload error. +- If an overload error is encountered: + - Regardless of whether CSOT is enabled or not, the maximum number of retries for any retry policy becomes + `MAX_RETRIES`. + - If CSOT is enabled and a command has been retried at least `MAX_RETRIES` times, it MUST NOT be retried further. 
#### Pseudocode From a00a06354c3e906f4d53a56ba397fc0be82d810c Mon Sep 17 00:00:00 2001 From: bailey Date: Fri, 30 Jan 2026 15:39:19 -0700 Subject: [PATCH 48/55] comments pt 5 --- .../client-backpressure.md | 93 ++++++++++++++----- 1 file changed, 68 insertions(+), 25 deletions(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index d59ff02dfb..800004b7aa 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -10,6 +10,12 @@ ______________________________________________________________________ This specification adds the ability for drivers to automatically retry requests that fail due to server overload errors while applying backpressure to avoid further overloading the server. +The retry behaviors defined in this specification are separate from and complementary to the retry behaviors defined in +the [Retryable Reads](../retryable-reads/retryable-reads.md) and +[Retryable Writes](../retryable-writes/retryable-writes.md) specifications. This specification expands retry support to +all commands when specific server overload conditions are encountered, regardless of whether the command would normally +be retryable under those specifications. + ## META The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and @@ -64,8 +70,8 @@ For example, when a request exceeds the ingress request rate limit, the followin } ``` -Note that an error is not guaranteed to contain both the `SystemOverloadedError` and the `RetryableError` labels, if it -contains one of them. +Note that an error is not guaranteed to contain both the `SystemOverloadedError` and the `RetryableError` labels just +because it contains one of them. #### Goodput @@ -76,6 +82,24 @@ See [goodput](https://en.wikipedia.org/wiki/Goodput). 
### Requirements for Client Backpressure +#### Driver mechanisms subject to the retry policy + +Commands sent by the driver to the server are subject to the retry policy defined in this specification unless the +command is included in the exceptions below. + +Driver commands not subject to the overload retry policy: + +- [monitoring commands](../server-discovery-and-monitoring/server-monitoring.md#monitoring) and + [round-trip time pingers](../server-discovery-and-monitoring/server-monitoring.md#measuring-rtt) (see + [Why not apply the overload retry policy to monitoring and RTT connections?](./client-backpressure.md#why-not-apply-the-overload-retry-policy-to-monitoring-and-rtt-connections)) +- commands executed during [authentication](../auth/auth.md) (see + [Why not apply the overload policy to authentication commands or reauthentication commands?](./client-backpressure.md#why-not-apply-the-overload-policy-to-authentication-commands-or-reauthentication-commands)) + +Note: Drivers communicate with [mongocryptd](../client-side-encryption/client-side-encryption.md#mongocryptd) using the +driver's `runCommand()` API. Consequently, drivers will implicitly apply the retry policy to communication with +mongocryptd, although in practice the retry policy would never be used because mongocryptd connections are not +authenticated. + #### Overload retry policy This specification expands the driver's retry ability to all commands if the error indicates that it is a retryable @@ -112,9 +136,10 @@ rules: - `BASE_BACKOFF` is constant 100ms. - `MAX_BACKOFF` is 10000ms. - This results in delays of 100ms, 200ms, 400ms, 800ms, and 1600ms before accounting for jitter. -8. 
If the request is eligible for retry (as outlined in step 5), the client MUST add the previously used server's + address to the list of deprioritized server addresses for + [server selection](../server-selection/server-selection.md). +9. If the request is eligible for retry (as outlined in step 5) and is a retryable write: 1. If the command is a part of a transaction, the instructions for command modification on retry for commands in transactions MUST be followed, as outlined in the [transactions](../transactions/transactions.md#interaction-with-retryable-writes) specification. @@ -126,20 +151,6 @@ rules: [retryable writes](../retryable-writes/retryable-writes.md) and the [transactions](../transactions/transactions.md) specifications. -##### Relevant driver processes - -The retry policy defined above is only relevant for commands sent on authenticated connections, which - -- any user-facing API which wraps a server command (i.e., a CRUD command or runCommand) -- cursors and change streams (including getMores and killCursors) -- APIs which might perform multiple operations internally (such rewrapManyDataKey(), which performs a find() and a bulk - update) - -Driver processes not subject to the overload retry policy include commands executed on unauthenticated connections: - -- monitoring commands and round-trip time pingers -- commands executed during authentication (i.e., `saslStart`) - #### Interaction with Other Retry Policies The retry policy in this specification is separate from the other retry policies defined in the @@ -157,8 +168,8 @@ specifications. Drivers MUST ensure: The following pseudocode demonstrates the unified retry policy, combining the overload retry policy defined in this specification with the retry policies from [Retryable Reads](../retryable-reads/retryable-reads.md) and -[Retryable Writes](../retryable-writes/retryable-writes.md). For brevity, some error handling details such as the -handling of "NoWritesPerformed" are omitted. 
+[Retryable Writes](../retryable-writes/retryable-writes.md). For brevity, some interactions with other specs are not +included, such as error handling with `NoWritesPerformed` labels. ```python # Note: the values below have been scaled down by a factor of 1000 because @@ -230,6 +241,10 @@ the token bucket will limit retry attempts during a prolonged overload. The token bucket capacity is set to 1000 for consistency with the server. +Each MongoClient instance MUST have its own token bucket. The token bucket MUST be created when the MongoClient is +initialized and exist for the lifetime of the MongoClient. Drivers MUST ensure the token bucket implementation is +thread-safe as it may be accessed concurrently by multiple operations. + #### Pseudocode The token bucket is implemented via a thread safe counter. For languages without atomics, this can be implemented via a @@ -263,9 +278,10 @@ class TokenBucket: #### Handshake changes -Drivers conforming to this spec MUST add `“backpressure”: True` to the connection handshake. This flag allows the server -to identify clients which do and do not support backpressure. Currently, this flag is unused but in the future the -server may offer different rate limiting behavior for clients that do not support backpressure. +Drivers conforming to this spec MUST add `"backpressure": True` to the +[connection handshake](../mongodb-handshake/handshake.rst). This flag allows the server to identify clients which do and +do not support backpressure. Currently, this flag is unused but in the future the server may offer different rate +limiting behavior for clients that do not support backpressure. #### Implementation notes @@ -299,7 +315,7 @@ since a server is reselected for a retry attempt. ### Backwards Compatibility The server's rate limiting can introduce higher error rates than previously would have been exposed to users under -periods of extreme server overload. 
The increased error rates is a tradeoff: given the choice between an overloaded +periods of extreme server overload. The increased error rate is a tradeoff: given the choice between an overloaded server (potential crash), or at minimum dramatically slower query execution time and a stable but lowered throughput with higher error rate as the server load sheds, we have chosen the latter. @@ -368,6 +384,33 @@ specifications with load-shedding behavior: times. The approach chosen allows for additional retries in scenarios where a non-overload error fails on a retry with an overload error. +### Why not apply the overload retry policy to monitoring and RTT connections? + +The ingress request rate limiter only applies to authenticated connections. Neither the +[monitoring connection](../server-discovery-and-monitoring/server-monitoring.md#monitoring) nor the +[RTT pinger](../server-discovery-and-monitoring/server-monitoring.md#measuring-rtt) use authentication, and consequently +will not encounter ingress operation rate limiter errors. + +It is conceivable that a driver attempting to establish a monitoring connection or RTT connection could encounter the +ingress connection rate limiter. However, in these scenarios, the driver already behaves in an appropriate manner. + +If an error is encountered, both the RTT connections and monitoring connections already retry. + +- The RTT pinger retries indefinitely until the monitor is reset. +- Monitoring failures will mark the server unknown, which will reset the monitor, triggering another monitoring request. + +Under most circumstances, both monitoring and RTT connections wait at least `minHeartbeatFrequencyMS` between `hello` +commands, ensuring delays between retries. The notable exception is monitoring connections retrying network errors +without waiting for `minHeartbeatFrequencyMS`, which is acceptable since re-establishing monitoring is the driver's top +priority when a monitoring connection disconnects. 
+ +### Why not apply the overload policy to authentication commands or reauthentication commands? + +The ingress request rate limiter only applies to authenticated connections. The server does not consider a connection to +be authenticated until after the authentication workflow has completed and during reauthentication a connection is not +considered authenticated by the server. So, authentication and reauthentication commands will not hit the ingress +operation rate limiter. + ## Changelog - 2026-01-09: Initial version. From 4fe6b74e71d12a061b59f0cfa2da7f220fb167a4 Mon Sep 17 00:00:00 2001 From: bailey Date: Fri, 30 Jan 2026 16:20:16 -0700 Subject: [PATCH 49/55] runCommand handling --- .../client-backpressure.md | 33 +++++++++++++++++++ 1 file changed, 33 insertions(+) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index 800004b7aa..44c29e3e8b 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -129,6 +129,9 @@ rules: 4. A token can be consumed from the token bucket. 5. The command is a write and [retryWrites](../retryable-writes/retryable-writes.md#retrywrites) is enabled or the command is a read and [retryReads](../retryable-reads/retryable-reads.md#retryreads) is enabled. + - To retry `runCommand`, both [retryWrites](../retryable-writes/retryable-writes.md#retrywrites) and + [retryReads](../retryable-reads/retryable-reads.md#retryreads) must be enabled. See + [Why must both `retryWrites` and `retryReads` be enabled to retry runCommand?](client-backpressure.md#why-must-both-retrywrites-and-retryreads-be-enabled-to-retry-runcommand) 6. A retry attempt consumes 1 token from the token bucket. 7. 
If the request is eligible for retry (as outlined in step 5), the client MUST apply exponential backoff according to the following formula: `backoff = jitter * min(MAX_BACKOFF, BASE_BACKOFF * 2^(attempt - 1))` @@ -150,6 +153,8 @@ rules: described in the [retryable reads](../retryable-reads/retryable-reads.md), [retryable writes](../retryable-writes/retryable-writes.md) and the [transactions](../transactions/transactions.md) specifications. + - For the purposes of error propagation, `runCommand` is considered a write. See + [Why is runCommand considered a write for error propagation?](#why-is-runcommand-considered-a-write-for-error-propagation) #### Interaction with Other Retry Policies @@ -411,6 +416,34 @@ be authenticated until after the authentication workflow has completed and durin considered authenticated by the server. So, authentication and reauthentication commands will not hit the ingress operation rate limiter. +### Why must both `retryWrites` and `retryReads` be enabled to retry runCommand? + +[`runCommand`](../run-command/run-command.md) is not retryable under the +[retryable reads](../retryable-reads/retryable-reads.md) and [retryable writes](../retryable-writes/retryable-writes.md) +specifications and consequently it was not historically classified as a read or write command. + +The most flexible approach would be to inspect the user's command and determine if it is a read or a write. However, +this is problematic for two reasons: + +- The runCommand specification specifically forbids drivers from inspecting the user's command. +- `runCommand` is commonly used to execute commands of which the driver has no knowledge and therefore cannot determine + whether it is a read or write. + +Another option is to always consider `runCommand` retryable under the overload retry policy, regardless of the setting +of [`retryReads`](../retryable-reads/retryable-reads.md#retryreads) and +[`retryWrites`](../retryable-writes/retryable-writes.md#retrywrites). 
However, this behavior goes against a user's +expectations: if a user disables both options, they would expect no commands to be retried. + +Retrying `runCommand` only when both `retryReads` and `retryWrites` are enabled is a safe default that does not have the +pitfalls of either approach outlined above: + +- This approach does not require drivers to inspect a user's command document. +- This approach will not retry commands if a user has disabled both `retryReads` and `retryWrites`. + +Additionally, both `retryReads` and `retryWrites` are enabled by default, so for most users `runCommand` will be +retried. This approach also prevents accidentally retrying a read command when only `retryWrites` is enabled, or +retrying a write command when only `retryReads` is enabled. + ## Changelog - 2026-01-09: Initial version. From fe4ff2558432c97ea4c096d3389b409005a11126 Mon Sep 17 00:00:00 2001 From: bailey Date: Mon, 2 Feb 2026 15:16:59 -0700 Subject: [PATCH 50/55] fmt --- .../client-backpressure.md | 12 ++++---- source/transactions/transactions.md | 28 +++++++++++++++---- 2 files changed, 30 insertions(+), 10 deletions(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index 44c29e3e8b..56462febce 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -188,7 +188,7 @@ MAX_RETRIES = 5 def execute_command_retryable(command, ...): deprioritized_servers = [] attempt = 0 - attempts = if is_csot then math.inf else 1 + allowed_retries = if is_csot then math.inf else 1 while True: try: @@ -202,7 +202,9 @@ def execute_command_retryable(command, ...): token_bucket.deposit(tokens) return res except PyMongoError as exc: - is_retryable = is_retryable_write() or is_retryable_read() or (exc.contains_error_label("RetryableError") and exc.contains_error_label("SystemOverloadedError")) + is_retryable = (is_retryable_write(command, exc) + or 
is_retryable_read(command, exc) + or (exc.contains_error_label("RetryableError") and exc.contains_error_label("SystemOverloadedError"))) is_overload = exc.contains_error_label("SystemOverloadedError") # if a retry fails with an error which is not an overload error, deposit 1 token @@ -215,9 +217,9 @@ def execute_command_retryable(command, ...): attempt += 1 if is_overload: - attempts = MAX_RETRIES + allowed_retries = MAX_RETRIES - if attempt > attempts: + if attempt > allowed_retries: raise deprioritized_servers.append(server.address) @@ -365,7 +367,7 @@ The Node and Python drivers will provide the reference implementations. See The client backpressure retry loop is primarily concerned with spreading out retries to avoid retry storms. The exact sleep duration is not critical to the intended behavior, so long as we sleep at least as long as we say we will. -### Why override existing maximum number of retry attempt defaults for retryable reads and writes if a `SystemOverloadedError` is received? +### Why override existing maximum number of retry attempt defaults for retryable reads and writes if an overload error is received? Load-shedded errors indicate that the request was rejected by the server to minimize load, not that the command failed for logical reasons. So, when determining the number of retries an operation should attempt: diff --git a/source/transactions/transactions.md b/source/transactions/transactions.md index 51d0b46b8a..5b4fb23e3c 100644 --- a/source/transactions/transactions.md +++ b/source/transactions/transactions.md @@ -350,6 +350,12 @@ retryWrites is set on the MongoClient or not. If a commitTransaction fails with [retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error), the command MUST be retried as specified in [Interaction with Client Backpressure](#interaction-with-client-backpressure). 
+If the previous commitTransaction failed with a +[retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error), drivers MUST apply +the write concern modification rules as outlined in the +[Interaction With Client Backpressure](#interaction-with-client-backpressure) to determine whether or not to modify the +command's writeConcern. + When commitTransaction is retried, either by the driver's internal retry logic or explicitly by the user calling commitTransaction again, drivers MUST apply `w: majority` to the write concern of the commitTransaction command. If the transaction is using a [writeConcern](#writeconcern) that is not the server default (i.e. specified via @@ -564,8 +570,7 @@ enabled on the MongoClient, unless the server response is a In addition, drivers MUST NOT add the RetryableWriteError label to any error that occurs during a write command within a transaction (excepting commitTransation and abortTransaction), even when retryWrites has been enabled on the -MongoClient, unless the server response is a -[retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error). +MongoClient. Drivers MUST retry the commitTransaction and abortTransaction commands even when retryWrites has been disabled on the MongoClient. commitTransaction and abortTransaction are retryable write commands and MUST be retried according to the @@ -581,9 +586,22 @@ all preceding commands in the transaction. All commands in a transaction are subject to the [Client Backpressure Specification](../client-backpressure/client-backpressure.md), and MUST be retried accordingly. -This includes the initial command with `startTransaction:true`, the `abortTransaction` and `commitTransaction` commands, +This includes the initial command with `startTransaction:true`, the `abortTransaction` and commitTransaction commands, as well as any read or write commands attempted during the transaction. 
+When a commitTransaction attempt fails with a retryable backpressure error: + +- If a commitTransaction attempt has already failed with an error that is not a + [retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error), drivers MUST + follow the instructions for modifying writeConcern as outlined in the [commitTransaction](#committransaction) + section for the next retry attempt. +- Otherwise, drivers MUST retry using the same write concern as was used for the failed commitTransaction attempt. + +A [retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error) means the server +did not perform any work, and consequently the +[need for majority write concern on retries](#majority-write-concern-is-used-when-retrying-committransaction) is not +relevant until a commitTransaction attempt fails with a retryable error that is not a retryable overload error. + If executing the first command within a transaction fails with a [retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error), and another attempt is executed, the command executed in the retry attempt must be treated as the first command within a transaction. @@ -1060,8 +1078,8 @@ transaction. ### Majority write concern is used when retrying commitTransaction -When retrying commitTransaction, drivers SHOULD use a majority write concern to ensure the transaction is not applied -twice. However, drivers SHOULD NOT use a majority write concern when retrying after a +When retrying commitTransaction, drivers use a majority write concern to ensure the transaction is not applied twice. +However, drivers do use a majority write concern when retrying after a [retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error), unless a prior commitTransaction attempt failed with another error type. 
Since a [retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error) means the server did From 3e582049d9a2ae5983ac0f3cf3761cd11df9b1c6 Mon Sep 17 00:00:00 2001 From: bailey Date: Tue, 3 Feb 2026 11:20:44 -0700 Subject: [PATCH 51/55] sync review comments --- source/client-backpressure/client-backpressure.md | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index 56462febce..9705756ef9 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -89,15 +89,17 @@ command is included in the exceptions below. Driver commands not subject to the overload retry policy: -- [monitoring commands](../server-discovery-and-monitoring/server-monitoring.md#monitoring) and +- [Monitoring commands](../server-discovery-and-monitoring/server-monitoring.md#monitoring) and [round-trip time pingers](../server-discovery-and-monitoring/server-monitoring.md#measuring-rtt) (see - [Why not apply the overload retry policy to monitoring and RTT connections?](./client-backpressure.md#why-not-apply-the-overload-retry-policy-to-monitoring-and-rtt-connections)) -- commands executed during [authentication](../auth/auth.md) (see - [Why not apply the overload policy to authentication commands or reauthentication commands?](./client-backpressure.md#why-not-apply-the-overload-policy-to-authentication-commands-or-reauthentication-commands)) + [Why not apply the overload retry policy to monitoring and RTT connections?](./client-backpressure.md#why-not-apply-the-overload-retry-policy-to-monitoring-and-rtt-connections)). 
+- Commands executed during + [connection establishment](../connection-monitoring-and-pooling/connection-monitoring-and-pooling.md#establishing-a-connection-internal-implementation) + and [reauthentication](../auth/auth.md) (see + [Why not apply the overload policy to authentication commands or reauthentication commands?](./client-backpressure.md#why-not-apply-the-overload-policy-to-authentication-commands-or-reauthentication-commands)). Note: Drivers communicate with [mongocryptd](../client-side-encryption/client-side-encryption.md#mongocryptd) using the driver's `runCommand()` API. Consequently, drivers will implicitly apply the retry policy to communication with -mongocryptd, although practice the retry policy would never be unused because mongocryptd connections are not +mongocryptd, although in practice the retry policy would never be used because mongocryptd connections are not authenticated. #### Overload retry policy From b2b5a4db6d6e7a83def1c81a2842994d3a3b907e Mon Sep 17 00:00:00 2001 From: bailey Date: Fri, 6 Feb 2026 10:15:36 -0700 Subject: [PATCH 52/55] Transaction spec comments --- source/transactions/transactions.md | 51 ++++++++++++++++------------- 1 file changed, 28 insertions(+), 23 deletions(-) diff --git a/source/transactions/transactions.md b/source/transactions/transactions.md index 5b4fb23e3c..58451db5b3 100644 --- a/source/transactions/transactions.md +++ b/source/transactions/transactions.md @@ -350,11 +350,18 @@ retryWrites is set on the MongoClient or not. If a commitTransaction fails with [retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error), the command MUST be retried as specified in [Interaction with Client Backpressure](#interaction-with-client-backpressure). 
-If the previous commitTransaction failed with a +When retrying a commitTransaction attempt, if the previous commitTransaction attempt failed with a [retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error), drivers MUST apply the write concern modification rules as outlined in the [Interaction With Client Backpressure](#interaction-with-client-backpressure) to determine whether or not to modify the -command's writeConcern. +command's writeConcern. If the command failed with any other error, drivers MUST follow the rules for write concern +modification as outlined in [writeConcern for commitTransaction attempts](#writeconcern-for-committransaction-attempts). + +Drivers MUST add error labels to certain errors when commitTransaction fails. See the +[Error reporting changes](#error-reporting-changes) and [Error Labels](#error-labels) sections for a precise +description. + +##### writeConcern for commitTransaction attempts When commitTransaction is retried, either by the driver's internal retry logic or explicitly by the user calling commitTransaction again, drivers MUST apply `w: majority` to the write concern of the commitTransaction command. If the @@ -365,10 +372,6 @@ TransactionOptions during the `startTransaction` call or otherwise inherited), a until a socket timeout) if the majority write concern cannot be satisfied. See [Majority write concern is used when retrying commitTransaction](#majority-write-concern-is-used-when-retrying-committransaction). -Drivers MUST add error labels to certain errors when commitTransaction fails. See the -[Error reporting changes](#error-reporting-changes) and [Error Labels](#error-labels) sections for a precise -description. - #### abortTransaction This method aborts the currently active transaction on this session. 
Drivers MUST run an abortTransaction command with @@ -397,9 +400,9 @@ network error, regardless of whether retryWrites is set on the MongoClient or no retried as specified in [Interaction with Client Backpressure](#interaction-with-client-backpressure). If the operation times out or fails with a non-retryable error, drivers MUST NOT propagate errors from the -`abortTransaction` abortTransaction command. Errors from abortTransaction are meaningless to the application because -they cannot do anything to recover from the error. The transaction will ultimately be aborted by the server anyway -either upon reaching an age limit or when the application starts a new transaction on this session, see +abortTransaction command. Errors from abortTransaction are meaningless to the application because they cannot do +anything to recover from the error. The transaction will ultimately be aborted by the server anyway either upon reaching +an age limit or when the application starts a new transaction on this session, see [Drivers ignore all abortTransaction errors](#drivers-ignore-all-aborttransaction-errors). #### endSession changes @@ -586,21 +589,22 @@ all preceding commands in the transaction. All commands in a transaction are subject to the [Client Backpressure Specification](../client-backpressure/client-backpressure.md), and MUST be retried accordingly. -This includes the initial command with `startTransaction:true`, the `abortTransaction` and commitTransaction commands, -as well as any read or write commands attempted during the transaction. +This includes the initial command with `startTransaction:true`, the abortTransaction and commitTransaction commands, as +well as any read or write commands attempted during the transaction. 
When a commitTransaction attempt fails with a retryable backpressure error: - If a commitTransaction attempt has already failed with an error that is not a [retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error), drivers MUST - follow the instructions for modifying writeConcern as outlined in the [commitTransaction](#committransaction) - section for the next retry attempt. -- Otherwise, drivers MUST retry using the same write concern as was used for the failed commitTransaction attempt. + follow the instructions for modifying writeConcern as outlined in the + [writeConcern for commitTransaction attempts](#writeconcern-for-committransaction-attempts) section for the next + retry attempt. +- Otherwise, drivers MUST retry using the same write concern as was used for the most recently failed commitTransaction + attempt. -A [retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error) means the server -did not perform any work, and consequently the -[need for majority write concern on retries](#majority-write-concern-is-used-when-retrying-committransaction) is not -relevant until a commitTransaction attempt fails with a retryable error that is not a retryable overload error. +See +[Majority write concern is used when retrying commitTransaction](#majority-write-concern-is-used-when-retrying-committransaction) +for discussion on why majority write concern is sometimes needed on commitTransaction retries. If executing the first command within a transaction fails with a [retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error), and another attempt @@ -1079,11 +1083,6 @@ transaction. ### Majority write concern is used when retrying commitTransaction When retrying commitTransaction, drivers use a majority write concern to ensure the transaction is not applied twice. 
-However, drivers do use a majority write concern when retrying after a -[retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error), unless a prior -commitTransaction attempt failed with another error type. Since a -[retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error) means the server did -not perform any work, the need for majority write concern on retries is not relevant. Consider the following scenario: @@ -1126,6 +1125,12 @@ custom write concerns. Excluding the edge case where has been disabled, drivers can readily trust that a majority write concern is durable, which achieves the primary objective of avoiding duplicate commits. +A [retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error) indicates that the +server performed no work when executing the command. As such, the scenario above is irrelevant for commitTransaction +attempts which have only failed with retryable backpressure errors. However, after a commitTransaction fails with an +error that is not a retryable backpressure error, we no longer have the guarantee that the server has performed no work, +and we must apply a majority write concern to prevent the transaction from being applied twice. + ## **Changelog** - 2026-01-09: Specify the handling of client backpressure. 
From 39d752ca2cd4c5d0e661d802851b4e8f0ea243b6 Mon Sep 17 00:00:00 2001 From: bailey Date: Mon, 9 Feb 2026 11:36:41 -0700 Subject: [PATCH 53/55] Valentin's comment --- source/client-backpressure/client-backpressure.md | 5 ++--- source/transactions/transactions.md | 8 ++++---- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index 9705756ef9..72b7428227 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -106,7 +106,7 @@ authenticated. This specification expands the driver's retry ability to all commands if the error indicates that it is a retryable overload error, including those not eligible for retry under the -[read](../retryable-writes/retryable-reads.md)/[write](../retryable-writes/retryable-writes.md) retry policies such as +[read](../retryable-reads/retryable-reads.md)/[write](../retryable-writes/retryable-writes.md) retry policies such as updateMany, create collection, getMore, and generic runCommand. The new command execution method obeys the following rules: @@ -155,8 +155,7 @@ rules: described in the [retryable reads](../retryable-reads/retryable-reads.md), [retryable writes](../retryable-writes/retryable-writes.md) and the [transactions](../transactions/transactions.md) specifications. - - For the purposes of error propagation, `runCommand` is considered a write. See - [Why is runCommand considered a write for error propagation?](#why-is-runcommand-considered-a-write-for-error-propagation) + - For the purposes of error propagation, `runCommand` is considered a write. 
#### Interaction with Other Retry Policies diff --git a/source/transactions/transactions.md b/source/transactions/transactions.md index 58451db5b3..a00a68e64b 100644 --- a/source/transactions/transactions.md +++ b/source/transactions/transactions.md @@ -592,7 +592,7 @@ All commands in a transaction are subject to the This includes the initial command with `startTransaction:true`, the abortTransaction and commitTransaction commands, as well as any read or write commands attempted during the transaction. -When a commitTransaction attempt fails with a retryable backpressure error: +When a commitTransaction attempt fails with a retryable overload error: - If a commitTransaction attempt has already failed with an error that is not a [retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error), drivers MUST @@ -1127,9 +1127,9 @@ objective of avoiding duplicate commits. A [retryable overload error](../client-backpressure/client-backpressure.md#retryable-overload-error) indicates that the server performed no work when executing the command. As such, the scenario above is irrelevant for commitTransaction -attempts which have only failed with retryable backpressure errors. However, after a commitTransaction fails with an -error that is not a retryable backpressure error, we no longer have the guarantee that the server has performed no work, -and we must apply a majority write concern to prevent the transaction from being applied twice. +attempts which have only failed with retryable overload errors. However, after a commitTransaction fails with an error +that is not a retryable overload error, we no longer have the guarantee that the server has performed no work, and we +must apply a majority write concern to prevent the transaction from being applied twice. 
## **Changelog** From 7f18eceb46828cd0972795060e1b1d664ceb5dc2 Mon Sep 17 00:00:00 2001 From: Noah Stapp Date: Wed, 11 Feb 2026 12:35:09 -0500 Subject: [PATCH 54/55] Update changelogs for retryable reads and writes --- source/retryable-reads/retryable-reads.md | 3 +++ source/retryable-writes/retryable-writes.md | 3 +++ 2 files changed, 6 insertions(+) diff --git a/source/retryable-reads/retryable-reads.md b/source/retryable-reads/retryable-reads.md index 078f2dc4ca..16f80723c7 100644 --- a/source/retryable-reads/retryable-reads.md +++ b/source/retryable-reads/retryable-reads.md @@ -560,6 +560,9 @@ any customers experiencing degraded performance can simply disable `retryableRea ## Changelog +- 2026-02-11: Clarified that the retry logic and pseudocode does not include the modifications required by client + backpressure. + - 2026-12-08: Clarified that server deprioritization during retries must use a list of server addresses. - 2024-04-30: Migrated from reStructuredText to Markdown. diff --git a/source/retryable-writes/retryable-writes.md b/source/retryable-writes/retryable-writes.md index 1f1b806d89..0727185fb8 100644 --- a/source/retryable-writes/retryable-writes.md +++ b/source/retryable-writes/retryable-writes.md @@ -693,6 +693,9 @@ retryWrites is not true would be inconsistent with the server and potentially co ## Changelog +- 2026-02-11: Clarified that the retry logic and pseudocode does not include the modifications required by client + backpressure. + - 2026-12-08: Clarified that server deprioritization during retries must use a list of server addresses. - 2024-05-08: Add guidance for client-level `bulkWrite()` retryability. 
From eaa6ac1f1b5b720ccabc5e2165cb03ceac24b279 Mon Sep 17 00:00:00 2001 From: Noah Stapp Date: Fri, 13 Feb 2026 10:20:13 -0500 Subject: [PATCH 55/55] Add prose test for bucket capacity + starting value --- source/client-backpressure/client-backpressure.md | 2 +- source/client-backpressure/tests/README.md | 11 +++++++++++ 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/source/client-backpressure/client-backpressure.md b/source/client-backpressure/client-backpressure.md index 72b7428227..388815724a 100644 --- a/source/client-backpressure/client-backpressure.md +++ b/source/client-backpressure/client-backpressure.md @@ -247,7 +247,7 @@ overload error retry attempts. Although the server rejects excess commands as qu and creates extra contention on the connection pool which can eventually negatively affect goodput. To reduce this risk, the token bucket will limit retry attempts during a prolonged overload. -The token bucket capacity is set to 1000 for consistency with the server. +The token bucket starts at its maximum capacity of 1000 for consistency with the server. Each MongoClient instance MUST have its own token bucket. The token bucket MUST be created when the MongoClient is initialized and exist for the lifetime of the MongoClient. Drivers MUST ensure the token bucket implementation is diff --git a/source/client-backpressure/tests/README.md b/source/client-backpressure/tests/README.md index 5e9b611a99..22efe29b20 100644 --- a/source/client-backpressure/tests/README.md +++ b/source/client-backpressure/tests/README.md @@ -57,3 +57,14 @@ Drivers should test that retries do not occur immediately when a SystemOverloade The sum of 5 backoffs is 3.1 seconds. There is a 1-second window to account for potential variance between the two runs. + +#### Test 2: Token Bucket Capacity is Enforced + +Drivers should test that retry token buckets are created at their maximum capacity and that that capacity is enforced. + +1. Let `client` be a `MongoClient`. +2. 
Assert that the client's retry token bucket is at full capacity and that the capacity is + `DEFAULT_RETRY_TOKEN_CAPACITY`. +3. Using `client`, execute a successful `ping` command. +4. Assert that the successful command did not increase the number of tokens in the bucket above + `DEFAULT_RETRY_TOKEN_CAPACITY`.