Skip to content

Conversation

@peterxcli
Copy link
Member

@peterxcli peterxcli commented Nov 20, 2025

What changes were proposed in this pull request?

Extends the expectedDataGeneration logic in Ozone Manager to support atomic "create-if-not-exists" semantics.

  • Introduces a new constant EXPECTED_GEN_CREATE_IF_NOT_EXISTS (value: -1) to represent "If-None-Match: *" semantics.
  • When expectedDataGeneration is set to -1, the validateAtomicRewrite logic (in both Create and Commit phases) strictly enforces that the target key must not exist, throwing KEY_ALREADY_EXISTS if it does.
  • This establishes the core OM support required for conditional "If-None-Match" requests, allowing upper layers (like S3 Gateway) to implement these features with minimal changes to the underlying protocol.

How S3 PUT with If-None-Match Header Leverages This

  1. S3 Gateway Layer
    1. Parse If-None-Match: * header.
    2. Set expectedDataGeneration = EXPECTED_GEN_CREATE_IF_NOT_EXISTS (-1).
    3. Pass to RpcClient.rewriteKey().
  2. OM Create Phase
    1. Validate expectedDataGeneration == -1.
    2. If key exists → throw KEY_ALREADY_EXISTS.
    3. Store -1 in open key metadata.
  3. OM Commit Phase
    1. Check expectedDataGeneration == -1 from open key.
    2. If key now exists (race condition) → throw KEY_ALREADY_EXISTS.
    3. Otherwise, commit key successfully.

Race Condition Handling

Using -1 ensures atomicity. If a concurrent write (Client B) commits between Client A's Create and Commit, Client A's commit fails the -1 validation check (key now exists), preserving strict create-if-not-exists semantics.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-13963

How was this patch tested?

Added comprehensive unit and integration tests:

  1. Integration Tests (OzoneRpcClientTests):

    • rewriteFailsWhenKeyExists: Validates three scenarios:
      • Rewrite with -1 fails when key is already committed
      • Rewrite with -1 succeeds when key is open but not yet committed
      • After successful commit, subsequent rewrite with -1 fails
  2. OM Request Tests (TestOMKeyCreateRequest):

    • testCreateKeyExpectedGenCreateIfNotExistsKeyMissing: Create succeeds when key doesn't exist
    • testCreateKeyExpectedGenCreateIfNotExistsKeyAlreadyExists: Create fails with KEY_ALREADY_EXISTS when key exists
    • testCreateKeyExpectedGenMismatchReturnsKeyGenerationMismatch: Create fails with KEY_NOT_FOUND when generation mismatches
  3. OM Commit Tests (TestOMKeyCommitRequest):

    • testAtomicCreateIfNotExistsCommitKeyAbsent: Commit succeeds when key doesn't exist
    • testAtomicCreateIfNotExistsCommitKeyAlreadyExists: Commit fails with KEY_ALREADY_EXISTS when key exists

All tests validate both the success paths and error conditions, ensuring atomicity guarantees are preserved.

@peterxcli
Copy link
Member Author

@ivandika3 Could you take a look at whether this API change makes sense? Since rewrite is intended for existing keys, allowing a semantic like “don’t create if existed” can be confusing. On the other hand, adding another flag to the create-key request in the proto feels redundant. Do you have a better suggestion?

@ivandika3
Copy link
Contributor

Thanks @peterxcli for the patch, I'll take a look when I have time.

Involving @sodonnel since he's the original creator of Atomic rewrite.

@ivandika3 ivandika3 requested a review from sodonnel November 20, 2025 04:53
@ivandika3
Copy link
Contributor

ivandika3 commented Nov 20, 2025

Could you take a look at whether this API change makes sense? Since rewrite is intended for existing keys, allowing a semantic like “don’t create if existed” can be confusing. On the other hand, adding another flag to the create-key request in the proto feels redundant. Do you have a better suggestion?

I think this depends whether atomic rewrite can be cleanly reused for S3 conditional requests. However after another look, I think adding new KeyArgs optional attributes (e.g. allowOverwrite) might be better since atomic rewrite use case depends on the update ID only while S3 conditional requests will need to check ETag (which might need another KeyArgs attributes).

FYI, The concept of "generation" was loosely taken from GCP (https://docs.cloud.google.com/storage/docs/request-preconditions) which supports both request preconditions based on generation (GCP specific) and based on ETag (S3 compatible.

@peterxcli
Copy link
Member Author

whether atomic rewrite can be cleanly reused for S3 conditional requests.

@ivandika3 I opened #9334 and included a design doc.

Main idea:

  • For If non-match header: issue a rewrite-key request with expectDataGeneration = -1 to provide “CREATE IF NOT EXISTS” semantics.
  • For If match header: fetch key info from OM, validate the ETag at S3G, then set expectDataGeneration to the fetched version. This lets S3G perform optimized concurrency control by leveraging OM’s expectDataGeneration support during the rewrite.

@ivandika3
Copy link
Contributor

Thanks @peterxcli, left some comments in #9334, let's discuss the design there.

@github-actions github-actions bot added the stale label Dec 25, 2025
@peterxcli peterxcli removed the stale label Dec 25, 2025
…ion handling

- Implemented tests for `createKey` and `rewriteKey` methods to validate behavior when using the `EXPECTED_GEN_CREATE_IF_NOT_EXISTS` constant.
- Added scenarios for key creation when the key is absent and when it already exists.
- Enhanced the `rewriteFailsWhenKeyExists` test to cover cases for both committed and uncommitted keys.
- Updated error handling to ensure correct responses for key existence checks.
@apache apache deleted a comment from github-actions bot Jan 2, 2026
@peterxcli peterxcli marked this pull request as ready for review January 2, 2026 09:16
@peterxcli
Copy link
Member Author

peterxcli commented Jan 2, 2026

@ivandika3 @sodonnel would you like to review this? this a part of the S3 conditional PUT request design #9334.

cc @adoroszlai

Copy link
Contributor

@chungen0126 chungen0126 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @peterxcli for working on this. Could you please revert the unnecessary whitespace changes. They are creating a lot of noise in the diff.

I'm concerned that this change might pose a risk of a breaking change. Here is the client-side validation now:

  • CLI: The Command Line Interface performs a pre-check on existingKeyGeneration (ensuring it is > 0).
  • Java API: The Ozone Java API does not perform this pre-check.

This change introduces a potential regression. While the CLI pre-checks existingKeyGeneration, the Ozone Java API does not. Therefore, modifying the server-side handling of -1 breaks the existing behavior for Java API clients

I also have a question regarding this: Does the If-None-Match request apply to non-S3 interfaces as well?

  • If Yes: We need to remove the pre-checks and clarify the usage in the javadocs and the CLI description for put key.
  • If No (S3 only): We should probably add a pre-check to the public Ozone Java API to block invalid values and expose a separate, internal interface specifically for S3 use.

@peterxcli
Copy link
Member Author

If Yes: We need to remove the pre-checks and clarify the usage in the javadocs and the CLI description for put key.

MS. will add.

Copy link
Contributor

@ivandika3 ivandika3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterxcli Thanks for the patch. Overall LGTM.

Do you think we need a new OzoneManagerVersion for this? If we want to be more conservative, we can fail "If-None-Match" with unsupported exception if OM has not supported it.

if (existing == null) {
throw new OMException("Atomic rewrite is not allowed for a new key", KEY_NOT_FOUND);
}
if (!toCommit.getExpectedDataGeneration().equals(existing.getUpdateID())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Can reuse expectedGen, also applied to the following exception. expectedGen can be pushed before the if condition.

Comment on lines +623 to +626
if (existing != null) {
throw new OMException("Key already exists",
OMException.ResultCodes.KEY_ALREADY_EXISTS);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you help check whether it's better to use a new ResultCodes specific to return S3 412 Precondition Failed? Although KEY_ALREADY_EXISTS does not seem to be handled by S3G currently, it might be safer to add a new result code just in case KEY_ALREADY_EXISTS will be used for other S3 operations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants