Skip to content

Conversation

@Mahalaxmibejugam
Copy link
Contributor

@Mahalaxmibejugam Mahalaxmibejugam commented Dec 29, 2025

HNS-Aware Behavior: For buckets with Hierarchical Namespace (HNS) enabled, mkdir now uses a create_folder API that creates directories similar to any filesystem

Recursive Creation: The method supports create_parents=True, allowing for the recursive creation of nested directories in a single API call when using the HNS-aware path.

HNS Bucket Creation: It is now possible to create a new HNS-enabled bucket directly by calling mkdir with the create_hns_bucket=True flag. This passes the necessary HNS configuration when creating the bucket.

Fallback Mechanism: If a bucket is not HNS-enabled, or if the operation is to create a bucket itself, the method gracefully falls back to the original mkdir implementation from the parent class. It also falls back if it fails to determine the bucket type, for instance, if the bucket doesn't exist.

Backward Compatibility for Bucket Creation: For backward compatibility, buckets can still be created by providing a path that includes a directory (e.g., bucket-name/some-dir) and setting create_parents=True. The system will fall back to the parent mkdir logic, which creates the bucket if it does not exist.

Cache Management: Upon successfully creating a new directory, the parent directory's listing in the dircache is updated to include the new directory entry. This avoids clearing the entire parent cache and ensures the cache remains consistent.

Testing: The test suite includes comprehensive checks for both HNS and non-HNS buckets, cache invalidation logic, and various failure scenarios.

Error Handling: Error handling is kept similar to raise a standard FileNotFoundError if a parent directory doesn't exist when create_parents=False.

@ankitaluthra1
Copy link
Collaborator

/gcbrun

# Note: Emulators may not fully support HNS features like real GCS.
# TODO: Update to create HNS bucket once mkdir supports creating HNS buckets.
gcs.mkdir(TEST_HNS_BUCKET)
gcs.mkdir(TEST_HNS_BUCKET, create_hns_bucket=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we also use same parameter name as bucket i.e instead of create_hns_bucket, use hierarchical_namespace

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Intent of create_hns_bucket is only to create the bucket with HNS enabled. For creating folders inside HNS enabled bucket this parameter need not to be specified. Also kept it consistent with other parameters like create_parents

Renaming it to hierarchical_namespace creates confusion, implying that this setting is required even for folder operations.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK --enable-hierarchical-namespace is the command line argument passed across different gcs client like gcloud and gcsfuse for bucket creation so it should not cause any confusion, imo we should also use similar naming convention to be consistent

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we can use enable_hierarchical-namespace since it is consistent with other mkdir parameters, such as enable_versioning. This avoids the potential confusion that might arise with using just hierarchical_namespace for folder creation calls.

gcsfs.rm(placeholder_file)
gcsfs.mkdir(path1)

with gcs_hns_mocks(BucketType.HIERARCHICAL, gcsfs) as mocks:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding helper methods for assertions reduces the readability, consider DAMP over DRY in tests
https://stackoverflow.com/questions/6453235/what-does-damp-not-dry-mean-when-talking-about-unit-tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, some details like mock is called only once was hidden in the helper method even though all the parameters are being passed from the context of the test.

Added helper only for creating the request by passing the right set of parameters from the test(As the parameters passed are clear in the test it would have the necessary details in test itself) and asserted the call in the test itself.

@ankitaluthra1
Copy link
Collaborator

/gcbrun

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants