Skip to content

feat: implement OSSFS storage backend in sandbox lifecycle#340

Open
hittyt wants to merge 10 commits intoalibaba:mainfrom
hittyt:feat/support-ossfs-api
Open

feat: implement OSSFS storage backend in sandbox lifecycle#340
hittyt wants to merge 10 commits intoalibaba:mainfrom
hittyt:feat/support-ossfs-api

Conversation

@hittyt
Copy link
Collaborator

@hittyt hittyt commented Mar 4, 2026

This PR implements the OSSFS storage backend for the sandbox lifecycle, as detailed in OSEP-0003.

Key features:

  • Docker Runtime Integration: Implemented mount management with reference counting and on-demand ossfs mounting.
  • Enhanced Validation: Added rigorous checks for OSS buckets, endpoints, and credentials.
  • Automatic Cleanup: Ensures OSSFS paths are unmounted when the last dependent sandbox is deleted.

@jwx0925
Copy link
Collaborator

jwx0925 commented Mar 4, 2026

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a72893a549

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +1189 to +1192
current = self._ossfs_mount_ref_counts.get(mount_key, 0)
if current <= 1:
self._ossfs_mount_ref_counts.pop(mount_key, None)
should_unmount = True

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Unmount only when OSSFS mount has tracked references

The release logic treats current == 0 the same as current == 1 and proceeds to unmount. After process restart, _ossfs_mount_ref_counts is empty (it is initialized in-memory and not rebuilt from existing containers), so deleting one sandbox can unmount a shared OSSFS path still used by other running sandboxes. This can break active workloads by tearing down their underlying bind source unexpectedly.

Useful? React with 👍 / 👎.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@hittyt hittyt force-pushed the feat/support-ossfs-api branch from a72893a to ce216ac Compare March 4, 2026 13:17
@Pangjiping
Copy link
Collaborator

Please rebase main to resolve pyproject.toml format error

- Implement OSSFS volume support in Docker runtime with mount reference counting.
- Add comprehensive validation and unit tests for the new storage backend.
@hittyt hittyt force-pushed the feat/support-ossfs-api branch from ce216ac to c12516d Compare March 4, 2026 13:27
@Pangjiping Pangjiping added feature New feature or request component/server labels Mar 5, 2026
@hittyt hittyt force-pushed the feat/support-ossfs-api branch from aaf4e73 to 25eb4ed Compare March 5, 2026 05:32
return pvc_inspect_cache

@staticmethod
def _derive_oss_region(endpoint: str) -> Optional[str]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of resolving the region? It looks like the region is only required for ossfs 1.0 with sigv4 enabled, and it is not the region of the OSS bucket, but rather the region of client.

reference: https://help.aliyun.com/zh/oss/developer-reference/advanced-configurations?spm=a2c4g.11186623.help-menu-31815.d_1_2_7_2.5fe6686exm5Yta

region=cn-hangzhou:OSS Bucket请求地域标识符,挂载时添加-oregion=<region_id>,默认为空。在使用V4签名时必须添加该选项来作为发起请求地域的标识符。

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the oss doc described, sigv4 is ONLY used for version 1.0 mode, and the purpose of this function is prepare the region field for ossv1 command construction.
image

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But region is the region of client,not region of oss bucket

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed region logic

…vior

Move Docker OSSFS-specific logic into a dedicated module to reduce branching noise in the core provisioning flow, and align OSSFS request validation with schema-driven constraints and platform support boundaries. This also fixes server pyproject TOML parsing so validation/test tooling can run reliably.

Made-with: Cursor
@hittyt hittyt requested a review from Pangjiping March 8, 2026 06:16
Make OSSFS mount preparation transactional by rolling back already prepared mount keys when a later mount in the same batch fails. Add a regression test to prevent ref-count leaks in multi-volume partial-failure scenarios.

Made-with: Cursor
@hittyt hittyt force-pushed the feat/support-ossfs-api branch from a332819 to 288e383 Compare March 8, 2026 06:53
Remove endpoint-derived region/sigv4 auto-injection for OSSFS 1.0 so region semantics stay explicitly controlled by caller options. Add assertions to keep v1 command generation free of implicit region flags.

Made-with: Cursor
Install and expose both JDK 8 and JDK 17 in Java E2E so Gradle toolchain resolution does not depend on foojay auto-download in CI. Configure Gradle installation paths explicitly to avoid intermittent Java 8 provisioning failures.

Made-with: Cursor
@hittyt hittyt force-pushed the feat/support-ossfs-api branch from 11e4b44 to 8338c4d Compare March 9, 2026 13:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component/server feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants