Summary:
TransferQueue scenarios involve mixed data types (NPU tensors, CPU tensors, and non-tensor objects). In the future, it may be beneficial to consider two potential enhancements to the Yuanrong data system: (1) a unified upper-layer interface to simplify access across heterogeneous storage backends, and (2) optional lifecycle management to improve reliability and resource safety.
Description
We ( @tianyi-ge and @dpj135 ) have put forward several suggestions for the Yuanrong data system. If these suggestions are implemented, the YuanrongStorageClient in TransferQueue would also need corresponding updates. The details of the proposals are as follows:
- Unified Interface: The current implementation requires YuanrongStorageClient to manually coordinate between DSTensorClient (for NPU tensors) and KVClient (for other data). This burden could be alleviated by a single abstraction that internally dispatches based on type metadata, simplifying the logic in YuanrongStorageClient.
- Lifecycle Management: Add optional backend-managed lifetime control for tensors, ensuring data remains valid until explicitly released, even across worker failures or concurrent writes. This would prevent leaks and improve fault tolerance.
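
To make the two proposals more concrete, here is a rough sketch of what a unified client with explicit release might look like. Everything in it is hypothetical: `FakeTensor`, `InMemoryBackend`, and `UnifiedClient` are illustrative stand-ins and do not reflect the actual DSTensorClient, KVClient, or YuanrongStorageClient APIs. The lifetime tracking here lives on the client side purely for illustration, whereas the proposal is for the backend to manage it.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class FakeTensor:
    """Hypothetical stand-in for a tensor carrying device metadata."""
    device: str  # "npu" or "cpu"
    data: list


class InMemoryBackend:
    """Hypothetical stand-in for a storage backend (tensor store or KV store)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        self.store[key] = value

    def get(self, key: str) -> Any:
        return self.store[key]

    def delete(self, key: str) -> None:
        self.store.pop(key, None)


class UnifiedClient:
    """Single entry point that dispatches on type metadata (assumed design)."""

    def __init__(self, tensor_backend: InMemoryBackend, kv_backend: InMemoryBackend) -> None:
        self._tensor_backend = tensor_backend
        self._kv_backend = kv_backend
        # Remembers which backend holds each key until it is released.
        self._location: Dict[str, InMemoryBackend] = {}

    def _select(self, value: Any) -> InMemoryBackend:
        # NPU tensors go to the tensor backend; everything else goes to KV.
        if isinstance(value, FakeTensor) and value.device == "npu":
            return self._tensor_backend
        return self._kv_backend

    def put(self, key: str, value: Any) -> None:
        backend = self._select(value)
        previous = self._location.get(key)
        # If the key moves between backends, drop the stale copy.
        if previous is not None and previous is not backend:
            previous.delete(key)
        backend.put(key, value)
        self._location[key] = backend

    def get(self, key: str) -> Any:
        return self._location[key].get(key)

    def release(self, key: str) -> None:
        # Data stays valid until this explicit release.
        backend = self._location.pop(key, None)
        if backend is not None:
            backend.delete(key)


tensor_backend = InMemoryBackend("tensor")
kv_backend = InMemoryBackend("kv")
client = UnifiedClient(tensor_backend, kv_backend)

client.put("weights", FakeTensor(device="npu", data=[1.0, 2.0]))
client.put("prompt", {"text": "hello"})
client.release("weights")
```

With an abstraction along these lines, callers would no longer need to choose a client per data type, and releasing data would become a single explicit call regardless of where it is stored.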