[Enhancement] A suggestion regarding StorageManager initialization #6

@dpj135

Description

We wrote the following startup script for the TransferQueue + DataSystem backend:

class Trainer:
    def __init__(self, config: dict):
        self.config = config
        self._initialize_transferqueue()

    def _initialize_transferqueue(self):
        # 1. Initialize TransferQueueController (single controller only)
        self.tq_controller = TransferQueueController.remote()

        # 2. Prepare necessary information of the controller
        self.tq_controller_info = process_zmq_server_info(self.tq_controller)

        # Note: create a new DictConfig with allow_objects=True so that the ZMQServerInfo
        # instance is preserved; otherwise it would be flattened to a dict.
        tq_config = OmegaConf.create({}, flags={"allow_objects": True})
        tq_config.controller_info = self.tq_controller_info
        self.config = OmegaConf.merge(tq_config, self.config)

        # 3. Create TransferQueueClient
        self.tq_client = TransferQueueClient(
            client_id="Trainer",
            controller_info=self.tq_controller_info,
        )

        # 4. Connect to DataSystem
        self.tq_client.initialize_storage_manager(manager_type=self.config["manager_type"], config=self.config)

        return self.tq_client

We found that TransferQueueClient requires controller_info at initialization and keeps a reference to it, yet StorageManager must also be given controller_info (through the config) when it is initialized.

From the user's perspective, the relationship between StorageManager and the controller is not obvious. Users may forget to include controller_info in the configuration they pass in, which results in a ValueError. Perhaps this case should be tolerated instead of raising an exception immediately.
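
To make the suggestion concrete, here is a minimal sketch of one way the client could tolerate a missing controller_info: fall back to the value it already holds. This is not the actual TransferQueue implementation. SketchClient and with_controller_info are hypothetical names used only for illustration, the controller_info value is a placeholder dict rather than a real ZMQServerInfo, and plain dicts stand in for DictConfig.

```python
from typing import Any, Mapping, Optional


def with_controller_info(
    config: Optional[Mapping[str, Any]], fallback_controller_info: Any
) -> dict:
    """Return a copy of config with controller_info filled in if it is absent.

    Hypothetical helper: an explicitly supplied controller_info always wins;
    the fallback is used only when the key is missing or None.
    """
    merged = dict(config or {})
    if merged.get("controller_info") is None:
        merged["controller_info"] = fallback_controller_info
    return merged


class SketchClient:
    """Hypothetical stand-in for TransferQueueClient, for illustration only."""

    def __init__(self, client_id: str, controller_info: Any):
        self.client_id = client_id
        # The client already holds controller_info after construction.
        self.controller_info = controller_info
        self.manager_type: Optional[str] = None
        self.storage_config: Optional[dict] = None

    def initialize_storage_manager(self, manager_type: str, config=None) -> dict:
        # Reuse the stored controller_info instead of raising ValueError
        # when the caller's config does not contain it.
        self.manager_type = manager_type
        self.storage_config = with_controller_info(config, self.controller_info)
        return self.storage_config


# Placeholder controller info; the real object would be a ZMQServerInfo.
client = SketchClient("Trainer", controller_info={"placeholder": "controller"})
cfg = client.initialize_storage_manager(
    "sketch_manager", config={"manager_type": "sketch_manager"}
)
```

With this behavior, the user's config only needs controller_info when it should differ from the one the client was created with. The caller's original config is left unmodified because the helper works on a copy.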
