Skip to content

Configuration System Redesign - Inconsistent Command-Line Interface and Configuration File Handling #159

@beniroquai

Description

@beniroquai

Problem Description

After several meetings with @Mikrodoktor and @gokugiant, We identified several issues with the configuration. This is a long standing issue that needs a profound solution. The current configuration system in ImSwitch is heavily inconsistent and error-prone, with multiple overlapping systems for handling configuration parameters. This creates confusion for users, makes the codebase difficult to maintain, and leads to unexpected behavior.

CC'ing @ethanjli as the way we load configuration through docker compose adds an additional obstacle since we have to deal with remote folders / paths through the docker volume function.

Key Issues Identified

1. Inconsistent Parameter Naming Between CLI and Configuration

The command-line arguments don't match the internal configuration parameter names:

  • CLI: --config-file → Internal: default_config
  • CLI: --ext-data-folder → Internal: data_folder
  • CLI: --scan-ext-drive-mount → Internal: scan_ext_data_folder
  • CLI: --headless → Internal: is_headless
  • CLI: --http-port → Internal: http_port
  • CLI: --socket-port → Internal: socket_port

Example of the problem:

# In main.py documentation:
main(is_headless=True, default_config="/path/to/config.json")

# But CLI uses:
python -m imswitch --headless --config-file /path/to/config.json

2. Multiple Configuration Systems Running in Parallel

The codebase currently has at least 4 different configuration systems:

  1. Legacy global variables in imswitch/__init__.py:

    IS_HEADLESS = True
    DEFAULT_SETUP_FILE = None
    DEFAULT_CONFIG_PATH = None
    DEFAULT_DATA_PATH = None
    SOCKET_STREAM = True
    SCAN_EXT_DATA_FOLDER = False
    EXT_DRIVE_MOUNT = None
  2. New configuration class in imswitch/config.py:

    @dataclass
    class ImSwitchConfig:
        is_headless: bool = False
        http_port: int = 8001
        socket_port: int = 8002
        # ... more fields
  3. Setup-specific options in imcontrol/model/Options.py:

    @dataclass
    class Options:
        setupFileName: str
        recording: RecordingOptions
        watcher: WatcherOptions
  4. SetupInfo configuration in imcontrol/model/SetupInfo.py with dozens of device-specific configurations

3. Fragmented Configuration File Handling

Current configuration files include:

  • imcontrol_options.json (managed by configfiletools.py)
  • Setup JSON files in imcontrol_setups/ folder (e.g., example_virtual_microscope.json)
  • Environment variables and command-line arguments
  • Hardcoded defaults scattered throughout the codebase

File locations are inconsistent:

  • Config files: ~/ImSwitchConfig/config/
  • Setup files: ~/ImSwitchConfig/imcontrol_setups/
  • Data files: Variable, can be ~/ImSwitchConfig/data/ or external drives

4. Poor Error Handling and Validation

The configuration loading code in configfiletools.py contains numerous TODO comments and hacky workarounds:

'''
TODO: This is very hacky!! We should implement a proper interface for the different cases
e.g.:
- user file does not exist => create new file with default values ? 
- absolute path to user file that does not exist => error but check if the filename exists in the setup folder
- file exists but is corrupted => error and create backup of the corrupted file and create new
- file exists and is valid => load it
maybe there are more cases - this has to be tested more thoroughly thought through! @GokuGiant
'''

Current error handling is minimal:

except json.decoder.JSONDecodeError as e:
    print("Error: The setup file was corrupted and has been reset to default values.")
    print("Setup file: " + mPath)
    # ... just prints errors, doesn't help user recover

5. Inconsistent Directory and Path Management

The dirtools.py file has messy logic for determining configuration and data paths:

#TODO: THIS IS A MESS! We need to find a better way to handle the default data path
class UserFileDirs(FileDirs):
    Root = _baseUserFilesDir
    Config = os.path.join(_baseUserFilesDir, 'config')
    Data = os.path.join(_baseUserFilesDir, 'data')
    if DEFAULT_DATA_PATH is not None:
        Data = DEFAULT_DATA_PATH
    if SCAN_EXT_DATA_FOLDER and EXT_DRIVE_MOUNT is not None:
        # TODO: This is a testing workaround
        chosen_mount = pick_first_external_mount(EXT_DRIVE_MOUNT)
        if chosen_mount:
            Data = chosen_mount

6. Argument Parsing Issues

The main function signature is unwieldy with 10+ parameters:

def main(is_headless: bool = None, 
         default_config: str = None, 
         http_port: int = None, 
         socket_port: int = None, 
         ssl: bool = None, 
         config_folder: str = None,
         data_folder: str = None, 
         scan_ext_data_folder: bool = None, 
         ext_drive_mount: str = None, 
         with_kernel: bool = None):

Argument mapping is confusing:

arg_mapping = {
    'headless': 'is_headless',
    'config_file': 'default_config',
    'http_port': 'http_port',
    'socket_port': 'socket_port',
    'ssl': 'ssl',
    'config_folder': 'config_folder',
    'data_folder': 'data_folder',
    'scan_ext_data_folder': 'scan_ext_data_folder',
    'ext_drive_mount': 'ext_drive_mount',
    'with_kernel': 'with_kernel'
}

7. Documentation-Code Mismatch

The examples in main.py don't match the actual CLI arguments:

'''
To start imswitch in headless with a remote config file, you can add additional arguments:
main(is_headless=True, 
     default_config="/Users/bene/ImSwitchConfig/imcontrol_setups/example_virtual_microscope.json", 
     http_port=8001, ssl=True, data_folder="/Users/bene/Downloads")
'''

But the CLI usage is:

python -m imswitch --headless --config-file /path/to/config.json --http-port 8001 --no-ssl --ext-data-folder /path/to/data

8. Validation and Type Safety Issues

  • No validation of file paths before using them
  • Mixed use of strings and Path objects
  • No type checking for configuration values
  • Silent failures when configuration is invalid
# In configfiletools.py - directories are validated inconsistently
if hasattr(args, 'config_folder') and args.config_folder and os.path.isdir(args.config_folder):
    config.config_folder = args.config_folder
if hasattr(args, 'data_folder') and args.data_folder and os.path.isdir(args.data_folder):
    config.data_folder = args.data_folder

Impact on Users

For End Users:

  • Confusing CLI interface with inconsistent parameter names
  • Unclear configuration file locations and formats
  • Poor error messages when configuration fails
  • Difficult to troubleshoot configuration issues

For Developers:

  • Hard to understand which configuration system to use
  • Difficult to add new configuration options
  • Testing is complex due to multiple configuration sources
  • Maintenance burden from duplicated configuration logic

For System Administrators:

  • Difficult to deploy in containerized environments
  • Hard to manage configuration across multiple instances
  • Complex backup/restore procedures for configurations

Proposed Solution

Phase 1: Unify Configuration Interface (High Priority)

  1. Create a single configuration schema that covers all settings:

    @dataclass
    class UnifiedConfig:
        # Core application settings
        is_headless: bool = False
        
        # Network settings
        http_port: int = 8001
        socket_port: int = 8002
        ssl_enabled: bool = True
        
        # File system settings
        config_folder: Path = Path.home() / "ImSwitchConfig"
        data_folder: Path = Path.home() / "ImSwitchConfig" / "data"
        setup_file: Optional[Path] = None
        
        # Hardware settings
        scan_external_drives: bool = False
        external_drive_mount: Optional[Path] = None
        
        # Development settings
        with_jupyter_kernel: bool = False
        debug_mode: bool = False
  2. Standardize parameter naming across CLI, config files, and internal usage:

    # Proposed unified CLI interface
    imswitch --headless \
             --config-folder /path/to/config \
             --data-folder /path/to/data \
             --setup-file /path/to/setup.json \
             --http-port 8001 \
             --socket-port 8002 \
             --enable-ssl \
             --scan-external-drives \
             --external-drive-mount /media \
             --with-jupyter-kernel \
             --debug
  3. Implement proper validation with clear error messages:

    class ConfigValidationError(Exception):
        """Raised when configuration validation fails"""
        pass
    
    def validate_config(config: UnifiedConfig) -> List[str]:
        """Validate configuration and return list of errors"""
        errors = []
        
        if not config.config_folder.exists():
            errors.append(f"Config folder does not exist: {config.config_folder}")
        
        if config.setup_file and not config.setup_file.exists():
            errors.append(f"Setup file does not exist: {config.setup_file}")
        
        if not (1024 <= config.http_port <= 65535):
            errors.append(f"HTTP port must be between 1024-65535, got: {config.http_port}")
        
        return errors
  4. Create migration tools for existing configurations:

    def migrate_legacy_config() -> UnifiedConfig:
        """Migrate from legacy configuration files to new format"""
        pass

Phase 2: Simplify CLI Interface (Medium Priority)

  1. Reduce main() function parameters to accept only a config object:

    def main(config: Optional[UnifiedConfig] = None):
        """Main entry point with simplified interface"""
        if config is None:
            config = load_config_from_cli_and_files()
        
        # Validate configuration
        errors = validate_config(config)
        if errors:
            logger.error("Configuration validation failed:")
            for error in errors:
                logger.error(f"  - {error}")
            sys.exit(1)
        
        # Continue with application startup
  2. Implement configuration file discovery with clear precedence rules:

    # Configuration precedence (highest to lowest):
    # 1. Command line arguments
    # 2. Environment variables
    # 3. User config file (~/.imswitch/config.json)
    # 4. System config file (/etc/imswitch/config.json)
    # 5. Default values
  3. Add comprehensive help text and examples:

    imswitch --help
    # Should show clear examples and explain all options

Phase 3: Robust File Handling (Medium Priority)

  1. Implement proper configuration file validation:

    def load_and_validate_config_file(path: Path) -> Dict[str, Any]:
        """Load and validate configuration file with detailed error reporting"""
        try:
            with open(path) as f:
                data = json.load(f)
            
            # Validate against schema
            validate_config_schema(data)
            return data
            
        except json.JSONDecodeError as e:
            raise ConfigValidationError(f"Invalid JSON in {path}: {e}")
        except FileNotFoundError:
            raise ConfigValidationError(f"Configuration file not found: {path}")
  2. Add backup/recovery mechanisms for corrupted files:

    def backup_and_recover_config(config_path: Path):
        """Create backup and attempt recovery from corrupted config"""
        backup_path = config_path.with_suffix('.backup')
        
        # Create backup of corrupted file
        shutil.copy2(config_path, backup_path)
        
        # Try to restore from backup or create default
        try:
            restore_from_backup(config_path)
        except:
            create_default_config(config_path)
  3. Create clear fallback strategies when files are missing.

  4. Add configuration merging from multiple sources with clear precedence.

Phase 4: Documentation and Testing (Low Priority)

  1. Document the configuration system with clear examples
  2. Add comprehensive unit tests for all configuration scenarios
  3. Create migration guides for existing users
  4. Add configuration validation tools

Files That Need Changes

Core Configuration Files:

  • imswitch/__main__.py - Main entry point and argument parsing
  • imswitch/config.py - Central configuration management
  • imswitch/__init__.py - Remove legacy global variables

Directory and File Management:

  • imswitch/imcommon/model/dirtools.py - Clean up path handling
  • imswitch/imcontrol/model/configfiletools.py - Rewrite configuration file handling
  • imswitch/imcontrol/model/Options.py - Integrate with unified config system

Documentation and Examples:

  • main.py - Update examples and documentation
  • README.md - Add configuration documentation
  • docs/ - Create comprehensive configuration guide

Testing:

  • tests/test_config.py - New comprehensive configuration tests
  • tests/test_cli.py - CLI interface tests

Proposed Todos

I think this are some roughly formulated points

  • Complete audit of all configuration usage
  • Design unified configuration schema
  • Create migration strategy
  • Implement UnifiedConfig class
  • Create configuration validation
  • Update main entry point
  • Implement new CLI interface
  • Add configuration file discovery
  • Create backup/recovery mechanisms
  • Write comprehensive tests
  • Update documentation
  • Create migration tools

Backward Compatibility

I guess we can start from scratch.

Additional Context

The current system has evolved organically without a clear design, leading to the current fragmented state. A unified approach will:

Benefits:

  • Improve user experience with consistent, clear configuration
  • Reduce maintenance burden by eliminating duplicate code
  • Enable easier testing and validation with centralized configuration
  • Make the codebase more approachable for new contributors
  • Simplify deployment in containerized environments
  • Better error handling and troubleshooting

Metadata

Metadata

Assignees

Labels

bugSomething isn't workingenhancementNew feature or request

Type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions