-
Notifications
You must be signed in to change notification settings - Fork 18
Description
Problem Description
After several meetings with @Mikrodoktor and @gokugiant, We identified several issues with the configuration. This is a long standing issue that needs a profound solution. The current configuration system in ImSwitch is heavily inconsistent and error-prone, with multiple overlapping systems for handling configuration parameters. This creates confusion for users, makes the codebase difficult to maintain, and leads to unexpected behavior.
CC'ing @ethanjli as the way we load configuration through docker compose adds an additional obstacle since we have to deal with remote folders / paths through the docker volume function.
Key Issues Identified
1. Inconsistent Parameter Naming Between CLI and Configuration
The command-line arguments don't match the internal configuration parameter names:
- CLI:
--config-file→ Internal:default_config - CLI:
--ext-data-folder→ Internal:data_folder - CLI:
--scan-ext-drive-mount→ Internal:scan_ext_data_folder - CLI:
--headless→ Internal:is_headless - CLI:
--http-port→ Internal:http_port - CLI:
--socket-port→ Internal:socket_port
Example of the problem:
# In main.py documentation:
main(is_headless=True, default_config="/path/to/config.json")
# But CLI uses:
python -m imswitch --headless --config-file /path/to/config.json2. Multiple Configuration Systems Running in Parallel
The codebase currently has at least 4 different configuration systems:
-
Legacy global variables in
imswitch/__init__.py:IS_HEADLESS = True DEFAULT_SETUP_FILE = None DEFAULT_CONFIG_PATH = None DEFAULT_DATA_PATH = None SOCKET_STREAM = True SCAN_EXT_DATA_FOLDER = False EXT_DRIVE_MOUNT = None
-
New configuration class in
imswitch/config.py:@dataclass class ImSwitchConfig: is_headless: bool = False http_port: int = 8001 socket_port: int = 8002 # ... more fields
-
Setup-specific options in
imcontrol/model/Options.py:@dataclass class Options: setupFileName: str recording: RecordingOptions watcher: WatcherOptions
-
SetupInfo configuration in
imcontrol/model/SetupInfo.pywith dozens of device-specific configurations
3. Fragmented Configuration File Handling
Current configuration files include:
imcontrol_options.json(managed byconfigfiletools.py)- Setup JSON files in
imcontrol_setups/folder (e.g.,example_virtual_microscope.json) - Environment variables and command-line arguments
- Hardcoded defaults scattered throughout the codebase
File locations are inconsistent:
- Config files:
~/ImSwitchConfig/config/ - Setup files:
~/ImSwitchConfig/imcontrol_setups/ - Data files: Variable, can be
~/ImSwitchConfig/data/or external drives
4. Poor Error Handling and Validation
The configuration loading code in configfiletools.py contains numerous TODO comments and hacky workarounds:
'''
TODO: This is very hacky!! We should implement a proper interface for the different cases
e.g.:
- user file does not exist => create new file with default values ?
- absolute path to user file that does not exist => error but check if the filename exists in the setup folder
- file exists but is corrupted => error and create backup of the corrupted file and create new
- file exists and is valid => load it
maybe there are more cases - this has to be tested more thoroughly thought through! @GokuGiant
'''Current error handling is minimal:
except json.decoder.JSONDecodeError as e:
print("Error: The setup file was corrupted and has been reset to default values.")
print("Setup file: " + mPath)
# ... just prints errors, doesn't help user recover5. Inconsistent Directory and Path Management
The dirtools.py file has messy logic for determining configuration and data paths:
#TODO: THIS IS A MESS! We need to find a better way to handle the default data path
class UserFileDirs(FileDirs):
Root = _baseUserFilesDir
Config = os.path.join(_baseUserFilesDir, 'config')
Data = os.path.join(_baseUserFilesDir, 'data')
if DEFAULT_DATA_PATH is not None:
Data = DEFAULT_DATA_PATH
if SCAN_EXT_DATA_FOLDER and EXT_DRIVE_MOUNT is not None:
# TODO: This is a testing workaround
chosen_mount = pick_first_external_mount(EXT_DRIVE_MOUNT)
if chosen_mount:
Data = chosen_mount6. Argument Parsing Issues
The main function signature is unwieldy with 10+ parameters:
def main(is_headless: bool = None,
default_config: str = None,
http_port: int = None,
socket_port: int = None,
ssl: bool = None,
config_folder: str = None,
data_folder: str = None,
scan_ext_data_folder: bool = None,
ext_drive_mount: str = None,
with_kernel: bool = None):Argument mapping is confusing:
arg_mapping = {
'headless': 'is_headless',
'config_file': 'default_config',
'http_port': 'http_port',
'socket_port': 'socket_port',
'ssl': 'ssl',
'config_folder': 'config_folder',
'data_folder': 'data_folder',
'scan_ext_data_folder': 'scan_ext_data_folder',
'ext_drive_mount': 'ext_drive_mount',
'with_kernel': 'with_kernel'
}7. Documentation-Code Mismatch
The examples in main.py don't match the actual CLI arguments:
'''
To start imswitch in headless with a remote config file, you can add additional arguments:
main(is_headless=True,
default_config="/Users/bene/ImSwitchConfig/imcontrol_setups/example_virtual_microscope.json",
http_port=8001, ssl=True, data_folder="/Users/bene/Downloads")
'''But the CLI usage is:
python -m imswitch --headless --config-file /path/to/config.json --http-port 8001 --no-ssl --ext-data-folder /path/to/data8. Validation and Type Safety Issues
- No validation of file paths before using them
- Mixed use of strings and Path objects
- No type checking for configuration values
- Silent failures when configuration is invalid
# In configfiletools.py - directories are validated inconsistently
if hasattr(args, 'config_folder') and args.config_folder and os.path.isdir(args.config_folder):
config.config_folder = args.config_folder
if hasattr(args, 'data_folder') and args.data_folder and os.path.isdir(args.data_folder):
config.data_folder = args.data_folderImpact on Users
For End Users:
- Confusing CLI interface with inconsistent parameter names
- Unclear configuration file locations and formats
- Poor error messages when configuration fails
- Difficult to troubleshoot configuration issues
For Developers:
- Hard to understand which configuration system to use
- Difficult to add new configuration options
- Testing is complex due to multiple configuration sources
- Maintenance burden from duplicated configuration logic
For System Administrators:
- Difficult to deploy in containerized environments
- Hard to manage configuration across multiple instances
- Complex backup/restore procedures for configurations
Proposed Solution
Phase 1: Unify Configuration Interface (High Priority)
-
Create a single configuration schema that covers all settings:
@dataclass class UnifiedConfig: # Core application settings is_headless: bool = False # Network settings http_port: int = 8001 socket_port: int = 8002 ssl_enabled: bool = True # File system settings config_folder: Path = Path.home() / "ImSwitchConfig" data_folder: Path = Path.home() / "ImSwitchConfig" / "data" setup_file: Optional[Path] = None # Hardware settings scan_external_drives: bool = False external_drive_mount: Optional[Path] = None # Development settings with_jupyter_kernel: bool = False debug_mode: bool = False
-
Standardize parameter naming across CLI, config files, and internal usage:
# Proposed unified CLI interface imswitch --headless \ --config-folder /path/to/config \ --data-folder /path/to/data \ --setup-file /path/to/setup.json \ --http-port 8001 \ --socket-port 8002 \ --enable-ssl \ --scan-external-drives \ --external-drive-mount /media \ --with-jupyter-kernel \ --debug -
Implement proper validation with clear error messages:
class ConfigValidationError(Exception): """Raised when configuration validation fails""" pass def validate_config(config: UnifiedConfig) -> List[str]: """Validate configuration and return list of errors""" errors = [] if not config.config_folder.exists(): errors.append(f"Config folder does not exist: {config.config_folder}") if config.setup_file and not config.setup_file.exists(): errors.append(f"Setup file does not exist: {config.setup_file}") if not (1024 <= config.http_port <= 65535): errors.append(f"HTTP port must be between 1024-65535, got: {config.http_port}") return errors
-
Create migration tools for existing configurations:
def migrate_legacy_config() -> UnifiedConfig: """Migrate from legacy configuration files to new format""" pass
Phase 2: Simplify CLI Interface (Medium Priority)
-
Reduce main() function parameters to accept only a config object:
def main(config: Optional[UnifiedConfig] = None): """Main entry point with simplified interface""" if config is None: config = load_config_from_cli_and_files() # Validate configuration errors = validate_config(config) if errors: logger.error("Configuration validation failed:") for error in errors: logger.error(f" - {error}") sys.exit(1) # Continue with application startup
-
Implement configuration file discovery with clear precedence rules:
# Configuration precedence (highest to lowest): # 1. Command line arguments # 2. Environment variables # 3. User config file (~/.imswitch/config.json) # 4. System config file (/etc/imswitch/config.json) # 5. Default values
-
Add comprehensive help text and examples:
imswitch --help # Should show clear examples and explain all options
Phase 3: Robust File Handling (Medium Priority)
-
Implement proper configuration file validation:
def load_and_validate_config_file(path: Path) -> Dict[str, Any]: """Load and validate configuration file with detailed error reporting""" try: with open(path) as f: data = json.load(f) # Validate against schema validate_config_schema(data) return data except json.JSONDecodeError as e: raise ConfigValidationError(f"Invalid JSON in {path}: {e}") except FileNotFoundError: raise ConfigValidationError(f"Configuration file not found: {path}")
-
Add backup/recovery mechanisms for corrupted files:
def backup_and_recover_config(config_path: Path): """Create backup and attempt recovery from corrupted config""" backup_path = config_path.with_suffix('.backup') # Create backup of corrupted file shutil.copy2(config_path, backup_path) # Try to restore from backup or create default try: restore_from_backup(config_path) except: create_default_config(config_path)
-
Create clear fallback strategies when files are missing.
-
Add configuration merging from multiple sources with clear precedence.
Phase 4: Documentation and Testing (Low Priority)
- Document the configuration system with clear examples
- Add comprehensive unit tests for all configuration scenarios
- Create migration guides for existing users
- Add configuration validation tools
Files That Need Changes
Core Configuration Files:
imswitch/__main__.py- Main entry point and argument parsingimswitch/config.py- Central configuration managementimswitch/__init__.py- Remove legacy global variables
Directory and File Management:
imswitch/imcommon/model/dirtools.py- Clean up path handlingimswitch/imcontrol/model/configfiletools.py- Rewrite configuration file handlingimswitch/imcontrol/model/Options.py- Integrate with unified config system
Documentation and Examples:
main.py- Update examples and documentationREADME.md- Add configuration documentationdocs/- Create comprehensive configuration guide
Testing:
tests/test_config.py- New comprehensive configuration teststests/test_cli.py- CLI interface tests
Proposed Todos
I think this are some roughly formulated points
- Complete audit of all configuration usage
- Design unified configuration schema
- Create migration strategy
- Implement
UnifiedConfigclass - Create configuration validation
- Update main entry point
- Implement new CLI interface
- Add configuration file discovery
- Create backup/recovery mechanisms
- Write comprehensive tests
- Update documentation
- Create migration tools
Backward Compatibility
I guess we can start from scratch.
Additional Context
The current system has evolved organically without a clear design, leading to the current fragmented state. A unified approach will:
Benefits:
- Improve user experience with consistent, clear configuration
- Reduce maintenance burden by eliminating duplicate code
- Enable easier testing and validation with centralized configuration
- Make the codebase more approachable for new contributors
- Simplify deployment in containerized environments
- Better error handling and troubleshooting