@xiaoyu-work
Collaborator

This pull request improves the SD-LoRA data preprocessing pipeline, adding support for DreamBooth-style datasets and making the related components more modular and easier to use. The main changes add explicit DreamBooth class image handling to both aspect_ratio_bucketing and image_resizing, refactor dataset loading for HuggingFace datasets, and register the SDLoRA pass in the configuration.

Enhancements to SD-LoRA data preprocessing and configuration:

DreamBooth support and preprocessing improvements:

  • Added explicit processing of DreamBooth class images in both aspect_ratio_bucketing and image_resizing, including resizing, bucket assignment, and output path management. This ensures that class images are handled consistently with instance images and that relevant metadata is tracked for downstream tasks.
  • Refactored the image_resizing function to use a helper for per-image processing, improved crop coordinate calculation, and ensured bucket assignment metadata is stored for all processed images.
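To make the bucket assignment and crop coordinate steps concrete, here is a minimal sketch. It is not the code from this PR: the bucket list, the function names (assign_bucket, resize_and_crop_coords, process_images), and the metadata keys are all hypothetical, and it assumes a cover-then-center-crop strategy. The point is that the same helper can be applied to instance images and class images alike so both end up with the same metadata.

```python
from typing import Dict, List, Tuple

# Hypothetical bucket list as (width, height); a real bucket set depends on the base resolution.
BUCKETS: List[Tuple[int, int]] = [(512, 512), (768, 512), (512, 768)]


def assign_bucket(width: int, height: int,
                  buckets: List[Tuple[int, int]] = BUCKETS) -> Tuple[int, int]:
    """Pick the bucket whose aspect ratio is closest to the image's aspect ratio."""
    ratio = width / height
    return min(buckets, key=lambda b: abs(b[0] / b[1] - ratio))


def resize_and_crop_coords(width: int, height: int,
                           bucket: Tuple[int, int]) -> Tuple[int, int, int, int]:
    """Scale the image so it covers the bucket, then center-crop to the bucket size.

    Returns (resized_width, resized_height, crop_left, crop_top).
    """
    bw, bh = bucket
    scale = max(bw / width, bh / height)
    # Guard against rounding producing a side that is one pixel too small.
    rw = max(bw, round(width * scale))
    rh = max(bh, round(height * scale))
    return rw, rh, (rw - bw) // 2, (rh - bh) // 2


def process_images(images: List[Tuple[str, int, int]]) -> List[Dict]:
    """Per-image helper: works the same for instance images and class images."""
    results = []
    for path, width, height in images:
        bucket = assign_bucket(width, height)
        rw, rh, left, top = resize_and_crop_coords(width, height, bucket)
        results.append({
            "path": path,
            "bucket": bucket,
            "resized_size": (rw, rh),
            "crop_top_left": (top, left),
        })
    return results
```

Calling process_images once for the instance images and once for the class images would keep bucket metadata consistent across both sets, which is the behavior the bullets above describe.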

Codebase and API improvements:

  • Updated olive/data/component/sd_lora/__init__.py to explicitly export the key preprocessing modules, making them more discoverable and easier to import elsewhere in the codebase.
  • Refactored ImageDataContainer to extract HuggingFace-specific parameters earlier in the dataset loading process, ensuring proper conversion and avoiding parameter leakage to downstream loaders.
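The parameter extraction described above can be sketched as a simple split of the incoming parameter dictionary. This is an illustration only: the key names in HF_ONLY_KEYS and the function split_hf_params are hypothetical and are not taken from ImageDataContainer.

```python
from typing import Any, Dict, Set, Tuple

# Hypothetical key names; the actual parameters handled by ImageDataContainer may differ.
HF_ONLY_KEYS: Set[str] = {"image_column", "caption_column"}


def split_hf_params(
    params: Dict[str, Any],
    hf_keys: Set[str] = HF_ONLY_KEYS,
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate container-specific keys from the ones forwarded to the base loader.

    The input dictionary is left unmodified.
    """
    hf_params = {k: v for k, v in params.items() if k in hf_keys}
    loader_params = {k: v for k, v in params.items() if k not in hf_keys}
    return hf_params, loader_params
```

Doing the split before the base loader is called means the loader never sees keys it does not understand, which is one way to avoid the parameter leakage mentioned above.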

Configuration updates:

  • Registered the SDLoRA pass in olive_config.json, specifying its dependencies, supported providers, and dataset requirements, enabling its use in Olive pipelines.

Contributor

Copilot AI left a comment

Pull request overview

This PR introduces an SD-LoRA training pass that fine-tunes diffusion models (SD 1.5, SDXL, and Flux) using LoRA adapters, and supports DreamBooth-style training with prior preservation.
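For readers unfamiliar with prior preservation, the loss is usually the instance reconstruction loss plus a weighted loss on the class images. The sketch below uses plain Python lists instead of tensors so it runs without extra dependencies. The function names and the prior_loss_weight parameter are illustrative assumptions, not the signatures used in olive/passes/diffusers/lora.py.

```python
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error over two equal-length, non-empty sequences."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_loss(
    instance_pred: Sequence[float],
    instance_target: Sequence[float],
    class_pred: Sequence[float],
    class_target: Sequence[float],
    prior_loss_weight: float = 1.0,  # assumed name for the weighting factor
) -> float:
    """Instance loss plus weighted prior preservation loss on class images."""
    instance_loss = mse(instance_pred, instance_target)
    prior_loss = mse(class_pred, class_target)
    return instance_loss + prior_loss_weight * prior_loss
```

The class term keeps the model's output for generic class prompts close to what it produced before fine-tuning, which is why class images need the same preprocessing as instance images.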

Key changes:

  • New SDLoRA pass with automatic model type detection and support for three diffusion architectures (SD 1.5, SDXL, Flux)
  • Enhanced data preprocessing pipeline with explicit DreamBooth class image handling in both aspect ratio bucketing and image resizing components
  • Validation improvements to DiffusersModelHandler with explicit model checking via model_index.json
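As an illustration of the model_index.json check, the sketch below validates a local directory only. The function name looks_like_diffusers_model is hypothetical, and the real is_valid_model in DiffusersModelHandler may behave differently, for example by also resolving Hub model IDs. The "_class_name" key is the field Diffusers pipelines write into model_index.json.

```python
import json
from pathlib import Path
from typing import Union


def looks_like_diffusers_model(model_dir: Union[str, Path]) -> bool:
    """Return True if model_dir contains a parseable model_index.json with a pipeline class."""
    index_file = Path(model_dir) / "model_index.json"
    if not index_file.is_file():
        return False
    try:
        data = json.loads(index_file.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # ValueError covers both invalid JSON and undecodable bytes.
        return False
    return isinstance(data, dict) and "_class_name" in data
```

A check like this is cheap and needs no network access for local paths, which also makes it easy to mock in tests.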

Reviewed changes

Copilot reviewed 10 out of 12 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| `olive/passes/diffusers/lora.py` | New SDLoRA pass implementation with training loops for SD/SDXL (UNet-based) and Flux (DiT-based) models, including prior preservation loss for DreamBooth |
| `olive/data/component/sd_lora/aspect_ratio_bucketing.py` | Added explicit processing of DreamBooth class images including bucket assignment, resizing, and metadata tracking |
| `olive/data/component/sd_lora/image_resizing.py` | Refactored to support class image processing with helper functions for per-image operations and crop coordinate calculation |
| `olive/data/component/sd_lora/__init__.py` | Explicit exports of preprocessing modules for improved discoverability |
| `olive/data/container/image_data_container.py` | Refactored HuggingFace dataset loading to extract ImageDataContainer-specific parameters before passing to base loader |
| `olive/model/handler/diffusers.py` | Added validation via is_valid_model() checking for model_index.json, plus adapter_path property for LoRA weights |
| `olive/olive_config.json` | Registered SDLoRA pass with GPU-only support and sd-lora extra dependencies |
| `test/passes/diffusers/test_lora.py` | Comprehensive test suite covering SD 1.5, SDXL, Flux training, LoRA merging, and DreamBooth mode |
| `test/passes/diffusers/conftest.py` | Test fixtures for mock models, accelerator, and test image folders |
| `test/model/test_diffusers_model.py` | Updated tests to mock is_valid_model for network-free validation |
| `test/passes/diffusers/__init__.py` | Empty init file for test package structure |
