Replies: 33 comments
-
Hi! Thanks for checking out this repo!

Multi-reference image training

That should be sufficient, I think. I haven't tried that many condition images yet, but for LoRA tuning it should be enough. Full fine-tuning may require a lower resolution for the condition images.

I suggest training a LoRA with an easy reward first and checking that the reward rises as expected, then moving on to full fine-tuning. There are no known major differences in feasibility or stability between the two.

Configuration considerations

I suggest setting ...

No. They share the exact same architecture and simply have different trained weights. If your task is not about, say, human faces or realism, using either one is fine.

Adding custom reward functions:

Please check here for more information. Implement your own reward-computation logic in this file and set the corresponding config (see below).

An example configuration file

Here is an example config file. Sorry, I don't have GPU resources in the next few days, so I cannot verify it myself; please report any feedback or issues. Thanks!

```yaml
# Environment Configuration
launcher: "accelerate"   # Options: accelerate
config_file: config/accelerate_configs/fsdp2.yaml  # FSDP2 shards the model as well; switch to config/deepspeed/deepspeed_zero2.yaml if you have enough GPU memory
num_processes: 8         # Number of processes to launch (overrides config file)
main_process_port: 29500
mixed_precision: "bf16"  # Options: no, fp16, bf16
run_name: null           # Run name (auto: {model_type}_{finetune_type}_{timestamp})
project: "Flow-Factory"  # Project name for logging
logging_backend: "wandb" # Options: wandb, swanlab, none

# Data Configuration
data:
  dataset_dir: "dataset/sharegpt4o_image_mini"  # Path to dataset folder
  preprocessing_batch_size: 8                   # Batch size for preprocessing
  dataloader_num_workers: 16                    # Number of workers for DataLoader
  enable_preprocess: true                       # Enable dataset preprocessing
  force_reprocess: true                         # Force reprocessing of the dataset
  cache_dir: "~/.cache/flow_factory/datasets"   # Cache directory for preprocessed datasets
  max_dataset_size: 1000                        # Limit the maximum number of samples in the dataset

# Model Configuration
model:
  finetune_type: 'lora'      # Options: full, lora
  lora_rank: 64
  lora_alpha: 128
  target_modules: "default"  # Try "default" first; if OOM, try ["to_k", "to_q", "to_v", "to_out.0"] for attention layers only
  model_name_or_path: "Qwen/Qwen-Image-Edit-2509"  # Qwen/Qwen-Image-Edit-2509 or Qwen/Qwen-Image-Edit-2511
  model_type: "qwen-image-edit-plus"
  resume_path: null             # Path to load a previous checkpoint/LoRA adapter
  resume_training_state: false  # Resume training state; only effective when resume_path is a directory with a full checkpoint

log:
  save_dir: "~/Flow-Factory"  # Directory to save model checkpoints and logs
  save_freq: 20               # Save frequency in epochs (0 to disable)
  save_model_only: true       # Save only the model weights (not optimizer, scheduler, etc.)

# Training Configuration
train:
  # Training settings
  trainer_type: 'grpo'
  enable_gradient_checkpointing: true  # Saves memory at the cost of extra compute; keep it true if you hit OOM
  # Image settings
  resolution: 512                   # 512 is worth trying first; if OOM, try 384 and then 256
  condition_image_size: [512, 512]  # Keep at 512, or set the same as `resolution` above
  # Batch and sampling
  per_device_batch_size: 1  # Qwen-Image-Edit-Plus accepts a varying number of condition images, so batch_size always falls back to 1
  group_size: 16            # Group size for GRPO sampling
  global_std: false         # Use global std for advantage normalization
  unique_sample_num_per_epoch: 48  # Unique samples per epoch
  gradient_step_per_epoch: 2       # Gradient steps per epoch
  # Clipping
  clip_range: 1.0e-4   # PPO/GRPO clipping range
  adv_clip_range: 5.0  # Advantage clipping range
  max_grad_norm: 1.0   # Max gradient norm for clipping
  # KL divergence
  kl_type: 'v-based'   # Options: 'x-based', 'v-based'
  kl_beta: 0           # Set to 0 at first to disable KL and save memory; if the reward grows as expected, try values like 0.04
  ref_param_device: 'same_as_model'  # Options: cpu, same_as_model
  # Denoising process
  num_inference_steps: 10  # 10 is good enough for many tasks; increase to 20 if your task requires higher quality
  guidance_scale: 4        # Follows the recommendation in the official Qwen-Image-Edit model card
  # Optimization
  seed: 42                   # Random seed
  learning_rate: 3.0e-4      # 3.0e-4 for LoRA, 1.0e-5 for full fine-tuning
  adam_weight_decay: 1.0e-4  # AdamW weight decay
  adam_betas: [0.9, 0.999]   # AdamW betas
  adam_epsilon: 1.0e-8       # AdamW epsilon
  # EMA
  ema_decay: 0.9           # EMA decay rate (0 to disable)
  ema_update_interval: 4   # EMA update interval (in epochs)

# Scheduler Configuration
scheduler:
  dynamics_type: "Flow-SDE"  # Options: Flow-SDE, Dance-SDE, CPS, ODE
  noise_level: 1.0           # Noise level for sampling
  num_train_steps: 1         # Number of noise steps
  train_steps: [1, 2, 3]     # Custom noise window; noise steps are randomly selected from this list during training
  seed: 42                   # Scheduler seed (for noise-step selection)

# Evaluation settings
eval:
  resolution: 512                   # Evaluation resolution
  condition_image_size: [512, 512]  # Max condition image resolution, int or [height, width]
  auto_resize: true                 # Auto-resize to fit the condition images' aspect ratio during inference
  guidance_scale: 4                 # Guidance scale for sampling
  num_inference_steps: 50           # 50 eval timesteps are recommended for better quality per the official Qwen-Image-Edit model card
  per_device_batch_size: 1          # Eval batch size
  seed: 42                          # Eval seed
  eval_freq: 20                     # Eval frequency in epochs (0 to disable)

# Reward Model Configuration
rewards:
  - name: "visual_consistency"
    reward_model: "flow_factory.rewards.my_reward.VisualConsistencyRewardModel"  # Import path to your custom reward model
    batch_size: 16  # Batch size for reward model inference
    device: "cuda"
    dtype: bfloat16

# Optional Evaluation Reward Models
# eval_rewards:
#   - name: "text_alignment"
#     reward_model: "CLIP"
#     batch_size: 16
#     dtype: bfloat16
#     device: "cuda"
```

If you run into any issues, feel free to post them here; I'm happy to help 🤗
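As a sketch of what the custom reward model referenced by `reward_model: "flow_factory.rewards.my_reward.VisualConsistencyRewardModel"` might look like: the class name and module path come from the config above, but the constructor arguments and the `__call__(images, ref_images)` signature are assumptions rather than Flow-Factory's actual interface, and the channel-mean "embedding" is a toy stand-in for a real feature extractor such as CLIP or DINO.

```python
import numpy as np

class VisualConsistencyRewardModel:
    """Illustrative reward model: scores each generated image by cosine
    similarity between its embedding and a reference embedding.
    The real Flow-Factory base class and method names may differ; this
    only shows the general shape of the reward-computation logic."""

    def __init__(self, device: str = "cpu", dtype: str = "float32"):
        self.device = device  # a real model would move its weights here
        self.dtype = dtype

    def _embed(self, image: np.ndarray) -> np.ndarray:
        # Stand-in for a feature extractor: channel-wise mean of an HxWxC image.
        return image.reshape(-1, image.shape[-1]).mean(axis=0)

    def __call__(self, images, ref_images):
        # One scalar reward per sample; higher means more consistent.
        rewards = []
        for img, ref in zip(images, ref_images):
            a, b = self._embed(img), self._embed(ref)
            cos = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))
            rewards.append(cos)
        return rewards
```

Registering it is then just a matter of pointing the `reward_model` field in the YAML at the import path.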
-
Thank you for the detailed reply. I'd like to quickly verify this framework's feasibility for qwen-edit-2509 GRPO LoRA at rank 64. For data I plan to start with the dataset/sharegpt4o_image_mini you provide, and for the reward simply use PickScore first. Do you think this setup can reproduce the rising reward-mean curve mentioned in your project?
-
I've already verified this combination; I'm pasting the experiment curves here:

There are also some evaluation examples; you can clearly see the images becoming more and more aligned with PickScore's aesthetics.

Pasting the config here as well:

```yaml
# Environment Configuration
launcher: "accelerate"   # Options: accelerate
config_file: config/deepspeed/deepspeed_zero2.yaml  # DeepSpeed ZeRO-2; switch to config/accelerate_configs/fsdp2.yaml to shard the model as well if GPU memory is tight
num_processes: 8         # Number of processes to launch (overrides config file)
main_process_port: 29500
mixed_precision: "bf16"  # Options: no, fp16, bf16
run_name: null           # Run name (auto: {model_type}_{finetune_type}_{timestamp})
project: "Flow-Factory"  # Project name for logging
logging_backend: "wandb" # Options: wandb, swanlab, none

# Data Configuration
data:
  dataset_dir: "dataset/sharegpt4o_image_mini"     # Path to dataset folder
  preprocessing_batch_size: 8                      # Batch size for preprocessing
  dataloader_num_workers: 16                       # Number of workers for DataLoader
  enable_preprocess: true                          # Enable dataset preprocessing
  force_reprocess: true                            # Force reprocessing of the dataset
  cache_dir: "~/jcy/.cache/flow_factory/datasets"  # Cache directory for preprocessed datasets
  max_dataset_size: 1000                           # Limit the maximum number of samples in the dataset

# Model Configuration
model:
  finetune_type: 'lora'      # Options: full, lora
  lora_rank: 64
  lora_alpha: 128
  target_modules: "default"  # Options: all, default, or a list of module names like ["to_k", "to_q", "to_v", "to_out.0"]
  model_name_or_path: "Qwen/Qwen-Image-Edit-2509"  # Qwen/Qwen-Image-Edit-2509 or Qwen/Qwen-Image-Edit-2511
  model_type: "qwen-image-edit-plus"
  resume_path: null             # Path to load a previous checkpoint/LoRA adapter
  resume_training_state: false  # Resume training state; only effective when resume_path is a directory with a full checkpoint

log:
  save_dir: "~/jcy/Flow-Factory"  # Directory to save model checkpoints and logs
  save_freq: 20                   # Save frequency in epochs (0 to disable)
  save_model_only: true           # Save only the model weights (not optimizer, scheduler, etc.)

# Training Configuration
train:
  # Training settings
  trainer_type: 'grpo'
  enable_gradient_checkpointing: false  # Enable gradient checkpointing to save memory at the cost of extra compute
  # Image settings
  resolution: 384                   # Can be int or [height, width]
  auto_resize: true                 # Auto-resize to fit the condition images' aspect ratio during inference
  condition_image_size: [512, 512]  # Max condition image resolution, int or [height, width]
  # Batch and sampling
  per_device_batch_size: 1  # Qwen-Image-Edit-Plus accepts a varying number of condition images, so batch_size always falls back to 1
  group_size: 16            # Group size for GRPO sampling
  global_std: false         # Use global std for advantage normalization
  unique_sample_num_per_epoch: 48  # Unique samples per epoch
  gradient_step_per_epoch: 2       # Gradient steps per epoch
  # Clipping
  clip_range: 1.0e-4   # PPO/GRPO clipping range
  adv_clip_range: 5.0  # Advantage clipping range
  max_grad_norm: 1.0   # Max gradient norm for clipping
  # KL divergence
  kl_type: 'v-based'   # Options: 'x-based', 'v-based'
  kl_beta: 0.04        # KL divergence beta
  ref_param_device: 'same_as_model'  # Options: cpu, same_as_model
  # Denoising process
  num_inference_steps: 10  # Number of timesteps
  guidance_scale: 4        # Guidance scale for sampling
  # Optimization
  seed: 42                   # Random seed
  learning_rate: 3.0e-4      # Initial learning rate
  adam_weight_decay: 1.0e-4  # AdamW weight decay
  adam_betas: [0.9, 0.999]   # AdamW betas
  adam_epsilon: 1.0e-8       # AdamW epsilon
  # EMA
  ema_decay: 0.9           # EMA decay rate (0 to disable)
  ema_update_interval: 4   # EMA update interval (in epochs)

# Scheduler Configuration
scheduler:
  dynamics_type: "Flow-SDE"  # Options: Flow-SDE, Dance-SDE, CPS, ODE
  noise_level: 1.0           # Noise level for sampling
  num_train_steps: 1         # Number of noise steps
  train_steps: [1, 2, 3]     # Custom noise window; noise steps are randomly selected from this list during training
  seed: 42                   # Scheduler seed (for noise-step selection)

# Evaluation settings
eval:
  resolution: 512                   # Evaluation resolution
  condition_image_size: [512, 512]  # Max condition image resolution, int or [height, width]
  auto_resize: true                 # Auto-resize to fit the condition images' aspect ratio during inference
  guidance_scale: 4                 # Guidance scale for sampling
  num_inference_steps: 40           # Number of eval timesteps
  per_device_batch_size: 1          # Eval batch size
  seed: 42                          # Eval seed
  eval_freq: 20                     # Eval frequency in epochs (0 to disable)

# Reward Model Configuration
rewards:
  - name: "pick_score"
    reward_model: "PickScore"
    batch_size: 16
    device: "cuda"
    dtype: bfloat16
```

At the time, to save time, I set the training resolution to 384 and eval to 512, slightly different from the current example in the repo. Because the resolution is low, you can see that the images generated at step 0 are somewhat distorted; this seems to be an inherent issue of Qwen-Image-Edit at low resolutions, but RL at low resolution can fix it, and the later training images look much better. This is also discussed in X-GenGroup/PaCo-RL#2. You could also try lowering the resolution to around 512 and running that first; it is much faster and lets you quickly verify whether the reward goes up.
-
Thank you so much for sharing this valuable experience. Regarding resolution: the qwen-edit-2509 model seems to overfit the default 1024x1024 resolution used in the official inference code; at inference time, 1024x1024 gives much better results than 512x512. I'm quite interested in the discussion you mentioned, namely that training at a low resolution such as 512 can greatly alleviate this overfitting (i.e. the model's behavior at non-1024 resolutions). Is your suggestion that the sampling and training resolutions should match, and that the inference resolution should be less than or equal to the training resolution?
-
It should be: during RL training, use a low resolution (e.g. 512) for sampling and optimization, then use a normal high resolution (e.g. 1024) at inference. For instance, the example I gave above trains at 384 and evaluates at 512; you can see performance at 384 improving over training, and the eval reward at 512 also rises, so I'm confident it improves at 1024 as well. The PaCo-RL paper identified this mechanism: the gains from low-resolution RL transfer to high resolutions too. It is a very simple and effective trick that significantly speeds up training. The one possible issue is that too low a resolution loses image detail and can hurt the reward model's scoring, so pick a moderately low resolution; there is definitely no need to train at 1024.
-
Got it, thanks. I'll try it quickly following your advice, and I hope we can keep exchanging notes.
-
I just saw that FLUX2-klein has been released, in 4B and 9B versions, with a unified inference pipeline and support for text-to-image as well as (multi-)image-to-image. Very exciting; I'll add support as soon as possible. For multi-image tasks we then won't need to train the two heavyweights, Qwen-Image-Edit-Plus and FLUX2-dev 😃
-
That's great news, and thank you for your contributions to this field; I also hope a small model can let me verify algorithms quickly. I previously surveyed models that support multi-reference-image training: ominigen2 actually has positional encodings designed specifically for multiple reference images, and it is only 7B (though its LoRA fine-tuning and reinforcement-learning ecosystem is less mature than qwen-edit-2509's). Other multi-image models, such as mosaic and dreamomniv2, only support this because the authors modified the positional encodings and retrained, which is cumbersome, and none of them open-source their training code, so they are hard to adapt. I chose qwen-edit-2509 because the base model performs better (I hope to match or beat closed-source models such as nano pro in some aspect), and because its training ecosystem, e.g. support for LoRA and flow-grpo, is more complete, letting me get set up quickly instead of sinking into an engineering quagmire. So your project is very meaningful to me.
-
On my side, with 8*140G, a run takes at most 18 hours, so I don't know how far it could go beyond that either. In practice, by 200-plus steps you can already see image quality degrading: the model starts overfitting PickScore's preferred style, and at that point you can basically stop manually. In the config file you can use two different reward models for training and evaluation, e.g. two different aesthetic models. At first both will probably rise; once the evaluation reward starts dropping, the checkpoint at the peak should be the best one.
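The early-stopping heuristic described above (keep the checkpoint where the held-out eval reward peaks, then stop once it starts dropping) can be sketched as follows. `eval_rewards` here is a hypothetical list of per-eval-epoch scores, not a Flow-Factory API.

```python
def best_checkpoint(eval_rewards, patience=3):
    """Return (best_epoch, best_reward), stopping the scan once the eval
    reward has failed to improve for `patience` consecutive evals,
    mimicking manual early stopping at the peak."""
    best_epoch, best_reward, since_best = 0, float("-inf"), 0
    for epoch, reward in enumerate(eval_rewards):
        if reward > best_reward:
            best_epoch, best_reward, since_best = epoch, reward, 0
        else:
            since_best += 1
            if since_best >= patience:
                break  # eval reward is past its peak; stop here
    return best_epoch, best_reward
```

With `save_freq` and `eval_freq` aligned as in the example configs, the returned epoch maps directly to a saved checkpoint.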
-
The level at which this framework encapsulates the models, and ...
-
OK, thanks for the suggestion.
-
I don't have much experience building large projects, but I think you could first gauge the workload by supporting only the most popular models that diffusers does not yet support, as a trial (while trying not to break the existing extensibility). From my own research experience, many things are hard to think through completely at the start.
-
Hello, while training the model to support more reference images, I found that with both training and test resolution at 512 and group size 16, I can train with at most 4 reference images; lowering group size to 8 allows up to 6. How much would reducing group size from 16 to 8 hurt performance? I'd appreciate your experience here. I want the model to keep good performance with more reference images, and I'm not sure whether a model trained with at most 4 reference images will generalize well to 7-8 at test time. So I face a trade-off between the number of reference images and group size. I can't really lower the resolution right now, since I worry a model trained at 384 would perform poorly when tested at 1024; the gap is a bit large.
-
I don't quite understand why (resolution + number of reference images) would conflict with group_size. Resolution and the number of reference images affect GPU memory usage, while group_size, however large, does not affect peak memory; a larger group_size just needs more sampling time and lowers training efficiency, and resolution plus reference-image count affect training efficiency too. So I guess the bottleneck you mean is actually the 72h training-time limit. If it is training time, don't worry too much: train for 72h, then start a new run that loads the previous checkpoint and continues training.
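The resume-and-continue workflow above maps onto the `resume_path` / `resume_training_state` fields already present in the example configs in this thread; the checkpoint path below is a placeholder, not a real artifact from this discussion:

```yaml
model:
  finetune_type: 'lora'
  resume_path: "~/Flow-Factory/checkpoints/epoch-200"  # placeholder: directory of the previous run's checkpoint
  resume_training_state: true  # also restore optimizer/scheduler state (requires a full checkpoint directory)
```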
-
I tried fsdp2.yaml and it indeed saves a lot of memory; I can now run with group size 16 and 6 reference images. Thanks for the suggestion. I've also run into another situation: I'm training the model to preserve both style and subject consistency given multiple reference images, but in my training data not every sample has a style reference, while every sample does have a subject reference. For the style reward, I currently plan to compute it only for training samples that include a style reference image, and exclude samples without one from that reward computation. I'm not sure whether this has hidden pitfalls, and I'd like to hear your advice.
-
That probably depends on how exactly you weight the rewards, because @Weistrass ... Besides, I have just shipped FLUX2-klein support. From a quick trial, training is very fast and uses very little memory: the 4B model takes under 24G, and the model quality looks very good to the eye. I'd guess that with 140G*8 for LoRA training, a full 10 reference images is no problem at all; you could even try full-parameter training. I did a quick verification:

The drop in the middle may be instability caused by a too-large learning rate or too much noise, but the overall rise is very fast. The parameters are all from the general configuration I recommend based on experience; specific tasks may still need task-specific tuning.
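One concrete pitfall with computing the style reward only for samples that have a style reference: if the missing component is filled with zero before group normalization, those samples get a systematically lower total reward than their group mates, purely because of the fill value. A toy sketch under an assumed GRPO-style per-group normalization (the reward values are hypothetical, and this is not Flow-Factory's actual aggregation code):

```python
def group_advantages(rewards):
    """GRPO-style advantage: normalize rewards within one sampling group."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5 if var > 0 else 1.0
    return [(r - mean) / std for r in rewards]

# Hypothetical per-sample rewards for one group of four rollouts:
# total reward = subject consistency + style consistency, where the style
# score is None for samples that have no style reference image.
subject = [0.8, 0.7, 0.9, 0.6]
style = [0.5, None, 0.4, None]

# Option A: zero-fill the missing style component. Samples without a style
# reference are dragged toward negative advantage purely by the fill value.
zero_fill = [s + (st if st is not None else 0.0) for s, st in zip(subject, style)]

# Option B: fill with the group's mean observed style reward, which keeps
# missing-style samples neutral with respect to the style component.
observed = [st for st in style if st is not None]
style_mean = sum(observed) / len(observed)
mean_fill = [s + (st if st is not None else style_mean) for s, st in zip(subject, style)]

adv_zero = group_advantages(zero_fill)
adv_mean = group_advantages(mean_fill)
```

With zero-fill, the gap between samples with and without a style reference is inflated before normalization, so the no-style samples are penalized for missing data rather than for bad outputs; mean-filling (or normalizing each reward component over only the samples where it is defined) avoids that bias.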
-
Got it, thanks; I'll try flux.2 klein. And the question you raised about samples excluded from reward computation, i.e. whether to fill with the mean or with zero, is indeed a problem I hadn't considered. So I've now made all training data contain both a subject and a style reference, with everything participating in reward computation, which avoids that bias. But now during training, I first load the LoRA weights of an SFT version I trained for 4 epochs on qwen-edit-2511, and continue GRPO LoRA training on top of it. Strangely, the initial eval results are poor, even though the model performed decently after 4 epochs of fine-tuning; I only wanted GRPO to squeeze out further gains, yet it starts out performing badly, as if it isn't learning on top of the SFT version at all. My SFT LoRA used the code at https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/qwen_image/model_training/lora/Qwen-Image-Edit-2511.sh, where LoRA only requires setting lora_rank and does not expose the lora_alpha parameter you use in the YAML, so I'm not sure whether that is the cause.
I hope you can shed some light on this. My motivation was to see whether SFT cold-start followed by GRPO works better, but so far it doesn't.
-
In principle, SFT cold-start followed by RL is the standard recipe, and it should work better. Did you run an eval after the 4 SFT epochs? Does it match the eval result at RL step 0? If it matches, it is probably not a loading problem, and the SFT may simply not have trained well. If it does not match, something in model loading may be wrong. The LoRA loading here follows
Flow-Factory/src/flow_factory/models/abc.py, lines 1056 to 1082 at 59eb12a
Flow-Factory/src/flow_factory/models/abc.py, lines 1349 to 1356 at 59eb12a
Regarding ... DiffSynth-Studio uses ... Also, for eval you need to look at ...
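Since `lora_alpha` came up: in PEFT, the learned LoRA update is applied scaled by `alpha / rank`, so a training script that only exposes `lora_rank` is implicitly fixing some default alpha, and loading the checkpoint under a different alpha rescales every learned delta. The "alpha equals rank" default below is a hypothetical convention for illustration, not a verified DiffSynth-Studio setting; the arithmetic itself is standard LoRA.

```python
# LoRA applies its learned update scaled by alpha / rank:
#   W_effective = W_base + (alpha / rank) * (B @ A)
# so a checkpoint trained under one alpha but loaded under another has
# every learned delta rescaled by the ratio of the two scale factors.

def lora_scale(alpha: float, rank: int) -> float:
    return alpha / rank

rank = 64
scale_alpha_eq_rank = lora_scale(alpha=64, rank=rank)   # hypothetical SFT default (alpha = rank)
scale_example_config = lora_scale(alpha=128, rank=rank) # lora_alpha: 128 from the YAML above
mismatch = scale_example_config / scale_alpha_eq_rank   # learned update would be doubled
```

If the two frameworks disagree on alpha, the adapter is effectively applied at the wrong strength even when every weight key matches.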
-
This actually indicates that the weights did not match up when the LoRA was loaded. The base_model.model prefix here is ...
-
@Weistrass I did a rough analysis: the format in which DiffSynth-Studio saves LoRA weights differs from the PEFT format. DiffSynth keys look like transformer.<layer>.lora_A/B.<adapter name>.weight, while the corresponding PEFT keys look like base_model.model.<layer>.lora_A/B.weight. Below is a script that converts between the two formats; I found a few examples and tested it myself, so feel free to use it as a reference. Overall the PEFT format preserves more information, and I will keep using it in Flow-Factory. I still need to study the DiffSynth-Studio and diffusers source code further to make their saved checkpoints compatible as well. You can try the script below:

```python
import os
import json
import re

import torch
from safetensors.torch import save_file, load_file


def peft_to_diffusers(peft_model_path, output_file, prefix="transformer"):
    """
    Convert PEFT format to Diffusers format.
    prefix: FLUX models usually use 'transformer'; SDXL uses 'unet' or 'text_encoder'.
    """
    peft_state_dict = load_file(os.path.join(peft_model_path, "adapter_model.safetensors"))
    new_state_dict = {}
    for k, v in peft_state_dict.items():
        # 1. Strip PEFT's fixed prefix base_model.model.
        new_key = k.replace("base_model.model.", "")
        # 2. Add the component prefix the Diffusers loader expects,
        #    unless the key already has it.
        if prefix and not new_key.startswith(f"{prefix}."):
            new_key = f"{prefix}.{new_key}"
        new_state_dict[new_key] = v
    save_file(new_state_dict, output_file)
    print(f"Successfully converted PEFT to Diffusers with prefix '{prefix}': {output_file}")


def diffusers_to_peft(diffusers_file, output_dir, prefix="transformer", target_modules=None, r=None):
    """
    Convert a single Diffusers file into a PEFT folder.
    """
    os.makedirs(output_dir, exist_ok=True)
    diffusers_state_dict = load_file(diffusers_file)
    peft_state_dict = {}
    detected_r = 0
    for k, v in diffusers_state_dict.items():
        # 1. Strip the Diffusers component prefix (e.g. 'transformer.')
        new_key = k
        if prefix and k.startswith(f"{prefix}."):
            new_key = k.replace(f"{prefix}.", "")
        # 2. Add PEFT's fixed prefix
        peft_key = f"base_model.model.{new_key}"
        peft_state_dict[peft_key] = v
        # 3. Auto-detect the rank (inferred from the shape of lora_A)
        if r is None and "lora_A.weight" in k:
            detected_r = v.shape[0]
    final_r = r if r is not None else (detected_r if detected_r > 0 else 64)
    save_file(peft_state_dict, os.path.join(output_dir, "adapter_model.safetensors"))
    # Write the config
    config = {
        "peft_type": "LORA",
        "r": final_r,
        "lora_alpha": final_r * 2,  # a common alpha choice
        "target_modules": target_modules or [],
        "lora_dropout": 0.0,
        "bias": "none",
        "inference_mode": True,
        "base_model_name_or_path": None
    }
    with open(os.path.join(output_dir, "adapter_config.json"), "w") as f:
        json.dump(config, f, indent=4)
    print(f"Successfully converted Diffusers to PEFT (Rank: {final_r})")


def diffusers_to_peft_auto(diffusers_file, output_dir, prefix=None):
    os.makedirs(output_dir, exist_ok=True)
    diffusers_state_dict = load_file(diffusers_file)
    first_key = list(diffusers_state_dict.keys())[0]
    # Try to infer the prefix automatically
    if prefix is None:
        if first_key.startswith("transformer."):
            prefix = "transformer"
        elif first_key.startswith("unet."):
            prefix = "unet"
        else:
            prefix = ""
    peft_state_dict = {}
    full_module_paths = set()  # store full module paths
    detected_r = None
    # Matches keys like transformer.xxx.lora_A.weight
    pattern = re.compile(r"^(?:" + prefix + r"\.)?(.*)\.lora_[AB](?:\.[^.]+)?\.weight$")
    for k, v in diffusers_state_dict.items():
        match = pattern.match(k)
        if match:
            module_full_path = match.group(1)  # e.g. single_blocks.0.attn.to_out.0
            full_module_paths.add(module_full_path)
            if ".lora_A" in k and detected_r is None:
                detected_r = v.shape[0]
        # Convert to the PEFT key name
        new_key = k.replace(f"{prefix}.", "") if prefix and k.startswith(f"{prefix}.") else k
        if ".lora_A.default.weight" in new_key:
            new_key = new_key.replace(".lora_A.default.weight", ".lora_A.weight")
        elif ".lora_B.default.weight" in new_key:
            new_key = new_key.replace(".lora_B.default.weight", ".lora_B.weight")
        peft_state_dict[f"base_model.model.{new_key}"] = v
    save_file(peft_state_dict, os.path.join(output_dir, "adapter_model.safetensors"))
    rank = detected_r if detected_r else 64
    config = {
        "peft_type": "LORA",
        "r": rank,
        "lora_alpha": rank,  # or rank * 2
        # Key point: use full paths to avoid fuzzy matches on container modules
        "target_modules": sorted(list(full_module_paths)),
        "lora_dropout": 0.0,
        "bias": "none",
        "inference_mode": True,
        "base_model_name_or_path": None,
        "init_lora_weights": True
    }
    with open(os.path.join(output_dir, "adapter_config.json"), "w") as f:
        json.dump(config, f, indent=4)
    print("✅ Conversion succeeded!")
    print(f"Matched {len(config['target_modules'])} linear-layer paths exactly")
    return config


def test_p2d(peft_lora_path, diffusers_output):
    # PEFT -> Diffusers
    peft_to_diffusers(peft_lora_path, diffusers_output)
    pipeline.load_lora_weights(diffusers_output)
    print([k for k in pipeline.transformer.state_dict().keys() if 'lora' in k][:10])


def test_d2p(pipeline, diffusers_output, peft_output):
    # # Diffusers -> PEFT with manually supplied target_modules
    # # (lora_rank is auto-detected)
    # target_modules = [......]
    # diffusers_to_peft(
    #     diffusers_file=diffusers_output,
    #     output_dir=peft_output,
    #     target_modules=target_modules,
    # )
    # Auto-detect target modules & LoRA rank
    diffusers_to_peft_auto(
        diffusers_file=diffusers_output,
        output_dir=peft_output,
    )
    # Then load with PEFT
    from peft import PeftModel
    transformer = PeftModel.from_pretrained(pipeline.transformer, peft_output)
    print([k for k in transformer.state_dict().keys() if 'lora' in k][:10])


from diffusers import Flux2KleinPipeline

model = 'black-forest-labs/FLUX.2-klein-base-4B'
pipeline = Flux2KleinPipeline.from_pretrained(model, torch_dtype=torch.bfloat16)
peft_lora_path = 'flux2/checkpoint-0/'
diffusers_output = 'flux2/diff_checkpoint.safetensors'
peft_output = 'flux2/temp_peft_checkpoint'  # Should be the same as peft_lora_path

def test1():
    test_p2d(peft_lora_path=peft_lora_path, diffusers_output=diffusers_output)

def test2():
    test_d2p(pipeline=pipeline, diffusers_output=diffusers_output, peft_output=peft_output)


from diffusers import QwenImageEditPlusPipeline

model = 'Qwen/Qwen-Image-Edit-2509'
pipeline = QwenImageEditPlusPipeline.from_pretrained(model, torch_dtype=torch.bfloat16)
diffusers_output = 'qwen/epoch-4.safetensors'
peft_output = 'qwen/temp_peft'
diffusers_output_2 = 'qwen/epoch-4_new.safetensors'
diff_weight = load_file(diffusers_output)
# print(list(diff_weight.keys())[:10])

def test3():
    test_d2p(pipeline=pipeline, diffusers_output=diffusers_output, peft_output=peft_output)

def test4():
    test_p2d(peft_lora_path=peft_lora_path, diffusers_output=diffusers_output_2)
```
-
Thank you very much for the suggestions; I'll try this right away. Wishing you all the best!
-
The conversion code you provided works; thank you very much. After converting the diffusers-format .safetensors to the PEFT version, the model loaded before GRPO training retains the performance from SFT.
-
I wrote my code against diffusers' QwenImageEditPlusPipeline, which does not expose that parameter; I checked, and DiffSynth-Studio does have it. It seems to control consistency of some kind and does some special handling of the timesteps. You can look at DiffSynth-Studio's inference code for reference and modify this part ...
-
OK, thanks. However, I recently found that even after successfully converting the SFT LoRA weights trained with DiffSynth to the PEFT version using your conversion code, the results still differ when loaded in flow-factory (there is no warning during weight merging, and flow-factory reports the LoRA weights as successfully loaded). This may be because you use diffusers' default inference pipeline, while DiffSynth builds its own: https://github.com/modelscope/DiffSynth-Studio/blob/main/diffsynth/pipelines/qwen_image.py. So even though weights trained in DiffSynth convert and load fine, the differing inference pipelines may cause the outputs to diverge.
-
The latest code supports loading a single safetensors file; you can try directly loading the post-SFT safetensors and specifying ...
That is a slight inference difference between the diffusers pipeline and DiffSynth-Studio, for example the new parameter introduced by qwen-image-edit-2511 that you ran into.
If the weights are loaded correctly, the results should not differ much. Could you post a comparison with the same seed and the same inputs here, if the data is shareable?
If the base model already has some of the needed capability, you can go straight to RL; SFT's main role is to unlock capabilities the base model lacks. As for flow-factory's inference code, I'll upload a version for everyone's reference.
-
OK, thanks for the advice. I'll update my code to load the single post-SFT safetensors file directly, and compare results with the same seed and the same inputs (per your comment, the two frameworks' inference shouldn't differ much). If problems remain, I'll share the test comparisons with you. Thanks for the quick replies! 🥰
-
Hello, I tested on my side and some results still differ. I sent you an email via Google; the document contains the corresponding cases for reference. If needed, I can also send you the inference code I wrote under flow-factory. I'd like to know whether this kind of difference is normal 🤔. Thanks!
-
Hi, thanks for sharing this great repository.
I noticed that your repo supports GRPO training for Qwen-Image-Edit-2509, and that it allows choosing between LoRA and full-parameter training modes. I have a few questions regarding training setup and customization:
Multi-reference image training
(1) I would like to train the model to generate a target image conditioned on 4–5 reference images simultaneously. In this case, do you think a configuration like 8 × 140G is sufficient for either LoRA or full training?
(2) Are there any practical differences in feasibility or stability between LoRA and full training for this multi-reference setting?
Configuration considerations
For the above setup, which configuration aspects should I pay special attention to? For example:
(1) Image resolution / sequence length
(2) GRPO-specific hyperparameters (e.g., rollout length, reward normalization)
(3) Any model- or data-related constraints specific to Qwen-Image-Edit-2509 or 2511
Adding custom reward functions:
(1) If I want to add custom reward functions (e.g., for reference consistency or visual alignment), which part of the codebase should I modify or extend?
(2) Is there a recommended interface or example for registering new reward functions in the GRPO pipeline?
Thanks a lot for your time and for open-sourcing this work. Any guidance would be greatly appreciated.