I looked at the code and found that model_max_length is set to 32768 for training but 4096 for testing. Does this mismatch in model_max_length have a significant impact on the results? Are there any ablation experiments on it?
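To make the concern concrete, here is a minimal sketch of what I think could happen. It is not the repository's code: the helper truncate_to_max_length and the 10,000-token sample are hypothetical, and it assumes that sequences longer than model_max_length are simply cut off at the end.

```python
# Hypothetical illustration, not taken from the repository.
# Assumption: inputs longer than model_max_length are truncated on the right.

TRAIN_MAX_LENGTH = 32768  # value I found in the training setup
TEST_MAX_LENGTH = 4096    # value I found in the testing setup


def truncate_to_max_length(token_ids, model_max_length):
    """Keep at most model_max_length tokens, dropping the tail."""
    return token_ids[:model_max_length]


# A made-up sample of 10,000 tokens (e.g. text plus many visual tokens).
sample = list(range(10000))

kept_train = truncate_to_max_length(sample, TRAIN_MAX_LENGTH)
kept_test = truncate_to_max_length(sample, TEST_MAX_LENGTH)

print(len(kept_train), len(kept_test))
```

If truncation really works this way, a sample like this would be seen in full during training but lose more than half of its tokens at test time, which is why I am asking about the effect.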
Due to insufficient GPU memory, I need to train with LoRA. So far I have only set mm_tunable_parts to "mm_lora_layer". Could you give me some advice on whether this is enough or what else should be configured?