Since I only have a single RTX 3090 GPU, I lowered the learning rate for Stage 1 (the original learning rate caused NaN losses):
```python
optimizer = dict(
    type="AdamW",
    lr=5e-5,
    weight_decay=0.001,
    paramwise_cfg=dict(
        custom_keys={
            "img_backbone": dict(lr_mult=0.8),
        }
    ),
)
optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
lr_config = dict(
    policy="CosineAnnealing",
    warmup="linear",
    warmup_iters=500,
    warmup_ratio=1.0 / 3,
    min_lr_ratio=1e-4,
)
```
Configuration for Stage 2:
```python
optimizer = dict(
    type="AdamW",
    # lr=3e-4,
    lr=3e-4 / 2,
    weight_decay=0.001,
    paramwise_cfg=dict(
        custom_keys={
            "img_backbone": dict(lr_mult=0.1),
        }
    ),
)
optimizer_config = dict(grad_clip=dict(max_norm=25, norm_type=2))
lr_config = dict(
    policy="CosineAnnealing",
    warmup="linear",
    warmup_iters=500,
    warmup_ratio=1.0 / 3,
    min_lr_ratio=1e-3,
)
```
The results I reproduced differ significantly from the reported ones.
