About low training speed with fewer GPUs. #6

@wenhe-jia

I am training Grid R-CNN with grid_rcnn_r50_fpn_2x.py on 4 GPUs, with lr=0.01 and warmup_iters=14660 (2 epochs). Training is going well so far, except that it is very slow: about 2.2 s/iter.

2019-08-12 23:29:06,094 - INFO - Epoch [2][4350/14659]  lr: 0.01000, eta: 8 days, 9:38:37, time: 2.414, data_time: 0.129, memory: 4836, loss_rpn_cls: 0.0598, loss_rpn_reg: 0.0411, loss_cls: 0.3345, acc: 89.5898, loss_grid: 0.6875, loss: 1.1229
2019-08-12 23:31:04,507 - INFO - Epoch [2][4400/14659]  lr: 0.01000, eta: 8 days, 9:41:07, time: 2.369, data_time: 0.134, memory: 4836, loss_rpn_cls: 0.0570, loss_rpn_reg: 0.0407, loss_cls: 0.3127, acc: 90.7246, loss_grid: 0.6755, loss: 1.0860
2019-08-12 23:32:52,252 - INFO - Epoch [2][4450/14659]  lr: 0.01000, eta: 8 days, 9:40:22, time: 2.155, data_time: 0.120, memory: 4836, loss_rpn_cls: 0.0632, loss_rpn_reg: 0.0409, loss_cls: 0.3332, acc: 90.3809, loss_grid: 0.6859, loss: 1.1231
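
To see whether data loading accounts for the slow iterations, the `time` and `data_time` fields in the log lines above can be averaged. The following is a minimal sketch that assumes the log format shown here; `PATTERN` and `summarize` are hypothetical helper names for illustration, not part of any library. On these three lines, data loading comes out at roughly 5 to 6 percent of the iteration time, which suggests most of the time is spent elsewhere in the iteration.

```python
import re

# The three log lines quoted in this issue, copied verbatim.
LOG = """\
2019-08-12 23:29:06,094 - INFO - Epoch [2][4350/14659]  lr: 0.01000, eta: 8 days, 9:38:37, time: 2.414, data_time: 0.129, memory: 4836, loss_rpn_cls: 0.0598, loss_rpn_reg: 0.0411, loss_cls: 0.3345, acc: 89.5898, loss_grid: 0.6875, loss: 1.1229
2019-08-12 23:31:04,507 - INFO - Epoch [2][4400/14659]  lr: 0.01000, eta: 8 days, 9:41:07, time: 2.369, data_time: 0.134, memory: 4836, loss_rpn_cls: 0.0570, loss_rpn_reg: 0.0407, loss_cls: 0.3127, acc: 90.7246, loss_grid: 0.6755, loss: 1.0860
2019-08-12 23:32:52,252 - INFO - Epoch [2][4450/14659]  lr: 0.01000, eta: 8 days, 9:40:22, time: 2.155, data_time: 0.120, memory: 4836, loss_rpn_cls: 0.0632, loss_rpn_reg: 0.0409, loss_cls: 0.3332, acc: 90.3809, loss_grid: 0.6859, loss: 1.1231
"""

# Hypothetical pattern for this log format: captures seconds per iteration
# and the data-loading share of it. The lookbehind keeps "time:" from
# matching inside "data_time:".
PATTERN = re.compile(r"(?<![a-z_])time: ([\d.]+), data_time: ([\d.]+)")


def summarize(log_text):
    """Return mean iteration time, mean data time, and their ratio."""
    pairs = [(float(t), float(d)) for t, d in PATTERN.findall(log_text)]
    if not pairs:
        raise ValueError("no 'time' / 'data_time' fields found in log text")
    total_time = sum(t for t, _ in pairs)
    total_data = sum(d for _, d in pairs)
    return {
        "mean_time": total_time / len(pairs),
        "mean_data_time": total_data / len(pairs),
        "data_fraction": total_data / total_time,
    }


stats = summarize(LOG)
print(
    f"mean time/iter: {stats['mean_time']:.3f}s, "
    f"mean data_time: {stats['mean_data_time']:.3f}s, "
    f"data share: {stats['data_fraction']:.1%}"
)
```

The same function can be run over a longer stretch of the training log to check whether the ratio holds beyond these three samples.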

My virtual environment is based on Anaconda 4.7. Other system settings are as follows:
Ubuntu 14.04
PyTorch 1.1
Python 3.7
CUDA 10.0
GCC 5.4
Any help would be appreciated, thanks.
