Res2Net for object detection and instance segmentation
Introduction
We propose a novel building block for CNNs, namely Res2Net, by constructing hierarchical residual-like connections within a single residual block. Res2Net represents multi-scale features at a granular level and increases the range of receptive fields for each network layer.
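As a rough illustration of how the hierarchical connections widen the range of receptive fields, the sketch below computes the receptive field of each feature split inside one block. It is a simplified model, not the actual implementation. It assumes the layout described in the Res2Net paper: the block's features are divided into `scale` splits, the first split passes through unchanged, and every later split applies one 3x3 convolution (stride 1, no dilation) to its own input plus the previous split's output. The function name `res2net_receptive_fields` is hypothetical.

```python
from typing import List


def res2net_receptive_fields(scale: int = 4, kernel_size: int = 3) -> List[int]:
    """Receptive field (along one spatial axis) of each split's output in one block.

    Assumptions (illustrative only): stride 1, no dilation, and the first
    split is passed through without a convolution.
    """
    if scale < 1:
        raise ValueError("scale must be at least 1")
    # The first split is an identity mapping, so it sees a single position.
    fields = [1]
    for _ in range(1, scale):
        # Each later split convolves (its input + previous split's output),
        # so its receptive field grows by kernel_size - 1 over the previous one.
        fields.append(fields[-1] + (kernel_size - 1))
    return fields


print(res2net_receptive_fields(4))  # [1, 3, 5, 7]
```

Under these assumptions a single block with four splits mixes features seen through 1x1, 3x3, 5x5 and 7x7 windows, which is the sense in which the block is multi-scale at a granular level.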
| Backbone | Params. | GFLOPs | top-1 err. | top-5 err. |
|---|---|---|---|---|
| ResNet-101 | 44.6M | 7.8 | 22.63 | 6.44 |
| ResNeXt-101-64x4d | 83.5M | 15.5 | 20.40 | - |
| HRNetV2p-W48 | 77.5M | 16.1 | 20.70 | 5.50 |
| Res2Net-101 | 45.2M | 8.3 | 18.77 | 4.64 |
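Compared with ResNeXt-101-64x4d and HRNetV2p-W48, Res2Net-101 requires far fewer parameters and GFLOPs while reaching a lower top-1 error. Relative to ResNet-101, it adds 0.6M parameters and 0.5 GFLOPs but lowers the top-1 error from 22.63 to 18.77.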
Note:
- GFLOPs for classification are calculated with image size (224x224).
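The comparison can be checked with a few lines of arithmetic. The sketch below copies the parameter counts, GFLOPs and top-1 errors from the table above and reports how Res2Net-101 differs from each other backbone; the names `BACKBONES` and `diff_vs_res2net` are hypothetical helpers introduced only for this illustration.

```python
from typing import Tuple

# (params in millions, GFLOPs at 224x224, top-1 error), copied from the table.
BACKBONES = {
    "ResNet-101": (44.6, 7.8, 22.63),
    "ResNeXt-101-64x4d": (83.5, 15.5, 20.40),
    "HRNetV2p-W48": (77.5, 16.1, 20.70),
    "Res2Net-101": (45.2, 8.3, 18.77),
}


def diff_vs_res2net(name: str) -> Tuple[float, float, float]:
    """Return Res2Net-101 minus `name` for (params, GFLOPs, top-1 error).

    Negative values mean Res2Net-101 is smaller, cheaper, or more accurate.
    """
    ref = BACKBONES["Res2Net-101"]
    other = BACKBONES[name]
    # Round to two decimals to hide floating-point noise in the subtraction.
    return tuple(round(r - o, 2) for r, o in zip(ref, other))


for backbone in BACKBONES:
    if backbone != "Res2Net-101":
        d_params, d_gflops, d_err = diff_vs_res2net(backbone)
        print(f"{backbone}: {d_params:+.1f}M params, "
              f"{d_gflops:+.1f} GFLOPs, {d_err:+.2f} top-1 err.")
```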
@article{gao2019res2net,
title={Res2Net: A New Multi-scale Backbone Architecture},
author={Gao, Shang-Hua and Cheng, Ming-Ming and Zhao, Kai and Zhang, Xin-Yu and Yang, Ming-Hsuan and Torr, Philip},
journal={IEEE TPAMI},
year={2020},
doi={10.1109/TPAMI.2019.2938758},
}
Results and Models
Faster R-CNN
| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Config | Download |
|---|---|---|---|---|---|---|---|
| R2-101-FPN | pytorch | 2x | 7.4 | - | 43.0 | config | model \| log |
Mask R-CNN
| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Config | Download |
|---|---|---|---|---|---|---|---|---|
| R2-101-FPN | pytorch | 2x | 7.9 | - | 43.6 | 38.7 | config | model \| log |
Cascade R-CNN
| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | Config | Download |
|---|---|---|---|---|---|---|---|
| R2-101-FPN | pytorch | 20e | 7.8 | - | 45.7 | config | model \| log |
Cascade Mask R-CNN
| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Config | Download |
|---|---|---|---|---|---|---|---|---|
| R2-101-FPN | pytorch | 20e | 9.5 | - | 46.4 | 40.0 | config | model \| log |
Hybrid Task Cascade (HTC)
| Backbone | Style | Lr schd | Mem (GB) | Inf time (fps) | box AP | mask AP | Config | Download |
|---|---|---|---|---|---|---|---|---|
| R2-101-FPN | pytorch | 20e | - | - | 47.5 | 41.6 | config | model \| log |