FSAF is an anchor-free method published at CVPR 2019 (https://arxiv.org/pdf/1903.00621.pdf).
It is equivalent to an anchor-based method that places exactly one anchor at each feature map position in each FPN level,
and this is how it is implemented here.
Only the anchor-free branch is released, because it is more compatible with the current framework and has a lower computational cost.
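
The one-anchor-per-position view can be illustrated with a minimal sketch. The function name `single_anchor_grid`, the `anchor_scale` parameter, the strides, and the image size are assumptions chosen for illustration; they are not taken from the released config.

```python
import numpy as np


def single_anchor_grid(feat_h, feat_w, stride, anchor_scale=4.0):
    """Build one square anchor per feature map position.

    Returns an array of shape (feat_h * feat_w, 4) in (x1, y1, x2, y2)
    image coordinates. `anchor_scale` is a hypothetical knob: the anchor
    side length is stride * anchor_scale.
    """
    half = stride * anchor_scale / 2.0
    # Centers of the feature map cells, mapped back to image coordinates.
    xs = (np.arange(feat_w) + 0.5) * stride
    ys = (np.arange(feat_h) + 0.5) * stride
    cx, cy = np.meshgrid(xs, ys)
    cx, cy = cx.ravel(), cy.ravel()
    return np.stack([cx - half, cy - half, cx + half, cy + half], axis=1)


# Assumed FPN strides and input size, for illustration only.
strides = [8, 16, 32, 64, 128]
image_h, image_w = 256, 256
anchors_per_level = [
    single_anchor_grid(image_h // s, image_w // s, s) for s in strides
]
```

Each level ends up with as many anchors as it has feature map positions, so the classification and regression outputs line up one-to-one with locations, as in an anchor-free head.
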
In the original paper, feature map locations within the central 0.2-0.5 area of a ground-truth (gt) box are tagged as ignored. However,
we found empirically that a hard threshold (0.2-0.2) further improves performance (see the table below).
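
To make the two ignore-range settings concrete, here is a minimal sketch for a single gt box on a single feature level. It assumes the two numbers are scale factors applied to the box width and height around its center; the names `shrink_box`, `label_locations`, `pos_scale`, and `neg_scale` are hypothetical and used for illustration only. It does not handle overlapping boxes or level selection.

```python
import numpy as np


def shrink_box(box, scale):
    """Shrink an (x1, y1, x2, y2) box around its center by `scale`."""
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * scale / 2.0
    half_h = (y2 - y1) * scale / 2.0
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def label_locations(gt_box, feat_h, feat_w, stride, pos_scale=0.2, neg_scale=0.5):
    """Label each location: 1 = positive, -1 = ignored, 0 = negative."""
    xs = (np.arange(feat_w) + 0.5) * stride
    ys = (np.arange(feat_h) + 0.5) * stride
    cx, cy = np.meshgrid(xs, ys)

    def inside(b):
        return (cx >= b[0]) & (cx <= b[2]) & (cy >= b[1]) & (cy <= b[3])

    labels = np.zeros((feat_h, feat_w), dtype=np.int64)
    # Mark the larger region as ignored first, then overwrite its core as positive.
    labels[inside(shrink_box(gt_box, neg_scale))] = -1
    labels[inside(shrink_box(gt_box, pos_scale))] = 1
    return labels


gt_box = (0.0, 0.0, 160.0, 160.0)
soft = label_locations(gt_box, 20, 20, stride=8, pos_scale=0.2, neg_scale=0.5)
hard = label_locations(gt_box, 20, 20, stride=8, pos_scale=0.2, neg_scale=0.2)
```

With the 0.2-0.2 setting the ignored ring disappears, so every location outside the central 0.2 region is treated as a negative.
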
Backbone | ignore range | ms-train | Lr schd | Train Mem (GB) | Train time (s/iter) | Inf time (fps) | box AP | Config | Download |
---|---|---|---|---|---|---|---|---|---|
R-50 | 0.2-0.5 | N | 1x | 3.15 | 0.43 | 12.3 | 36.0 (35.9) | | model, log |
R-50 | 0.2-0.2 | N | 1x | 3.15 | 0.43 | 13.0 | 37.4 | config | model, log |
R-101 | 0.2-0.2 | N | 1x | 5.08 | 0.58 | 10.8 | 39.3 (37.9) | config | model, log |
X-101 | 0.2-0.2 | N | 1x | 9.38 | 1.23 | 5.6 | 42.4 (41.0) | config | model, log |
Notes:
BibTeX reference is as follows.
@inproceedings{zhu2019feature,
title={Feature Selective Anchor-Free Module for Single-Shot Object Detection},
author={Zhu, Chenchen and He, Yihui and Savvides, Marios},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={840--849},
year={2019}
}