TY - JOUR
T1 - Adaptive Effective Receptive Field Convolution for Semantic Segmentation of VHR Remote Sensing Images
AU - Chen, Xi
AU - Li, Zhiqiang
AU - Jiang, Jie
AU - Han, Zhen
AU - Deng, Shiyi
AU - Li, Zhihong
AU - Fang, Tao
AU - Huo, Hong
AU - Li, Qingli
AU - Liu, Min
N1 - Publisher Copyright: © 1980-2012 IEEE.
PY - 2021/4
Y1 - 2021/4
AB - Convolutional neural networks (CNNs) have facilitated impressive improvements in the semantic segmentation of very high-resolution (VHR) remote sensing images. The success of semantic segmentation depends on an effective receptive field (RF) large enough to cover the entire object. Popular methods to enlarge the effective RF include dilated filters, subsampling operations, and stacking layers. Unfortunately, these methods are either inefficient or can cause grid artifacts. Moreover, although object sizes vary greatly in remote sensing images, the RF size of these methods cannot strike a compromise between small and large objects. To tackle these problems, we propose adaptive effective receptive field convolution (AERFC) for VHR remote sensing images. AERFC adaptively controls the sampling locations of the convolution and automatically adjusts the effective RF without significantly increasing the number of parameters or the computational cost. Thus, AERFC reduces training difficulty, decreases the risk of overfitting, and preserves details in VHR images. AERFC is also integrated with spatial pyramid pooling (SPP) to aggregate diverse multiscale features for exploring contextual information. Quantitative and qualitative evaluations on four benchmark data sets show that AERFC outperforms state-of-the-art methods.
KW - Field of view
KW - filter
KW - kernel
KW - semantic contextual information
UR - https://www.scopus.com/pages/publications/85103336121
U2 - 10.1109/TGRS.2020.3009143
DO - 10.1109/TGRS.2020.3009143
M3 - Article
AN - SCOPUS:85103336121
SN - 0196-2892
VL - 59
SP - 3532
EP - 3546
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
IS - 4
M1 - 9147012
ER -