Abstract
Semantic segmentation is a fundamental task in remote sensing image processing. It provides pixel-level classification, which is important for many applications, such as building extraction and land use mapping. The development of convolutional neural network has considerably improved the performance of semantic segmentation. Most semantic segmentation networks are the encoder-decoder structure. Bilinear interpolation is an ordinary upsampling method in the decoder, but bilinear interpolation only considers its own features and inserts three times its own features. This over-simple and data-independent bilinear upsampling may lead to suboptimal results. In this work, we propose an upsampling method based on local relations to replace bilinear interpolation. Upsampling is performed by correlating the local relationship of feature maps of adjacent stages, which can better integrate local and global information. We also design a fusion module based on local similarity. Our proposed method with ResNet101 as the backbone of the segmentation network can improve the average F-{1} score and overall accuracy of the Vaihingen data set by 2.69% and 1.31%, respectively. Our proposed method also has fewer parameters and less inference time.
| Original language | English |
|---|---|
| Journal | IEEE Geoscience and Remote Sensing Letters |
| Volume | 19 |
| DOIs | |
| State | Published - 2022 |
Keywords
- Decoder
- local relationship upsampling
- remote sensing
- semantic segmentation