TY - JOUR
T1 - Paying attention for adjacent areas
T2 - Learning discriminative features for large-scale 3D scene segmentation
AU - Li, Mengtian
AU - Xie, Yuan
AU - Ma, Lizhuang
N1 - Publisher Copyright: © 2022
PY - 2022/9
Y1 - 2022/9
N2 - Despite recent improvements in analyzing large-scale 3D point clouds, several problems still exist: (a) segmentation models suffer from intra-class inconsistency and inter-class indistinction; (b) the existing methods ignore the inherent long-tailed class distribution of real-world 3D data. These problems result in unsatisfactory semantic segmentation predictions, especially in object adjacent areas. To handle these problems, this paper proposes a novel Adjacent areas Refinement Network (ARNet). Specifically, an Adjacent areas Refinement (AR) module is designed, which consists of two parallel attention blocks. Besides, our proposed attention blocks can process a large number of points (N∼10^5) with a slight increase in the computational complexity and time cost. Additionally, to deal with the inherent long-tailed class distribution in real-world 3D data, imbalance adjustment loss and occupancy regression loss are introduced. Based on this, the proposed network can handle the classification of both majority and minority classes, which is essential in distinguishing the ambiguous parts in large-scale 3D scenes. The proposed AR module and the loss functions can be easily integrated into the cutting-edge backbone networks, contributing to better performance in modeling semantic inter-dependencies and significantly improving the accuracy of the state-of-the-art semantic segmentation methods on indoor and outdoor scenes.
AB - Despite recent improvements in analyzing large-scale 3D point clouds, several problems still exist: (a) segmentation models suffer from intra-class inconsistency and inter-class indistinction; (b) the existing methods ignore the inherent long-tailed class distribution of real-world 3D data. These problems result in unsatisfactory semantic segmentation predictions, especially in object adjacent areas. To handle these problems, this paper proposes a novel Adjacent areas Refinement Network (ARNet). Specifically, an Adjacent areas Refinement (AR) module is designed, which consists of two parallel attention blocks. Besides, our proposed attention blocks can process a large number of points (N∼10^5) with a slight increase in the computational complexity and time cost. Additionally, to deal with the inherent long-tailed class distribution in real-world 3D data, imbalance adjustment loss and occupancy regression loss are introduced. Based on this, the proposed network can handle the classification of both majority and minority classes, which is essential in distinguishing the ambiguous parts in large-scale 3D scenes. The proposed AR module and the loss functions can be easily integrated into the cutting-edge backbone networks, contributing to better performance in modeling semantic inter-dependencies and significantly improving the accuracy of the state-of-the-art semantic segmentation methods on indoor and outdoor scenes.
KW - Attention
KW - Large-scale 3D point clouds
KW - Long-tailed distribution
KW - Segmentation
UR - https://www.scopus.com/pages/publications/85129173395
U2 - 10.1016/j.patcog.2022.108722
DO - 10.1016/j.patcog.2022.108722
M3 - Article
AN - SCOPUS:85129173395
SN - 0031-3203
VL - 129
JO - Pattern Recognition
JF - Pattern Recognition
M1 - 108722
ER -