TY - GEN
T1 - RAGT
T2 - 18th International Conference on Computer-Aided Design and Computer Graphics, CAD/Graphics 2023
AU - Li, Ziqing
AU - Li, Yang
AU - Lin, Shaohui
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.
PY - 2024
Y1 - 2024
N2 - 3D human pose and shape estimation from monocular images is a fundamental task in computer vision, but it is highly ill-posed and challenging due to occlusion. Occlusion arises when other objects block parts of the body from being visible in the image. When occlusion occurs, the image features become incomplete and ambiguous, leading to inaccurate or even incorrect predictions. In this paper, we propose a novel method, named RAGT, that handles occlusion robustly and recovers the complete 3D pose and shape of humans. Our study focuses on achieving robust feature representation for human pose and shape estimation in the presence of occlusion. To this end, we introduce a dual-branch architecture that learns incorporation weights from visible parts to occluded parts and suppression weights to inhibit the integration of background features. To further improve the quality of visible and occluded maps, we leverage pseudo ground-truth maps generated by DensePose for pixel-level supervision. Additionally, we propose a novel transformer-based module called COAT (Contextual Occlusion-Aware Transformer) to effectively incorporate visible features into occluded regions. The COAT module is guided by an Occlusion-Guided Attention Loss (OGAL), which explicitly encourages the COAT module to fuse the most important and relevant features, i.e., those semantically and spatially closest to the occluded regions. We conduct experiments on various benchmarks and demonstrate the robustness of RAGT across different kinds of occluded scenes, both quantitatively and qualitatively.
AB - 3D human pose and shape estimation from monocular images is a fundamental task in computer vision, but it is highly ill-posed and challenging due to occlusion. Occlusion arises when other objects block parts of the body from being visible in the image. When occlusion occurs, the image features become incomplete and ambiguous, leading to inaccurate or even incorrect predictions. In this paper, we propose a novel method, named RAGT, that handles occlusion robustly and recovers the complete 3D pose and shape of humans. Our study focuses on achieving robust feature representation for human pose and shape estimation in the presence of occlusion. To this end, we introduce a dual-branch architecture that learns incorporation weights from visible parts to occluded parts and suppression weights to inhibit the integration of background features. To further improve the quality of visible and occluded maps, we leverage pseudo ground-truth maps generated by DensePose for pixel-level supervision. Additionally, we propose a novel transformer-based module called COAT (Contextual Occlusion-Aware Transformer) to effectively incorporate visible features into occluded regions. The COAT module is guided by an Occlusion-Guided Attention Loss (OGAL), which explicitly encourages the COAT module to fuse the most important and relevant features, i.e., those semantically and spatially closest to the occluded regions. We conduct experiments on various benchmarks and demonstrate the robustness of RAGT across different kinds of occluded scenes, both quantitatively and qualitatively.
KW - Human Pose and Shape Estimation
KW - Human Reconstruction
KW - Transformer
UR - https://www.scopus.com/pages/publications/85185828971
U2 - 10.1007/978-981-99-9666-7_22
DO - 10.1007/978-981-99-9666-7_22
M3 - Conference contribution
AN - SCOPUS:85185828971
SN - 9789819996650
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 329
EP - 347
BT - Computer-Aided Design and Computer Graphics - 18th International Conference, CAD/Graphics 2023, Proceedings
A2 - Hu, Shi-Min
A2 - Cai, Yiyu
A2 - Rosin, Paul
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 19 August 2023 through 21 August 2023
ER -