TY - JOUR
T1 - HandNet
T2 - Occlusion-robust 3D hand mesh reconstruction with prior information
AU - Li, Jiawen
AU - Jiang, Fei
AU - Zhu, Dandan
AU - Zhou, Aimin
N1 - Publisher Copyright:
© 2025 Elsevier B.V.
PY - 2025/12/15
Y1 - 2025/12/15
N2 - 3D hand mesh reconstruction from a single RGB image is crucial for numerous applications yet challenging due to extensive occlusions. Interestingly, humans can infer plausible 3D hand shapes even under heavy occlusion by reasoning about full hand structures based on prior anatomical knowledge and contextual cues. Inspired by this cognitive process, we propose HandNet, a novel framework for 3D hand mesh reconstruction that explicitly utilizes both hand anatomy and contextual information to infer occluded structures. First, we introduce a dynamic relation modeling module that employs a graph-based representation of hand anatomy, capturing local skeletal topology and global contextual dependencies under anatomical constraints and adaptive correlations. Second, we design a cross-representation integration module that enables deep interaction between visual cues and structural priors, aligning shared features and promoting consistent hand representations. Extensive experiments on the DexYCB, HO3D v2, and HO3D v3 datasets, which contain challenging hand-object occlusions, demonstrate that HandNet achieves state-of-the-art performance.
AB - 3D hand mesh reconstruction from a single RGB image is crucial for numerous applications yet challenging due to extensive occlusions. Interestingly, humans can infer plausible 3D hand shapes even under heavy occlusion by reasoning about full hand structures based on prior anatomical knowledge and contextual cues. Inspired by this cognitive process, we propose HandNet, a novel framework for 3D hand mesh reconstruction that explicitly utilizes both hand anatomy and contextual information to infer occluded structures. First, we introduce a dynamic relation modeling module that employs a graph-based representation of hand anatomy, capturing local skeletal topology and global contextual dependencies under anatomical constraints and adaptive correlations. Second, we design a cross-representation integration module that enables deep interaction between visual cues and structural priors, aligning shared features and promoting consistent hand representations. Extensive experiments on the DexYCB, HO3D v2, and HO3D v3 datasets, which contain challenging hand-object occlusions, demonstrate that HandNet achieves state-of-the-art performance.
KW - 3D hand mesh reconstruction
KW - Cross-modal feature integration
KW - Prior guided learning
UR - https://www.scopus.com/pages/publications/105022237691
U2 - 10.1016/j.knosys.2025.114868
DO - 10.1016/j.knosys.2025.114868
M3 - Article
AN - SCOPUS:105022237691
SN - 0950-7051
VL - 332
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 114868
ER -