TY - JOUR
T1 - Real-IAD D3
T2 - 2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2025
AU - Zhu, Wenbing
AU - Wang, Lidong
AU - Zhou, Ziqing
AU - Wang, Chengjie
AU - Pan, Yurui
AU - Zhang, Ruoyi
AU - Chen, Zhuhao
AU - Cheng, Linjie
AU - Gao, Bin Bin
AU - Zhang, Jiangning
AU - Gan, Zhenye
AU - Wang, Yuxie
AU - Chen, Yulong
AU - Qian, Shuguang
AU - Chi, Mingmin
AU - Peng, Bo
AU - Ma, Lizhuang
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - The increasing complexity of industrial anomaly detection (IAD) has positioned multimodal detection methods as a focal area of machine vision research. However, dedicated multimodal datasets specifically tailored for IAD remain limited. Pioneering datasets like MVTec 3D have laid essential groundwork in multimodal IAD by incorporating RGB+3D data, but still face challenges in bridging the gap with real industrial environments due to limitations in scale and resolution. To address these challenges, we introduce Real-IAD D3, a high-precision multimodal dataset that uniquely incorporates an additional pseudo-3D modality generated through photometric stereo, alongside high-resolution RGB images and micrometer-level 3D point clouds. Real-IAD D3 features finer defects, diverse anomalies, and greater scale across 20 categories, providing a challenging benchmark for multimodal IAD. Additionally, we introduce an effective approach that integrates RGB, point cloud, and pseudo-3D depth information to leverage the complementary strengths of each modality, enhancing detection performance. Our experiments highlight the importance of these modalities in boosting detection robustness and overall IAD performance. The dataset and code are publicly accessible for research purposes at https://realiad4ad.github.io/Real-IAD_D3.
AB - The increasing complexity of industrial anomaly detection (IAD) has positioned multimodal detection methods as a focal area of machine vision research. However, dedicated multimodal datasets specifically tailored for IAD remain limited. Pioneering datasets like MVTec 3D have laid essential groundwork in multimodal IAD by incorporating RGB+3D data, but still face challenges in bridging the gap with real industrial environments due to limitations in scale and resolution. To address these challenges, we introduce Real-IAD D3, a high-precision multimodal dataset that uniquely incorporates an additional pseudo-3D modality generated through photometric stereo, alongside high-resolution RGB images and micrometer-level 3D point clouds. Real-IAD D3 features finer defects, diverse anomalies, and greater scale across 20 categories, providing a challenging benchmark for multimodal IAD. Additionally, we introduce an effective approach that integrates RGB, point cloud, and pseudo-3D depth information to leverage the complementary strengths of each modality, enhancing detection performance. Our experiments highlight the importance of these modalities in boosting detection robustness and overall IAD performance. The dataset and code are publicly accessible for research purposes at https://realiad4ad.github.io/Real-IAD_D3.
UR - https://www.scopus.com/pages/publications/105017072988
U2 - 10.1109/CVPR52734.2025.01417
DO - 10.1109/CVPR52734.2025.01417
M3 - Conference article
AN - SCOPUS:105017072988
SN - 1063-6919
SP - 15214
EP - 15223
JO - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
JF - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Y2 - 11 June 2025 through 15 June 2025
ER -