TY - JOUR
T1 - FocusPatch AD
T2 - Few-Shot Multi-Class Anomaly Detection with Unified Keywords Patch Prompts
AU - Ding, Xicheng
AU - Li, Xiaofan
AU - Chen, Mingang
AU - Gong, Jingyu
AU - Xie, Yuan
N1 - Publisher Copyright:
© 1992-2012 IEEE.
PY - 2026
Y1 - 2026
AB - Industrial few-shot anomaly detection (FSAD) requires identifying various abnormal states from as few normal samples as possible, since abnormal samples are unavailable during training. However, current methods often require training a separate model for each category, which increases computation and storage overhead. Designing a unified anomaly detection model that supports multiple categories remains challenging, as such a model must recognize anomalous patterns across diverse objects and domains. To tackle these challenges, this paper introduces FocusPatch AD, a unified anomaly detection framework based on vision-language models that performs anomaly detection under few-shot multi-class settings. FocusPatch AD links anomaly state keywords to highly relevant discrete local regions within the image, guiding the model to focus on cross-category anomalies while filtering out background interference. This approach mitigates the false detections caused by global semantic alignment in vision-language models. We evaluate the proposed method on the MVTec, VisA, and Real-IAD datasets, comparing it against several prevailing anomaly detection methods. In both image-level and pixel-level anomaly detection tasks, FocusPatch AD achieves significant gains in classification and localization performance, demonstrating excellent generalization and adaptability.
KW - Anomaly detection
KW - few-shot learning
KW - unified model
KW - vision-language models
UR - https://www.scopus.com/pages/publications/105026295852
U2 - 10.1109/TIP.2025.3646861
DO - 10.1109/TIP.2025.3646861
M3 - Article
AN - SCOPUS:105026295852
SN - 1057-7149
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
ER -