FocusPatch AD: Few-Shot Multi-Class Anomaly Detection with Unified Keywords Patch Prompts

  • Xicheng Ding
  • Xiaofan Li
  • Mingang Chen
  • Jingyu Gong*
  • Yuan Xie

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Industrial few-shot anomaly detection (FSAD) requires identifying various abnormal states by leveraging as few normal samples as possible (abnormal samples are unavailable during training). However, current methods often require training a separate model for each category, leading to increased computation and storage overhead. Thus, designing a unified anomaly detection model that supports multiple categories remains a challenging task, as such a model must recognize anomalous patterns across diverse objects and domains. To tackle these challenges, this paper introduces FocusPatch AD, a unified anomaly detection framework based on vision-language models, achieving anomaly detection under few-shot multi-class settings. FocusPatch AD links anomaly state keywords to highly relevant discrete local regions within the image, guiding the model to focus on cross-category anomalies while filtering out background interference. This approach mitigates the false detection issues caused by global semantic alignment in vision-language models. We evaluate the proposed method on the MVTec, VisA, and Real-IAD datasets, comparing it against several prevailing anomaly detection methods. In both image-level and pixel-level anomaly detection tasks, FocusPatch AD achieves significant gains in classification and localization performance, demonstrating excellent generalization and adaptability.
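The core idea described in the abstract, scoring local image patches against anomaly-state keyword prompts rather than relying on a single global image-text alignment, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function names, prompt sets, and scoring rule are assumptions, and random arrays stand in for embeddings that a CLIP-like encoder would produce.

```python
import numpy as np

def cosine_sim(a, b):
    # a: (N, D) patch embeddings, b: (K, D) text embeddings -> (N, K) similarities
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def patch_anomaly_scores(patch_emb, normal_emb, abnormal_emb, temperature=0.07):
    """Hypothetical patch-vs-keyword scoring (not the paper's exact method).

    Each patch is softmax-matched against normal and abnormal keyword prompts;
    its anomaly score is the probability mass on the abnormal prompts.
    """
    sims = cosine_sim(patch_emb, np.vstack([normal_emb, abnormal_emb]))
    logits = sims / temperature
    probs = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs /= probs.sum(axis=1, keepdims=True)
    n_normal = normal_emb.shape[0]
    patch_scores = probs[:, n_normal:].sum(axis=1)  # per-patch anomaly score in [0, 1]
    image_score = patch_scores.max()                # image score = most anomalous patch
    return patch_scores, image_score

# Stand-in embeddings: a 14x14 patch grid and 4 prompts per state
rng = np.random.default_rng(0)
patches = rng.normal(size=(196, 512))
normal_txt = rng.normal(size=(4, 512))    # e.g. prompts for a flawless object
abnormal_txt = rng.normal(size=(4, 512))  # e.g. prompts for cracks, scratches
scores, img_score = patch_anomaly_scores(patches, normal_txt, abnormal_txt)
```

Because the image-level score is taken as the maximum over patch scores, a single strongly anomalous region can flag the image while background patches contribute nothing, which is consistent with the abstract's goal of filtering out background interference.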

Original language: English
Journal: IEEE Transactions on Image Processing
State: Accepted/In press - 2026

Keywords

  • Anomaly detection
  • few-shot learning
  • unified model
  • vision-language models

