READ: Robust and Efficient Anomaly Detection under Data Contamination and Limited Supervision

  • Hongzhe Shou
  • Guanyu Lu
  • Martin Pavlovski
  • Fang Zhou*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Existing anomaly detection methods tend to utilize a large amount of training data to learn patterns of normal data for effective anomaly identification, but such methods typically incur substantial training time overhead. Considering that unlabeled data often contains a lot of redundant information, selecting and utilizing a small yet representative subset instead of the entire dataset can significantly improve training efficiency while maintaining detection performance. To this end, we introduce an end-to-end reinforcement learning framework with a balanced sampling strategy that targets both normal and abnormal instances. This framework identifies and exploits potential anomalies in the unlabeled data while sampling peripheral normal instances (often difficult to detect), thereby enhancing the overall anomaly detection performance without requiring excessive time for the sampling process. Additionally, we present a joint reward mechanism, combined with inconsistency penalties, which optimizes both an agent’s action space and the representation space, ultimately improving the quality of the sampling process. Extensive experiments on four public datasets from different domains demonstrate the effectiveness and efficiency of our framework.

Original language: English
Title of host publication: KDD 2025 - Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 2586-2596
Number of pages: 11
ISBN (Electronic): 9798400714542
DOIs
State: Published - 3 Aug 2025
Event: 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2025 - Toronto, Canada
Duration: 3 Aug 2025 to 7 Aug 2025

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Volume: 2
ISSN (Print): 2154-817X

Conference

Conference: 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2025
Country/Territory: Canada
City: Toronto
Period: 3/08/25 to 7/08/25

Keywords

  • Anomaly Detection
  • Reinforcement Learning
  • Representative Subset Selection
  • Semi-supervised Learning
