TY - GEN
T1 - READ
T2 - 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2025
AU - Shou, Hongzhe
AU - Lu, Guanyu
AU - Pavlovski, Martin
AU - Zhou, Fang
N1 - Publisher Copyright: © 2025 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2025/8/3
Y1 - 2025/8/3
AB - Existing anomaly detection methods tend to utilize a large amount of training data to learn patterns of normal data for effective anomaly identification, but such methods typically incur substantial training time overhead. Considering that unlabeled data often contains a lot of redundant information, selecting and utilizing a small yet representative subset instead of the entire dataset can significantly improve training efficiency while maintaining detection performance. To this end, we introduce an end-to-end reinforcement learning framework with a balanced sampling strategy that targets both normal and abnormal instances. This framework identifies and exploits potential anomalies in the unlabeled data while sampling peripheral normal instances (often difficult to detect), thereby enhancing the overall anomaly detection performance without requiring excessive time for the sampling process. Additionally, we present a joint reward mechanism, combined with inconsistency penalties, which optimizes both an agent’s action space and the representation space, ultimately improving the quality of the sampling process. Extensive experiments on four public datasets from different domains demonstrate the effectiveness and efficiency of our framework.
KW - Anomaly Detection
KW - Reinforcement Learning
KW - Representative Subset Selection
KW - Semi-supervised Learning
UR - https://www.scopus.com/pages/publications/105014313948
U2 - 10.1145/3711896.3737100
DO - 10.1145/3711896.3737100
M3 - Conference contribution
AN - SCOPUS:105014313948
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 2586
EP - 2596
BT - KDD 2025 - Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
Y2 - 3 August 2025 through 7 August 2025
ER -