Entity embedding-based anomaly detection for heterogeneous categorical events

  • Ting Chen
  • Lu An Tang
  • Yizhou Sun
  • Zhengzhang Chen
  • Kai Zhang

Research output: Contribution to journal, Conference article, peer-review

60 Scopus citations

Abstract

Anomaly detection plays an important role in modern data-driven security applications, such as detecting suspicious access to a socket from a process. In many cases, such events can be described as a collection of categorical values that are considered as entities of different types, which we call heterogeneous categorical events. Due to the lack of intrinsic distance measures among entities, and the exponentially large event space, most existing work relies heavily on heuristics to calculate anomaly scores for events. In contrast to previous work, we propose a principled and unified probabilistic model, APE (Anomaly detection via Probabilistic pairwise interaction and Entity embedding), that directly models the likelihood of events. In this model, we embed entities into a common latent space using their observed co-occurrence in different events. More specifically, we first model the compatibility of each pair of entities according to their embeddings. Then we use the weighted pairwise interactions of different entity types to define the event probability. Using Noise-Contrastive Estimation with a "context-dependent" noise distribution, our model can be learned efficiently despite the large event space. Experimental results on real enterprise surveillance data show that our methods detect abnormal events more accurately than other state-of-the-art anomaly detection techniques.
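
The scoring idea described in the abstract (entities embedded in a common space, pairwise compatibility, weighted sum over pairs of entity types) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact formula, so the dot product as the compatibility measure, the entity type names, the vocabulary sizes, the embedding dimension, and the uniform pair weights are all assumptions made for illustration. The embeddings here are random; in the paper they are learned with Noise-Contrastive Estimation, which is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical entity types and vocabulary sizes (illustrative, not from the paper).
TYPE_SIZES = {"user": 5, "process": 7, "socket": 4}
DIM = 8  # assumed embedding dimension

# One embedding table per entity type, all living in the same latent space.
# Random values stand in for learned embeddings.
embeddings = {
    t: rng.normal(scale=0.1, size=(n, DIM)) for t, n in TYPE_SIZES.items()
}

# One weight per unordered pair of entity types (uniform here; assumed learnable).
types = sorted(TYPE_SIZES)
pair_weights = {
    (a, b): 1.0 for i, a in enumerate(types) for b in types[i + 1:]
}


def event_score(event):
    """Unnormalized event score: weighted sum of pairwise compatibilities.

    Compatibility of two entities is assumed to be the dot product of
    their embeddings. `event` maps each entity type to an entity index.
    """
    score = 0.0
    for (a, b), w in pair_weights.items():
        score += w * float(embeddings[a][event[a]] @ embeddings[b][event[b]])
    return score


# A single heterogeneous categorical event: one entity per type.
event = {"user": 2, "process": 3, "socket": 1}
print(event_score(event))
```

Under these assumptions, a lower score would correspond to a less likely, and therefore more suspicious, event. Turning the score into a proper probability requires normalizing over the whole event space, which is the expensive step the abstract says Noise-Contrastive Estimation avoids.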

Original language: English
Pages (from-to): 1396-1403
Number of pages: 8
Journal: IJCAI International Joint Conference on Artificial Intelligence
Volume: 2016-January
State: Published - 2016
Externally published: Yes
Event: 25th International Joint Conference on Artificial Intelligence, IJCAI 2016 - New York, United States
Duration: 9 Jul 2016 to 15 Jul 2016
