TY - GEN
T1 - Temporal skeletonization on sequential data
T2 - 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2014
AU - Liu, Chuanren
AU - Zhang, Kai
AU - Xiong, Hui
AU - Jiang, Geoff
AU - Yang, Qiang
PY - 2014
Y1 - 2014
N2 - Sequential pattern analysis targets on finding statistically relevant temporal structures where the values are delivered in a sequence. With the growing complexity of real-world dynamic scenarios, more and more symbols are often needed to encode a meaningful sequence. This is so-called 'curse of cardinality', which can impose significant challenges to the design of sequential analysis methods in terms of computational efficiency and practical use. Indeed, given the overwhelming scale and the heterogeneous nature of the sequential data, new visions and strategies are needed to face the challenges. To this end, in this paper, we propose a 'temporal skeletonization' approach to proactively reduce the representation of sequences to uncover significant, hidden temporal structures. The key idea is to summarize the temporal correlations in an undirected graph. Then, the 'skeleton' of the graph serves as a higher granularity on which hidden temporal patterns are more likely to be identified. In the meantime, the embedding topology of the graph allows us to translate the rich temporal content into a metric space. This opens up new possibilities to explore, quantify, and visualize sequential data. Our approach has shown to greatly alleviate the curse of cardinality in challenging tasks of sequential pattern mining and clustering. Evaluation on a Business-to-Business (B2B) marketing application demonstrates that our approach can effectively discover critical buying paths from noisy customer event data.
AB - Sequential pattern analysis targets on finding statistically relevant temporal structures where the values are delivered in a sequence. With the growing complexity of real-world dynamic scenarios, more and more symbols are often needed to encode a meaningful sequence. This is so-called 'curse of cardinality', which can impose significant challenges to the design of sequential analysis methods in terms of computational efficiency and practical use. Indeed, given the overwhelming scale and the heterogeneous nature of the sequential data, new visions and strategies are needed to face the challenges. To this end, in this paper, we propose a 'temporal skeletonization' approach to proactively reduce the representation of sequences to uncover significant, hidden temporal structures. The key idea is to summarize the temporal correlations in an undirected graph. Then, the 'skeleton' of the graph serves as a higher granularity on which hidden temporal patterns are more likely to be identified. In the meantime, the embedding topology of the graph allows us to translate the rich temporal content into a metric space. This opens up new possibilities to explore, quantify, and visualize sequential data. Our approach has shown to greatly alleviate the curse of cardinality in challenging tasks of sequential pattern mining and clustering. Evaluation on a Business-to-Business (B2B) marketing application demonstrates that our approach can effectively discover critical buying paths from noisy customer event data.
KW - curse of cardinality
KW - network embedding
KW - sequential pattern mining
KW - temporal skeletonization
UR - https://www.scopus.com/pages/publications/84907023043
U2 - 10.1145/2623330.2623741
DO - 10.1145/2623330.2623741
M3 - 会议稿件
AN - SCOPUS:84907023043
SN - 9781450329569
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 1336
EP - 1345
BT - KDD 2014 - Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
Y2 - 24 August 2014 through 27 August 2014
ER -