Top-k temporal keyword search over social media data

Research output: Contribution to journal › Article › peer-review

5 Scopus citations

Abstract

Social media services have become major sources for monitoring emerging topics and sensing real-life events. A social media platform manages a social stream consisting of a huge volume of timestamped user-generated data, including original data and repost data. However, previous research on keyword search over social media data mainly emphasizes the recency of information. In this paper, we first propose the problem of the top-k most significant temporal keyword query to enable more complex query analysis. It returns the top-k most popular social items that contain the query keywords within the given query time window. Then, we design a temporal inverted index with a two-tier posting list to index social time series and a segment store to compute the exact social significance of social items. Next, we implement a basic query algorithm based on our proposed index structure and give a detailed performance analysis of the query algorithm. Based on the analysis results, we further refine our query algorithm with a piecewise maximum approximation (PMA) sketch. Finally, extensive empirical studies on a real-life microblog dataset demonstrate that the combination of the two-tier posting list and the PMA sketch achieves remarkable performance improvement under different query settings.
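
The query described in the abstract can be illustrated with a small sketch. It is not the paper's index or algorithm: the abstract does not specify the two-tier posting list, the segment store, or the PMA sketch in detail, so everything below is an assumption made for illustration. In particular, it assumes integer timestamps, takes the "significance" of an item to be the number of its repost events inside the query window, and interprets the PMA idea as storing the maximum per-time-unit count within fixed-length segments to obtain an upper bound that lets the search stop early. The names `TemporalIndex`, `SEGMENT`, `add_item`, `exact_score`, `upper_bound`, and `top_k` are hypothetical.

```python
import heapq
from bisect import bisect_left, bisect_right
from collections import Counter, defaultdict

SEGMENT = 4  # hypothetical segment length, in integer time units


class TemporalIndex:
    """Toy keyword index with a per-segment maximum sketch (illustrative only)."""

    def __init__(self):
        self.postings = defaultdict(set)  # keyword -> ids of items containing it
        self.events = {}                  # item id -> sorted repost timestamps
        self.seg_max = {}                 # item id -> {segment: max count per time unit}

    def add_item(self, item_id, keywords, timestamps):
        for kw in keywords:
            self.postings[kw].add(item_id)
        self.events[item_id] = sorted(timestamps)
        # Assumed PMA-style summary: keep only the peak per-unit count per segment.
        maxima = defaultdict(int)
        for t, count in Counter(timestamps).items():
            maxima[t // SEGMENT] = max(maxima[t // SEGMENT], count)
        self.seg_max[item_id] = dict(maxima)

    def exact_score(self, item_id, start, end):
        """Number of repost events with start <= timestamp <= end."""
        ts = self.events[item_id]
        return bisect_right(ts, end) - bisect_left(ts, start)

    def upper_bound(self, item_id, start, end):
        """Segment maximum times the number of overlapped time units, summed."""
        maxima = self.seg_max[item_id]
        bound = 0
        for seg in range(start // SEGMENT, end // SEGMENT + 1):
            lo = max(start, seg * SEGMENT)
            hi = min(end, seg * SEGMENT + SEGMENT - 1)
            bound += maxima.get(seg, 0) * (hi - lo + 1)
        return bound

    def top_k(self, keywords, start, end, k):
        """Return up to k (score, item_id) pairs, highest score first."""
        if not keywords or k <= 0:
            return []
        # Items must contain every query keyword.
        candidates = set.intersection(
            *(self.postings.get(kw, set()) for kw in keywords)
        )
        bounded = sorted(
            ((self.upper_bound(i, start, end), i) for i in candidates),
            reverse=True,
        )
        heap = []  # min-heap holding the best k exact scores seen so far
        for bound, item_id in bounded:
            if len(heap) == k and bound <= heap[0][0]:
                break  # no remaining item can beat the current k-th score
            score = self.exact_score(item_id, start, end)
            if len(heap) < k:
                heapq.heappush(heap, (score, item_id))
            elif score > heap[0][0]:
                heapq.heapreplace(heap, (score, item_id))
        return sorted(heap, reverse=True)


index = TemporalIndex()
index.add_item("a", {"storm", "city"}, [1, 1, 2, 9, 9, 9])
index.add_item("b", {"storm"}, [2, 3, 3, 3, 4])
index.add_item("c", {"storm", "city"}, [10, 11])

result = index.top_k(["storm"], start=0, end=5, k=2)
print(result)  # [(5, 'b'), (3, 'a')]
```

In this toy run, item "c" is never scored exactly: its upper bound for the window is 0, which cannot exceed the current second-best score of 3, so the loop stops. That early termination is the kind of saving an upper-bound sketch is meant to provide, under the assumptions stated above.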

Original language: English
Pages (from-to): 1049-1069
Number of pages: 21
Journal: World Wide Web
Volume: 20
Issue number: 5
State: Published - 1 Sep 2017

Keywords

  • Social media
  • Temporal keyword query
  • Top-k query
