TY - GEN
T1 - On benchmarking online social media analytical queries
AU - Ma, Haixin
AU - Qian, Weining
AU - Xia, Fan
AU - Wei, Jinxian
AU - Yu, Chengcheng
AU - Zhou, Aoying
PY - 2013
Y1 - 2013
N2 - Social media analytics has many applications in collective behavior sensing and monitoring, online advertisement, opinion mining, etc. Though a number of technologies and systems have been proposed for analyzing social media data, their overall performance and advantages have not been compared under similar settings. In this paper, a benchmark named BSMA, for Benchmarking Social Media Analytics, is proposed. It differs from similar efforts in that: 1) A real-life dataset with activities of more than 1.6 million users over 2 years and followship relationships of 1.2 billion users is used. The distributions of data in the dataset are different from those produced by data generators. 2) 19 queries fitting into three categories, i.e. social network queries, hotspot queries, and timeline queries, are used. Each of the three categories poses challenges to a different part of the systems under test. 3) Measurements of throughput, latency, and scalability are used for testing performance. A YCSB-based toolkit for reporting measurement values is developed. A previous version of BSMA was used in the WISE 2012 Challenge. Four teams implemented all or part of the 19 queries. Their results are analyzed in this paper. The progress and future work of BSMA are also discussed.
AB - Social media analytics has many applications in collective behavior sensing and monitoring, online advertisement, opinion mining, etc. Though a number of technologies and systems have been proposed for analyzing social media data, their overall performance and advantages have not been compared under similar settings. In this paper, a benchmark named BSMA, for Benchmarking Social Media Analytics, is proposed. It differs from similar efforts in that: 1) A real-life dataset with activities of more than 1.6 million users over 2 years and followship relationships of 1.2 billion users is used. The distributions of data in the dataset are different from those produced by data generators. 2) 19 queries fitting into three categories, i.e. social network queries, hotspot queries, and timeline queries, are used. Each of the three categories poses challenges to a different part of the systems under test. 3) Measurements of throughput, latency, and scalability are used for testing performance. A YCSB-based toolkit for reporting measurement values is developed. A previous version of BSMA was used in the WISE 2012 Challenge. Four teams implemented all or part of the 19 queries. Their results are analyzed in this paper. The progress and future work of BSMA are also discussed.
UR - https://www.scopus.com/pages/publications/84880517355
U2 - 10.1145/2484425.2484435
DO - 10.1145/2484425.2484435
M3 - Conference contribution
AN - SCOPUS:84880517355
SN - 9781450321884
T3 - 1st International Workshop on Graph Data Management Experiences and Systems, GRADES 2013 - Co-located with SIGMOD/PODS 2013
BT - 1st International Workshop on Graph Data Management Experiences and Systems, GRADES 2013 - Co-located with SIGMOD/PODS 2013
T2 - 1st International Workshop on Graph Data Management Experiences and Systems, GRADES 2013 - Co-located with SIGMOD/PODS 2013
Y2 - 23 June 2013
ER -