Duplicate detection for identifying social spam in microblogs

Qunyan Zhang, Haixin Ma, Weining Qian, Aoying Zhou

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

15 Scopus citations

Abstract

As an important kind of social media, microblogs have become an important source for opinion mining and the study of collective behavior. However, social spam may greatly affect the analytical results. This paper focuses on the problem of identifying potential social spammers who copy pieces of information from others. An improved locality-sensitive hashing based method is used to detect duplicated tweets. An intensive empirical study is conducted over a real-life microblog dataset crawled from Sina Weibo, one of the most popular microblogging services. The characteristics of potential spammers and their behaviors are analyzed.
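The abstract does not describe how the authors' improved locality-sensitive hashing method works, so the following is only a minimal sketch of the generic technique it builds on: MinHash signatures with banding, followed by an exact Jaccard check on candidate pairs. It is not the paper's method. All function names, the character 3-shingle choice, and the parameters (`num_hashes`, `bands`, `threshold`) are assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Character k-shingles of lowercased, whitespace-normalized text (assumed tokenization)."""
    s = " ".join(text.lower().split())
    return {s[i:i + k] for i in range(max(1, len(s) - k + 1))}


def _hash(seed, token):
    """Deterministic 64-bit hash of a token, salted by a seed."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(shingle_set, num_hashes=32):
    """MinHash signature: the minimum hash value per seeded hash function."""
    return tuple(min(_hash(seed, sh) for sh in shingle_set) for seed in range(num_hashes))


def candidate_pairs(signatures, bands=8):
    """LSH banding: posts that share any identical band become candidate pairs."""
    if not signatures:
        return set()
    rows = len(next(iter(signatures.values()))) // bands
    buckets = defaultdict(set)
    for post_id, sig in signatures.items():
        for b in range(bands):
            buckets[(b, sig[b * rows:(b + 1) * rows])].add(post_id)
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


def jaccard(a, b):
    """Exact Jaccard similarity of two non-empty sets."""
    return len(a & b) / len(a | b)


def find_duplicates(posts, threshold=0.8, num_hashes=32, bands=8):
    """Return sorted pairs of post ids whose shingle sets meet the Jaccard threshold."""
    sh = {pid: shingles(text) for pid, text in posts.items()}
    sigs = {pid: minhash_signature(s, num_hashes) for pid, s in sh.items()}
    return sorted(
        pair for pair in candidate_pairs(sigs, bands)
        if jaccard(sh[pair[0]], sh[pair[1]]) >= threshold
    )
```

Banding only narrows the set of pairs to compare. The final Jaccard check removes false positives, so the threshold, not the band layout, decides what is reported as a duplicate. Near-duplicates just above the threshold can still be missed if none of their bands collide.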

Original language: English
Title of host publication: Proceedings - 2013 IEEE International Congress on Big Data, BigData 2013
Pages: 141-148
Number of pages: 8
DOIs
State: Published - 2013
Event: 2013 IEEE International Congress on Big Data, BigData 2013 - Santa Clara, CA, United States
Duration: 27 Jun 2013 to 2 Jul 2013

Publication series

Name: Proceedings - 2013 IEEE International Congress on Big Data, BigData 2013

Conference

Conference: 2013 IEEE International Congress on Big Data, BigData 2013
Country/Territory: United States
City: Santa Clara, CA
Period: 27/06/13 to 2/07/13

Keywords

  • MapReduce
  • duplicate detection
  • locality-sensitive hash
  • microblog
  • social spam
