Content trust model for detecting web spam

  • Wei Wang*
  • Guosun Zeng

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Scopus citations

Abstract

As it gets easier to add information to the web via HTML pages, wikis, blogs, and other documents, it gets harder to distinguish accurate or trustworthy information from inaccurate or untrustworthy information. Beyond inaccurate or untrustworthy information, we also need to anticipate web spam, in which spammers publish false facts and scams to deliberately mislead users. Creating an effective spam detection method is a challenge. In this paper, we use the notion of content trust for spam detection and treat detection as a ranking problem. Evidence is used to define the features of spam web pages, and machine learning techniques are employed to combine the evidence into a highly efficient and reasonably accurate spam detection algorithm. Experiments on real web data show that the proposed method performs very well in practice.

Original language: English
Title of host publication: Trust Management
Subtitle of host publication: Proceedings of IFIPTM 2007: Joint iTrust and PST Conferences on Privacy, Trust Management and Security, July 30 - August 2, 2007, New Brunswick, Canada
Editors: Sandro Etalle, Stephen Marsh
Pages: 139-152
Number of pages: 14
State: Published - 2007
Externally published: Yes

Publication series

Name: IFIP International Federation for Information Processing
Volume: 238
ISSN (Print): 1571-5736
