Topic exploration and distillation for web search by a similarity-based analysis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

Topic distillation is the process of finding representative pages relevant to a given query. Well-known topic distillation approaches such as the HITS algorithm have been shown to be useful in identifying high-quality pages. In this paper, we revisit the behaviour of HITS from a different point of view: a similarity-based analysis model is applied to observe the distillation procedure. By defining a generalized similarity, we propose an algorithm that improves the quality of distillation using only hyperlinks. A topic exploration function is also integrated into the algorithm framework, enabling end-users to search less popular topics when a query involves multiple topics. The experimental results reveal two benefits of the new algorithm: improved distillation quality without using any content information from pages, and an additional ability to explore the topics emerging in the query results.
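For orientation, the baseline that the abstract refers to, HITS, can be sketched as a mutual-reinforcement iteration over hub and authority scores. The sketch below is the standard textbook formulation, not the similarity-based algorithm proposed in the paper (which this record does not specify); the function names `hits` and `normalize`, the iteration count, and the toy link graph are illustrative assumptions.

```python
import math


def normalize(scores):
    """Scale scores to unit Euclidean length (no-op if all scores are zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hits(links, iterations=50):
    """Textbook HITS on a link graph given as {page: [pages it links to]}.

    Hypothetical helper for illustration only; it uses hyperlinks alone
    and ignores page content.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority update: sum the hub scores of all pages linking in.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]

        # Hub update: sum the (new) authority scores of all pages linked to.
        new_hub = {
            p: sum(new_auth[dst] for dst in links.get(p, ())) for p in pages
        }

        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return hub, auth


# Toy graph (assumed data): two hub-like pages pointing at two targets.
toy_links = {"h1": ["a", "b"], "h2": ["a"], "a": [], "b": []}
hub_scores, auth_scores = hits(toy_links)
```

In this toy graph, page `a` receives links from both `h1` and `h2`, so it ends up with a higher authority score than `b`, and `h1` ends up as the stronger hub because it links to both targets.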

Original language: English
Title of host publication: Advances in Web-Age Information Management - 3rd International Conference, WAIM 2002, Proceedings
Editors: Xiaofeng Meng, Jianwen Su, Yujun Wang
Publisher: Springer Verlag
Pages: 316-327
Number of pages: 12
ISBN (Print): 9783540440451
State: Published - 2002
Externally published: Yes
Event: 3rd International Conference on Advances in Web-Age Information Management, WAIM 2002 - Beijing, China
Duration: 11 Aug 2002 to 13 Aug 2002

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 2419
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 3rd International Conference on Advances in Web-Age Information Management, WAIM 2002
Country/Territory: China
City: Beijing
Period: 11/08/02 to 13/08/02
