Optimizing top-K Retrieval: Submodularity analysis and search strategies

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

1 Scopus citations

Abstract

The key issue in top-k retrieval - finding a set of k documents (from a large document collection) that can best answer a user's query - is to strike the optimal balance between relevance and diversity. In this paper, we study the top-k retrieval problem in the framework of facility location analysis and prove the submodularity of that objective function which provides a theoretical approximation guarantee of factor for the (best-first) greedy search algorithm. Furthermore, we propose a two-stage hybrid search strategy which first obtains a high-quality initial set of top-k documents via greedy search, and then refines that result set iteratively via local search. Experiments on two large TREC benchmark datasets show that our two-stage hybrid search strategy approach outperforms the existing ones.

Original languageEnglish
Title of host publicationWeb-Age Information Management - 15th International Conference, WAIM 2014, Proceedings
PublisherSpringer Verlag
Pages18-29
Number of pages12
ISBN (Print)9783319080093
DOIs
StatePublished - 2014
Event15th International Conference on Web-Age Information Management, WAIM 2014 - Macau, China
Duration: 16 Jun 201418 Jun 2014

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume8485 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference15th International Conference on Web-Age Information Management, WAIM 2014
Country/TerritoryChina
CityMacau
Period16/06/1418/06/14

Fingerprint

Dive into the research topics of 'Optimizing top-K Retrieval: Submodularity analysis and search strategies'. Together they form a unique fingerprint.

Cite this