
Improving Encarta search engine performance by mining user logs

  • Charles X. Ling*
  • Jianfeng Gao
  • Huajie Zhang
  • Weining Qian
  • Hongjiang Zhang
  • *Corresponding author for this work
  • Western University
  • University of New Brunswick
  • Fudan University

Research output: Contribution to journal, article, peer-reviewed

Abstract

We propose a data-mining approach that produces generalized query patterns (with generalized keywords) from the raw user logs of the Microsoft Encarta search engine (http://encarta.msn.com). These query patterns can act as a cache for the search engine, improving its performance. A cache of generalized query patterns is more advantageous than a cache of the most frequent user queries because the patterns are generalized and therefore cover more queries, including future queries that have not previously been asked. Our method is unique in that the discovered query patterns reflect the actual dynamic usage of the search engine and its user feedback, rather than the syntactic linkage structure of web pages (which Google uses). Simulation shows that such generalized query patterns considerably improve the search engine's overall speed. The generalized query patterns, when viewed with a graphical user interface, are also helpful to web editors, who can easily discover the topics in which users are most interested.
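The caching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the keyword-to-concept map `CONCEPTS`, the stopword list, the `generalize` function, the `PatternCache` class, and the cached result labels are all hypothetical names chosen for illustration. The sketch only shows why a cache keyed by generalized patterns can answer a query that was never asked before, whereas a cache of raw frequent queries cannot.

```python
from typing import Dict, Optional, Tuple

# Hypothetical keyword-to-concept map (assumption); the record above does not
# describe the generalization hierarchy actually used in the paper.
CONCEPTS: Dict[str, str] = {
    "paris": "<city>",
    "tokyo": "<city>",
    "france": "<country>",
    "japan": "<country>",
}

# Hypothetical stopword list (assumption).
STOPWORDS = {"the", "of", "in", "a", "is", "what"}

Pattern = Tuple[str, ...]


def generalize(query: str) -> Pattern:
    """Turn a raw query into a generalized pattern.

    Lowercases, drops stopwords, replaces known keywords with their
    concept label, and sorts the tokens so that word order is ignored.
    """
    tokens = [t for t in query.lower().split() if t not in STOPWORDS]
    return tuple(sorted(CONCEPTS.get(t, t) for t in tokens))


class PatternCache:
    """Cache keyed by generalized patterns instead of raw query strings."""

    def __init__(self) -> None:
        self._results: Dict[Pattern, str] = {}

    def add(self, example_query: str, result: str) -> None:
        # Store the result under the pattern of a logged query.
        self._results[generalize(example_query)] = result

    def lookup(self, query: str) -> Optional[str]:
        # Return the cached result on a pattern hit, otherwise None.
        return self._results.get(generalize(query))


cache = PatternCache()
cache.add("population of France", "population-by-country")

hit = cache.lookup("Japan population")   # never logged, same pattern
miss = cache.lookup("weather in Paris")  # different pattern, not cached
```

Here `"Japan population"` hits the cache because it generalizes to the same pattern as the logged query `"population of France"`, while `"weather in Paris"` misses and would fall through to the regular search engine.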

Original language: English
Pages (from-to): 1101-1116
Number of pages: 16
Journal: International Journal of Pattern Recognition and Artificial Intelligence
Volume: 16
Issue number: 8
Publication status: Published - Dec 2002
Externally published
