TY - GEN
T1 - Enhancing the healthcare retrieval with a self-adaptive saturated density function
AU - Song, Yang
AU - Hu, Wenxin
AU - He, Liang
AU - Dou, Liang
N1 - Publisher Copyright: © Springer Nature Switzerland AG 2019.
PY - 2019
Y1 - 2019
AB - Proximity-based information retrieval models usually use the same pre-defined density function for all terms in the collection to estimate their influence distributions. In the healthcare domain, however, different terms in the same document have different influence distributions, the same term in different documents also has different influence distributions, and the pre-defined density function may not completely match the terms’ actual influence distributions. In this paper, we define a saturated density function to characterize the density function that best fits a given term’s influence distribution, and propose a self-adaptive approach for building the saturated density function for each term in various circumstances. In particular, our approach, which utilizes a Gamma process, is an unsupervised model that requires no external resources. We then construct a density-based weighting method to evaluate the effectiveness of our approach. Finally, we conduct experiments on five standard CLEF and TREC datasets, and the experimental results show that our approach is promising and outperforms the pre-defined density functions in healthcare retrieval.
KW - Information retrieval
KW - Saturated density function
KW - Self-adaptive
UR - https://www.scopus.com/pages/publications/85064952330
U2 - 10.1007/978-3-030-16148-4_39
DO - 10.1007/978-3-030-16148-4_39
M3 - Conference contribution
AN - SCOPUS:85064952330
SN - 9783030161477
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 501
EP - 513
BT - Advances in Knowledge Discovery and Data Mining - 23rd Pacific-Asia Conference, PAKDD 2019, Proceedings
A2 - Gong, Zhiguo
A2 - Zhang, Min-Ling
A2 - Zhou, Zhi-Hua
A2 - Yang, Qiang
A2 - Huang, Sheng-Jun
PB - Springer Verlag
T2 - 23rd Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2019
Y2 - 14 April 2019 through 17 April 2019
ER -