TY - JOUR
T1 - An improved algorithm for weighting keywords in web documents
AU - Sun, Shuang
AU - He, Liang
AU - Yang, Jing
AU - Gu, Jun Zhong
PY - 2008/6
Y1 - 2008/6
N2 - This paper presents an improved algorithm, the web-based keyword weight algorithm (WKWA), for weighting keywords in web documents. WKWA takes into account the representation features of web documents and combines the advantages of the TF*IDF, TFC and ITC algorithms to make it more appropriate for web documents. The algorithm is also applied to an improved vector space model (IVSM). A real system has been implemented for calculating the semantic similarity of web documents. Four experiments have been carried out: keyword weight calculation, feature item selection, semantic similarity calculation, and WKWA time performance. The results demonstrate that the accuracy of both keyword weighting and semantic similarity is improved.
AB - This paper presents an improved algorithm, the web-based keyword weight algorithm (WKWA), for weighting keywords in web documents. WKWA takes into account the representation features of web documents and combines the advantages of the TF*IDF, TFC and ITC algorithms to make it more appropriate for web documents. The algorithm is also applied to an improved vector space model (IVSM). A real system has been implemented for calculating the semantic similarity of web documents. Four experiments have been carried out: keyword weight calculation, feature item selection, semantic similarity calculation, and WKWA time performance. The results demonstrate that the accuracy of both keyword weighting and semantic similarity is improved.
KW - Feature item
KW - Improved vector space model (IVSM)
KW - Keyword weight
KW - Representation feature
KW - Semantic similarity
UR - https://www.scopus.com/pages/publications/45749113383
U2 - 10.1007/s11741-008-0309-2
DO - 10.1007/s11741-008-0309-2
M3 - Article
AN - SCOPUS:45749113383
SN - 1007-6417
VL - 12
SP - 235
EP - 239
JO - Journal of Shanghai University
JF - Journal of Shanghai University
IS - 3
ER -