TY - GEN
T1 - Practical duplicate bug reports detection in a large web-based development community
AU - Feng, Liang
AU - Song, Leyi
AU - Sha, Chaofeng
AU - Gong, Xueqing
PY - 2013
Y1 - 2013
N2 - Most large web-based development communities require a bug tracking system to keep track of bug reports. However, duplicate bug reports waste resources and may cause conflicts. Two lines of work address this problem: relevant bug report retrieval [8][11][10][13] and duplicate bug report identification [5][12]. The former methods can achieve high accuracy (82%) in the top 10 results on some datasets, but they do not really reduce the workload of developers. The latter methods still need further improvement in performance. In this paper, we propose a practical duplicate bug report detection method, which aims to help project teams reduce their workload by combining the two existing categories of methods. We also propose new features extracted from comments, user profiles and query feedback, which are useful for improving detection performance. Experiments on a real dataset show that our method improves the accuracy rate by 23% compared to state-of-the-art work in duplicate bug report identification, and improves the recall rate by up to 8% in relevant bug report retrieval.
AB - Most large web-based development communities require a bug tracking system to keep track of bug reports. However, duplicate bug reports waste resources and may cause conflicts. Two lines of work address this problem: relevant bug report retrieval [8][11][10][13] and duplicate bug report identification [5][12]. The former methods can achieve high accuracy (82%) in the top 10 results on some datasets, but they do not really reduce the workload of developers. The latter methods still need further improvement in performance. In this paper, we propose a practical duplicate bug report detection method, which aims to help project teams reduce their workload by combining the two existing categories of methods. We also propose new features extracted from comments, user profiles and query feedback, which are useful for improving detection performance. Experiments on a real dataset show that our method improves the accuracy rate by 23% compared to state-of-the-art work in duplicate bug report identification, and improves the recall rate by up to 8% in relevant bug report retrieval.
KW - Bug Report
KW - Classification
KW - Duplicate Detection
UR - https://www.scopus.com/pages/publications/84875826407
U2 - 10.1007/978-3-642-37401-2_69
DO - 10.1007/978-3-642-37401-2_69
M3 - Conference contribution
AN - SCOPUS:84875826407
SN - 9783642374005
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 709
EP - 720
BT - Web Technologies and Applications - 15th Asia-Pacific Web Conference, APWeb 2013, Proceedings
T2 - 15th Asia-Pacific Web Conference on Web Technologies and Applications, APWeb 2013
Y2 - 4 April 2013 through 6 April 2013
ER -