TY - GEN
T1 - How to Avoid Jumping to Conclusions
T2 - 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2024
AU - Xiao, Hanhua
AU - Li, Yuchen
AU - Wang, Yanhao
AU - Karras, Panagiotis
AU - Mouratidis, Kyriakos
AU - Avlona, Natalia Rozalia
N1 - Publisher Copyright:
© 2024 Copyright held by the owner/author(s).
PY - 2024/8/24
Y1 - 2024/8/24
N2 - An outstanding fact (OF) is a striking claim by which some entities stand out from their peers on some attribute. OFs serve data journalism, fact checking, and recommendation. However, one could jump to conclusions by selecting truthful OFs while intentionally or inadvertently ignoring lateral contexts and data that render them less striking. This jumping conclusion bias from unstable OFs may disorient the public, including voters and consumers, raising concerns about fairness and transparency in political and business competition. It is thus ethically imperative for several stakeholders to measure the robustness of OFs with respect to lateral contexts and data. Unfortunately, a capacity for such inspection of OFs mined from knowledge graphs (KGs) is missing. In this paper, we propose a methodology that inspects the robustness of OFs in KGs by perturbation analysis. We define (1) entity perturbation, which detects outlying contexts by perturbing context entities in the OF; and (2) data perturbation, which considers plausible data that render an OF less striking. We compute the expected strikingness scores of OFs over perturbation relevance distributions and assess an OF as robust if its measured strikingness does not deviate significantly from the expected. We devise a suite of exact and sampling algorithms for perturbation analysis on large KGs. Extensive experiments reveal that our methodology accurately and efficiently detects frail OFs generated by existing mining approaches on KGs. We also show the effectiveness of our approaches through case and user studies.
AB - An outstanding fact (OF) is a striking claim by which some entities stand out from their peers on some attribute. OFs serve data journalism, fact checking, and recommendation. However, one could jump to conclusions by selecting truthful OFs while intentionally or inadvertently ignoring lateral contexts and data that render them less striking. This jumping conclusion bias from unstable OFs may disorient the public, including voters and consumers, raising concerns about fairness and transparency in political and business competition. It is thus ethically imperative for several stakeholders to measure the robustness of OFs with respect to lateral contexts and data. Unfortunately, a capacity for such inspection of OFs mined from knowledge graphs (KGs) is missing. In this paper, we propose a methodology that inspects the robustness of OFs in KGs by perturbation analysis. We define (1) entity perturbation, which detects outlying contexts by perturbing context entities in the OF; and (2) data perturbation, which considers plausible data that render an OF less striking. We compute the expected strikingness scores of OFs over perturbation relevance distributions and assess an OF as robust if its measured strikingness does not deviate significantly from the expected. We devise a suite of exact and sampling algorithms for perturbation analysis on large KGs. Extensive experiments reveal that our methodology accurately and efficiently detects frail OFs generated by existing mining approaches on KGs. We also show the effectiveness of our approaches through case and user studies.
KW - knowledge graphs
KW - outstanding facts
KW - perturbation analysis
KW - robustness measurement
UR - https://www.scopus.com/pages/publications/85203689956
U2 - 10.1145/3637528.3671763
DO - 10.1145/3637528.3671763
M3 - 会议稿件
AN - SCOPUS:85203689956
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 3539
EP - 3550
BT - KDD 2024 - Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
Y2 - 25 August 2024 through 29 August 2024
ER -