TY - GEN
T1 - Graph Contrastive Learning for Truth Inference
AU - Liu, Hao
AU - Liu, Jiacheng
AU - Tang, Feilong
AU - Li, Peng
AU - Chen, Long
AU - Yu, Jiadi
AU - Zhu, Yanmin
AU - Gao, Min
AU - Yang, Yanqin
AU - Hou, Xiaofeng
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
AB - Crowdsourcing has become a popular paradigm for collecting large-scale labeled datasets by leveraging numerous annotators. However, these annotators often provide noisy labels due to varying expertise. Truth inference aims to infer accurate consensus labels from noisy crowdsourced annotations. Existing approaches rely heavily on hand-engineered assumptions or ground truth data, limiting their applicability. To address this, we propose GOVERN, a graph contrastive learning framework for truth inference without such external supervision. GOVERN employs a novel graph data augmentation strategy to generate views capturing worker coordination patterns. A contrastive objective then encourages invariant representations across views, enabling the discovery of features related to the hidden consensus. Further, a label correction method based on k-nearest neighbors refines noisy pseudo-labels to supervise model training. Comprehensive experiments on 9 real-world datasets demonstrate that GOVERN outperforms state-of-the-art truth inference techniques.
UR - https://www.scopus.com/pages/publications/85200495065
U2 - 10.1109/ICDE60146.2024.00027
DO - 10.1109/ICDE60146.2024.00027
M3 - Conference contribution
AN - SCOPUS:85200495065
T3 - Proceedings - International Conference on Data Engineering
SP - 263
EP - 275
BT - Proceedings - 2024 IEEE 40th International Conference on Data Engineering, ICDE 2024
PB - IEEE Computer Society
T2 - 40th IEEE International Conference on Data Engineering, ICDE 2024
Y2 - 13 May 2024 through 17 May 2024
ER -