A Two-Stage Pretraining-Finetuning Framework for Treatment Effect Estimation with Unmeasured Confounding

  • Chuan Zhou
  • Yaxuan Li
  • Chunyuan Zheng
  • Haiteng Zhang
  • Min Zhang
  • Haoxuan Li*
  • Mingming Gong*
  • *Corresponding author for this work
  • Peking University
  • Chinese Academy of Sciences
  • University of Melbourne

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Estimating the conditional average treatment effect (CATE) from observational data plays a crucial role in areas such as e-commerce, healthcare, and economics. Existing studies mainly rely on the strong ignorability assumption that there are no unmeasured confounders, whose presence cannot be tested from observational data and can invalidate any causal conclusion. In contrast, data collected from randomized controlled trials (RCT) do not suffer from confounding, but are usually limited by a small sample size. In this paper, we propose a two-stage pretraining-finetuning (TSPF) framework using both large-scale observational data and small-scale RCT data to estimate the CATE in the presence of unmeasured confounding. In the first stage, a foundational representation of covariates is trained to estimate counterfactual outcomes through large-scale observational data. In the second stage, we propose to train an augmented representation of the covariates, which is concatenated to the foundational representation obtained in the first stage to adjust for the unmeasured confounding. To avoid overfitting caused by the small-scale RCT data in the second stage, we further propose a partial parameter initialization approach, rather than training a separate network. The superiority of our approach is validated on two public datasets with extensive experiments. The code is available at https://github.com/zhouchuanCN/KDD25-TSPF.
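
The two-stage procedure described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation (the repository linked above is the reference for that). The one-hidden-layer network, the synthetic data generator `simulate`, the helper names `init`, `fit`, `augment`, and `cate`, the decision to freeze the foundational representation during finetuning, and the zero initialization of the new head weights are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, randomized):
    """Hypothetical data: U confounds treatment and outcome unless randomized."""
    X = rng.normal(size=(n, 3))
    U = rng.normal(size=n)  # unmeasured confounder, never given to the model
    if randomized:
        t = rng.integers(0, 2, size=n)
    else:
        t = (X[:, 0] + 2.0 * U + rng.normal(size=n) > 0).astype(int)
    y = X[:, 0] + t * (1.0 + X[:, 1]) + 2.0 * U + 0.1 * rng.normal(size=n)
    return X, t, y  # true CATE is 1 + X[:, 1]

def init(d, k):
    return {"W": rng.normal(0, 0.5, (d, k)), "b": np.zeros(k),
            "V": rng.normal(0, 0.1, (k, 2)), "c": np.zeros(2)}

def forward(p, X):
    """Shared tanh representation followed by two linear heads (t=0, t=1)."""
    h = np.tanh(X @ p["W"] + p["b"])
    return h, h @ p["V"] + p["c"]

def fit(p, X, t, y, steps=1000, lr=0.01, frozen=0):
    """Full-batch gradient descent on the factual MSE.
    The first `frozen` representation units are not updated (an assumption)."""
    n, idx = len(y), np.arange(len(y))
    for _ in range(steps):
        h, out = forward(p, X)
        d_out = np.zeros_like(out)
        d_out[idx, t] = 2.0 * (out[idx, t] - y) / n
        d_z = (d_out @ p["V"].T) * (1.0 - h ** 2)
        grads = {"W": X.T @ d_z, "b": d_z.sum(0),
                 "V": h.T @ d_out, "c": d_out.sum(0)}
        grads["W"][:, :frozen] = 0.0
        grads["b"][:frozen] = 0.0
        for k in p:
            p[k] -= lr * grads[k]
    return p

def augment(base, k_aug):
    """Partial initialization: copy the stage-1 weights and append k_aug new
    units whose head weights start at zero, so predictions are unchanged."""
    d = base["W"].shape[0]
    return {"W": np.concatenate([base["W"], rng.normal(0, 0.5, (d, k_aug))], axis=1),
            "b": np.concatenate([base["b"], np.zeros(k_aug)]),
            "V": np.concatenate([base["V"], np.zeros((k_aug, 2))], axis=0),
            "c": base["c"].copy()}

def cate(p, X):
    _, out = forward(p, X)
    return out[:, 1] - out[:, 0]

X_obs, t_obs, y_obs = simulate(5000, randomized=False)  # large, confounded
X_rct, t_rct, y_rct = simulate(200, randomized=True)    # small, unconfounded

base = fit(init(3, 8), X_obs, t_obs, y_obs)              # stage 1: pretraining
tuned = augment(base, 4)                                 # partial initialization
same_at_start = np.allclose(cate(tuned, X_rct), cate(base, X_rct))
fit(tuned, X_rct, t_rct, y_rct, steps=500, frozen=8)     # stage 2: finetuning

true_cate = 1.0 + X_rct[:, 1]
print("mean abs CATE error, stage 1:", np.abs(cate(base, X_rct) - true_cate).mean())
print("mean abs CATE error, stage 2:", np.abs(cate(tuned, X_rct) - true_cate).mean())
```

Starting the new head weights at zero means the finetuned model begins exactly at the pretrained predictor, and the small RCT sample only has to learn a correction. That is one plausible reading of "partial parameter initialization", not a confirmed detail of the paper.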

Original language: English
Title of host publication: KDD 2025 - Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 2113-2123
Number of pages: 11
ISBN (electronic): 9798400712456
Publication status: Published - 20 Jul 2025
Event: 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2025 - Toronto, Canada
Duration: 3 Aug 2025 to 7 Aug 2025

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
1
ISSN (print): 2154-817X

Conference

Conference: 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2025
Country/Territory: Canada
City: Toronto
Period: 3/08/25 to 7/08/25
