TY - JOUR
T1 - Strong and Weak Identifiability of Optimization-based Causal Discovery in Non-linear Additive Noise Models
AU - Li, Mingjia
AU - Qian, Hong
AU - Wang, Tian Zuo
AU - Li, Shujun
AU - Zhang, Min
AU - Zhou, Aimin
N1 - Publisher Copyright:
© 2025 by the author(s).
PY - 2025
Y1 - 2025
N2 - Causal discovery aims to identify causal relationships from observational data. Recently, optimization-based causal discovery methods have attracted extensive attention in the literature due to their efficiency in handling high dimensional problems. However, we observe that optimization-based methods often perform well on certain problems but struggle with others. This paper identifies a specific characteristic of causal structural equations that determines the difficulty of identification in causal discovery and, in turn, the performance of optimization-based methods. We conduct an in-depth study of the additive noise model (ANM) and propose to further divide identifiable problems into strongly and weakly identifiable types based on the difficulty of identification. We also provide a sufficient condition to distinguish the two categories. Inspired by these findings, this paper further proposes GENE, a generic method for addressing strongly and weakly identifiable problems in a unified way under the ANM assumption. GENE adopts an order-based search framework that incorporates conditional independence tests into order fitness evaluation, ensuring effectiveness on weakly identifiable problems. In addition, GENE restricts the dimensionality of the effect variables to ensure scale invariance, a property crucial for practical applications. Experiments demonstrate that GENE is uniquely effective in addressing weakly identifiable problems while also remaining competitive with state-of the- art causal discovery algorithms for stronglyidentifiable problems.
AB - Causal discovery aims to identify causal relationships from observational data. Recently, optimization-based causal discovery methods have attracted extensive attention in the literature due to their efficiency in handling high dimensional problems. However, we observe that optimization-based methods often perform well on certain problems but struggle with others. This paper identifies a specific characteristic of causal structural equations that determines the difficulty of identification in causal discovery and, in turn, the performance of optimization-based methods. We conduct an in-depth study of the additive noise model (ANM) and propose to further divide identifiable problems into strongly and weakly identifiable types based on the difficulty of identification. We also provide a sufficient condition to distinguish the two categories. Inspired by these findings, this paper further proposes GENE, a generic method for addressing strongly and weakly identifiable problems in a unified way under the ANM assumption. GENE adopts an order-based search framework that incorporates conditional independence tests into order fitness evaluation, ensuring effectiveness on weakly identifiable problems. In addition, GENE restricts the dimensionality of the effect variables to ensure scale invariance, a property crucial for practical applications. Experiments demonstrate that GENE is uniquely effective in addressing weakly identifiable problems while also remaining competitive with state-of the- art causal discovery algorithms for stronglyidentifiable problems.
UR - https://www.scopus.com/pages/publications/105023201096
M3 - 会议文章
AN - SCOPUS:105023201096
SN - 2640-3498
VL - 267
SP - 35753
EP - 35768
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 42nd International Conference on Machine Learning, ICML 2025
Y2 - 13 July 2025 through 19 July 2025
ER -