TY - GEN
T1 - A Crystal Knowledge-Enhanced Pre-training Framework for Crystal Property Estimation
AU - Yu, Haomin
AU - Song, Yanru
AU - Hu, Jilin
AU - Guo, Chenjuan
AU - Yang, Bin
AU - Jensen, Christian S.
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024
Y1 - 2024
N2 - The design of new crystalline materials, or simply crystals, with desired properties relies on the ability to estimate the properties of crystals based on their structure. To advance the ability of machine learning (ML) to enable property estimation, we address two key limitations. First, creating labeled data for training entails time-consuming laboratory experiments and physical simulations, yielding a shortage of such data. To reduce the need for labeled training data, we propose a pre-training framework that adopts a mutually exclusive mask strategy, enabling models to discern underlying patterns. Second, crystal structures obey physical principles. To exploit the principle of periodic invariance, we propose multi-graph attention (MGA) and crystal knowledge-enhanced (CKE) modules. The MGA module considers different types of multi-graph edges to capture complex structural patterns. The CKE module incorporates periodic attribute learning and atom-type contrastive learning by explicitly introducing crystal knowledge to enhance crystal representation learning. We integrate these modules in a CRystal knOwledge-enhanced Pre-training (CROP) framework. Experiments on eight different datasets show that CROP achieves promising estimation performance and outperforms strong baselines.
AB - The design of new crystalline materials, or simply crystals, with desired properties relies on the ability to estimate the properties of crystals based on their structure. To advance the ability of machine learning (ML) to enable property estimation, we address two key limitations. First, creating labeled data for training entails time-consuming laboratory experiments and physical simulations, yielding a shortage of such data. To reduce the need for labeled training data, we propose a pre-training framework that adopts a mutually exclusive mask strategy, enabling models to discern underlying patterns. Second, crystal structures obey physical principles. To exploit the principle of periodic invariance, we propose multi-graph attention (MGA) and crystal knowledge-enhanced (CKE) modules. The MGA module considers different types of multi-graph edges to capture complex structural patterns. The CKE module incorporates periodic attribute learning and atom-type contrastive learning by explicitly introducing crystal knowledge to enhance crystal representation learning. We integrate these modules in a CRystal knOwledge-enhanced Pre-training (CROP) framework. Experiments on eight different datasets show that CROP achieves promising estimation performance and outperforms strong baselines.
KW - Crystal property
KW - Knowledge-enhanced
KW - Pre-training
UR - https://www.scopus.com/pages/publications/85203870434
U2 - 10.1007/978-3-031-70381-2_15
DO - 10.1007/978-3-031-70381-2_15
M3 - Conference contribution
AN - SCOPUS:85203870434
SN - 9783031703805
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 231
EP - 246
BT - Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track - European Conference, ECML PKDD 2024, Proceedings
A2 - Bifet, Albert
A2 - Krilavičius, Tomas
A2 - Miliou, Ioanna
A2 - Nowaczyk, Slawomir
PB - Springer Science and Business Media Deutschland GmbH
T2 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2024
Y2 - 9 September 2024 through 13 September 2024
ER -