A Crystal Knowledge-Enhanced Pre-training Framework for Crystal Property Estimation

Haomin Yu, Yanru Song, Jilin Hu, Chenjuan Guo, Bin Yang, Christian S. Jensen

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

The design of new crystalline materials, or simply crystals, with desired properties relies on the ability to estimate the properties of crystals based on their structure. To advance the ability of machine learning (ML) to enable property estimation, we address two key limitations. First, creating labeled data for training entails time-consuming laboratory experiments and physical simulations, yielding a shortage of such data. To reduce the need for labeled training data, we propose a pre-training framework that adopts a mutually exclusive mask strategy, enabling models to discern underlying patterns. Second, crystal structures obey physical principles. To exploit the principle of periodic invariance, we propose multi-graph attention (MGA) and crystal knowledge-enhanced (CKE) modules. The MGA module considers different types of multi-graph edges to capture complex structural patterns. The CKE module incorporates periodic attribute learning and atom-type contrastive learning by explicitly introducing crystal knowledge to enhance crystal representation learning. We integrate these modules in a CRystal knOwledge-enhanced Pre-training (CROP) framework. Experiments on eight different datasets show that CROP is capable of promising estimation performance and can outperform strong baselines.
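The abstract does not spell out how the mutually exclusive mask strategy works. As a minimal sketch, assuming it means that two masked views of the same crystal hide disjoint sets of atoms, the idea might look like the following. The function names, the `<mask>` token, and the `mask_ratio` parameter are hypothetical illustrations, not taken from the paper.

```python
import random

MASK = "<mask>"  # hypothetical placeholder token, not from the paper


def mutually_exclusive_masks(num_atoms, mask_ratio=0.25, seed=None):
    """Return two disjoint sets of atom indices to mask.

    Assumption: "mutually exclusive" means no atom is masked in both views.
    """
    if not 0.0 <= mask_ratio <= 0.5:
        raise ValueError("mask_ratio must be in [0, 0.5] so the two masks can be disjoint")
    rng = random.Random(seed)
    order = list(range(num_atoms))
    rng.shuffle(order)
    k = int(num_atoms * mask_ratio)
    # Two consecutive slices of one shuffled permutation cannot share an index.
    return set(order[:k]), set(order[k:2 * k])


def apply_mask(atom_types, masked):
    """Replace the atom types at the masked indices with the MASK token."""
    return [MASK if i in masked else a for i, a in enumerate(atom_types)]


# Toy example: eight atom sites of a rock-salt-like cell.
atoms = ["Na", "Cl", "Na", "Cl", "Na", "Cl", "Na", "Cl"]
mask_a, mask_b = mutually_exclusive_masks(len(atoms), mask_ratio=0.25, seed=0)
view_a = apply_mask(atoms, mask_a)
view_b = apply_mask(atoms, mask_b)
```

Under this reading, every atom stays visible in at least one of the two views, so a model pre-trained to recover the masked entries of one view could in principle draw on information that the other view keeps intact.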

Original language: English
Title of host publication: Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track - European Conference, ECML PKDD 2024, Proceedings
Editors: Albert Bifet, Tomas Krilavičius, Ioanna Miliou, Slawomir Nowaczyk
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 231-246
Number of pages: 16
ISBN (Print): 9783031703805
State: Published - 2024
Event: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2024 - Vilnius, Lithuania
Duration: 9 Sep 2024 to 13 Sep 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14950 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2024
Country/Territory: Lithuania
City: Vilnius
Period: 9/09/24 to 13/09/24

Keywords

  • Crystal property
  • Knowledge-enhanced
  • Pre-training
