
Ens-Chemage: Robust Molecular Image-Based Ensemble Transfer Learning Framework for Small Contaminant Property Data Sets

  • East China Normal University

Research output: Contribution to journal, Article, peer-reviewed

Abstract

Contaminant property data sets are typically small, posing challenges for developing accurate deep learning (DL) models. In this study, we pretrained ResNet18 models on the PubChem data set (∼10 million molecules) using molecular RGB images as inputs and their MACCS fingerprints as labels, generating five models (Chemage1 to Chemage5) with varying pretraining accuracies, and fine-tuned them on 10 MoleculeNet and 12 contaminant property data sets. We found that appropriate model architectures and fine-tuning techniques significantly improve transfer learning efficacy. We then developed an ensemble model, Ens-Chemage, to combine the strengths of these individual models. Ens-Chemage outperformed conventional machine learning (ML) models and ImageMol on almost all tested data sets. Through model interpretation, we found that Ens-Chemage learned more accurate and distinct features than the other models. Additionally, we defined its applicability domain (AD) using an uncertainty-based approach. Finally, Ens-Chemage has been deployed for free public use at https://ens-chemage.streamlit.app/. This study presents significant advancements in the application of DL to small contaminant property data sets.
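The abstract does not say how the five fine-tuned models are combined or how the uncertainty behind the applicability domain is computed. As a rough illustration only, the sketch below assumes the simplest possible scheme: average the member predictions and treat their standard deviation as the uncertainty, flagging a molecule as inside the AD when that spread is below a cutoff. The function name `ensemble_predict`, the `max_std` cutoff, and the sample numbers are all hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def ensemble_predict(member_preds, max_std):
    """Combine per-model predictions by simple averaging (an assumed scheme).

    member_preds: one list of predictions per model, all of equal length.
    max_std: hypothetical uncertainty cutoff for the applicability domain.
    Returns one (mean, std, in_domain) tuple per molecule.
    """
    results = []
    # zip(*...) regroups the values so each tuple holds all models'
    # predictions for a single molecule.
    for per_molecule in zip(*member_preds):
        avg = mean(per_molecule)
        spread = pstdev(per_molecule)  # disagreement among ensemble members
        results.append((avg, spread, spread <= max_std))
    return results


# Made-up predictions from five models (rows) for two molecules (columns).
preds = [
    [0.90, 0.20],
    [0.88, 0.70],
    [0.92, 0.10],
    [0.91, 0.55],
    [0.89, 0.35],
]

results = ensemble_predict(preds, max_std=0.1)
for avg, spread, in_domain in results:
    print(f"mean={avg:.3f} std={spread:.3f} in_domain={in_domain}")
```

In this toy input the five models agree closely on the first molecule and disagree on the second, so only the first would fall inside the assumed applicability domain.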

Original language: English
Pages (from-to): 1200-1206
Number of pages: 7
Journal: Environmental Science and Technology Letters
Volume: 11
Issue number: 11
Publication status: Published - 12 Nov 2024
