Deep Learning for Bidirectional Translation between Molecular Structures and Vibrational Spectra

Tianqing Hu, Zihan Zou, Bo Li, Tong Zhu, Shaonan Gu*, Jun Jiang*, Yi Luo*, Wei Hu*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

1 Scopus citation

Abstract

Two deep learning models, TranSpec and SpecGNN, were developed to establish a bidirectional mapping between molecular vibrational spectra and simplified molecular input line entry system (SMILES) representations, akin to a “translation” between the language of spectra and the language of molecular structures. Initially, TranSpec achieved accuracy rates of 55% and 63% on quantum chemistry (QC)-calculated IR and Raman spectral data sets, respectively, but its accuracy dropped to 11% on the NIST experimental IR data set. To close this gap, we combined IR and Raman spectra as input; augmented the data set; employed model fusion, transfer learning, and multisource learning; applied molecular mass filtering; and leveraged SpecGNN for spectral simulation and candidate reordering. These improvements boosted TranSpec’s accuracy to 53.6% on the experimental IR data set. Notably, SpecGNN outperformed traditional QC methods in both spectral accuracy and computational efficiency. Finally, we demonstrated TranSpec’s ability to recognize functional groups and to distinguish isomers and homologues. Together, the TranSpec and SpecGNN models provide an efficient and accurate AI-driven framework for interpreting molecular structures and spectra, advancing applications in spectroscopy and cheminformatics.
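The molecular mass filtering and candidate reordering steps mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Candidate` class, the `filter_and_rerank` function, the mass tolerance, and the use of cosine similarity as the ranking score are all assumptions made for illustration, and the simulated spectra (which in the paper would come from SpecGNN) are stand-in numbers.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    """Hypothetical container for one predicted structure."""
    smiles: str
    mass: float            # molecular mass in Da, assumed precomputed
    spectrum: List[float]  # simulated spectrum on the same grid as the query


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_and_rerank(candidates: List[Candidate],
                      query_spectrum: List[float],
                      target_mass: float,
                      mass_tol: float = 0.5) -> List[Candidate]:
    """Drop candidates whose mass is outside the tolerance (assumed value),
    then sort the rest by similarity to the query spectrum, best first."""
    kept = [c for c in candidates if abs(c.mass - target_mass) <= mass_tol]
    return sorted(kept,
                  key=lambda c: cosine_similarity(c.spectrum, query_spectrum),
                  reverse=True)


if __name__ == "__main__":
    # Toy three-point "spectra"; real spectra would have many more points.
    candidates = [
        Candidate("COC", 46.07, [0.9, 0.1, 0.1]),   # dimethyl ether
        Candidate("CCO", 46.07, [0.1, 0.8, 0.3]),   # ethanol
        Candidate("CCCO", 60.10, [0.1, 0.9, 0.2]),  # 1-propanol, wrong mass
    ]
    ranked = filter_and_rerank(candidates, [0.1, 0.9, 0.2], target_mass=46.07)
    for c in ranked:
        print(c.smiles)
```

In this toy case the mass filter removes 1-propanol even though its spectrum matches the query exactly, and the two remaining isomers are then ordered by spectral similarity. That is the general motivation for using the two steps together: mass narrows the candidate set, and the simulated spectra separate structures that mass alone cannot.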

Original language: English
Pages (from-to): 27525-27536
Number of pages: 12
Journal: Journal of the American Chemical Society
Volume: 147
Issue number: 31
State: Published - 6 Aug 2025
