Abstract
The development of robust machine learning models to assist the prediction and optimization of homogeneously catalyzed reactions has attracted wide interest. In this work, we propose a workflow to estimate the linear-to-branched ratio of the products in hydroformylation reactions using a stacking ensemble method that integrates the Random Forest, eXtreme Gradient Boosting, and Light Gradient Boosting Machine algorithms and leverages physicochemically significant features from small-batch experimental data. The stacking model achieves superior performance, with R² and root mean square error (RMSE) values of 0.918 and 0.078, respectively. Moreover, SHapley Additive exPlanations (SHAP) analysis and density functional theory calculations reveal that the gap between the highest occupied molecular orbital and the lowest unoccupied molecular orbital of the alkene has a significant impact on the regioselectivity of hydroformylation reactions: larger gap values tend to result in a higher proportion of linear products. This study illustrates that combining physicochemically significant features with interpretable ensemble models can serve as a useful strategy for predicting regioselectivity in homogeneously catalyzed reactions.
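For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how R² and RMSE are conventionally computed from observed and predicted values. The function names and the sample numbers are illustrative assumptions, not data or code from the paper.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical observed and predicted values (NOT taken from the paper).
y_true = [0.2, 0.4, 0.6, 0.8]
y_pred = [0.25, 0.35, 0.65, 0.75]

r2_value = r2_score(y_true, y_pred)    # approximately 0.95
rmse_value = rmse(y_true, y_pred)      # approximately 0.05
```

An R² close to 1 means the predictions explain most of the variance in the observed values, while RMSE reports the typical prediction error in the same units as the target.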
| Original language | English |
|---|---|
| Pages (from-to) | 61-74 |
| Number of pages | 14 |
| Journal | Pure and Applied Chemistry |
| Volume | 98 |
| Issue | 1 |
| DOI | |
| Publication status | Published - 1 Jan 2026 |
Title: Machine learning predictions for regioselectivity of hydroformylation reactions: leveraging limited data for high-precision results