TY - JOUR
T1 - Using Machine Learning and GPT Models To Enhance Electrochemical Pretreatment of Anaerobic Cofermentation
T2 - Prediction, Early Warning, and Biomarker Identification
AU - Jiang, Jinqi
AU - Lin, Qingshan
AU - Guan, Xiaohong
AU - Zhou, Shuai
AU - Zhong, Shifa
AU - Xiang, Xiang
AU - Wang, Zongping
AU - Chen, Guanghao
AU - Guo, Gang
N1 - Publisher Copyright: © 2025 American Chemical Society.
PY - 2025/5/9
Y1 - 2025/5/9
N2 - Electrochemically enhanced anaerobic cofermentation of waste activated sludge and food waste to produce volatile fatty acids (VFAs) represents an innovative and promising approach. Despite its potential, optimizing system performance, providing early warnings, and identifying biomarkers remain challenging because of the intricate interplay of numerous environmental variables and the unclear dynamics of microbial interactions. This study first employed machine learning (ML) models, including XGBoost, random forest (RF), support vector regression (SVR), and CatBoost, to forecast VFA production by integrating initial feedstock properties, electrochemical pretreatment conditions, and fermentation parameters. CatBoost achieved the highest R² (0.977) and the lowest root-mean-square error (RMSE; 95.69 mg COD/L). Key environmental factors were identified together with their optimal ranges for VFA production: fermentation days (VFA production reaching 90% by day 5), salinity (0.5-1.0 g/L), and the carbon-to-nitrogen (C/N) ratio (16.53-22). To enhance long-term monitoring and facilitate early warning systems, process indicators (pH, ORP, PNs, SCOD, and PSs) from the previous day were used to predict VFA production on the following day by fine-tuning generative pretrained transformer (GPT) models, with the gpt-3.5-turbo-0125 model exhibiting the highest R² of 0.837 ± 0.004 and the lowest RMSE of 296.98 ± 3.65 mg COD/L. Local sensitivity analysis revealed that SCOD was the most important process factor affecting VFA production. Moreover, this study employed ML models to uncover microbial biomarkers at the genus level, including Prevotella_7, Veillonella, Megasphaera, and Lactobacillus, thereby elucidating the nexus among environmental factors, microbial communities, and VFA production. This study offers a novel modeling workflow for anaerobic cofermentation, enabling process optimization and mechanism exploration with the assistance of ML and large language models.
KW - anaerobic cofermentation
KW - food waste
KW - generative pretrain transformer
KW - machine learning
KW - waste activated sludge
UR - https://www.scopus.com/pages/publications/85216629946
U2 - 10.1021/acsestengg.4c00830
DO - 10.1021/acsestengg.4c00830
M3 - Article
AN - SCOPUS:85216629946
SN - 2690-0645
VL - 5
SP - 1149
EP - 1159
JO - ACS ES and T Engineering
JF - ACS ES and T Engineering
IS - 5
ER -