TY - JOUR
T1 - Concentration division for adsorption coefficient prediction using machine learning with Abraham descriptors
T2 - Data-splitting approach comparison and critical factors identification
AU - Qi, Zhenguo
AU - Zhong, Shifa
AU - Huang, Xin
AU - Xu, Yucui
AU - Zhang, Haoze
AU - Shi, Baoyou
N1 - Publisher Copyright: © 2024 Elsevier Ltd
PY - 2024/11
Y1 - 2024/11
N2 - Machine learning (ML) with Abraham descriptors from polyparameter linear free energy relationships (pp-LFERs) has become a popular method for predicting the adsorption coefficient (Kd). However, Abraham descriptors from pp-LFERs are concentration-dependent, and their significance can change across adsorbate concentrations. Ignoring concentration effects on the adsorption process and on Kd prediction hinders the understanding of interactions among solutes, solvents and adsorbents at different equilibrium concentration (Ce) ranges. Therefore, our study first systematically investigated the concentration effects on micropollutant adsorption to carbon-based adsorbents using ML with Abraham descriptors. A concentration-selection approach, introduced as a new data-splitting approach, divided the whole dataset according to Ce range. This concentration-selection approach performed better than the data-splitting approach used in previous studies. After ML models were built on the different Ce subsets, Shapley values were calculated to quantify the contributions of the input descriptors. The results indicated that specific surface area (BET) was the only critical factor when Ce was in the highest range. The importance of Abraham descriptors increased gradually as Ce decreased. Total pore volume (Vt) was a far less important feature than BET for Kd prediction. The critical factors identified at different Ce ranges for Kd prediction provide guidance for the design of novel carbon-based adsorbents.
AB - Machine learning (ML) with Abraham descriptors from polyparameter linear free energy relationships (pp-LFERs) has become a popular method for predicting the adsorption coefficient (Kd). However, Abraham descriptors from pp-LFERs are concentration-dependent, and their significance can change across adsorbate concentrations. Ignoring concentration effects on the adsorption process and on Kd prediction hinders the understanding of interactions among solutes, solvents and adsorbents at different equilibrium concentration (Ce) ranges. Therefore, our study first systematically investigated the concentration effects on micropollutant adsorption to carbon-based adsorbents using ML with Abraham descriptors. A concentration-selection approach, introduced as a new data-splitting approach, divided the whole dataset according to Ce range. This concentration-selection approach performed better than the data-splitting approach used in previous studies. After ML models were built on the different Ce subsets, Shapley values were calculated to quantify the contributions of the input descriptors. The results indicated that specific surface area (BET) was the only critical factor when Ce was in the highest range. The importance of Abraham descriptors increased gradually as Ce decreased. Total pore volume (Vt) was a far less important feature than BET for Kd prediction. The critical factors identified at different Ce ranges for Kd prediction provide guidance for the design of novel carbon-based adsorbents.
KW - Abraham descriptors
KW - Adsorption coefficient
KW - Carbon-based adsorbent design
KW - Equilibrium concentration
KW - Machine learning
UR - https://www.scopus.com/pages/publications/85202516054
U2 - 10.1016/j.carbon.2024.119573
DO - 10.1016/j.carbon.2024.119573
M3 - Article
AN - SCOPUS:85202516054
SN - 0008-6223
VL - 230
JO - Carbon
JF - Carbon
M1 - 119573
ER -