TY - JOUR
T1 - Current applications and future impact of machine learning in emerging contaminants
T2 - A review
AU - Lei, Lang
AU - Pang, Ruirui
AU - Han, Zhibang
AU - Wu, Dong
AU - Xie, Bing
AU - Su, Yinglong
N1 - Publisher Copyright: © 2023 Taylor & Francis Group, LLC.
PY - 2023
Y1 - 2023
N2 - With their continuous release into the environment, emerging contaminants (ECs) have attracted widespread attention for their potential risks, and numerous studies have been conducted on their identification, environmental behavior, bioeffects, and removal. Owing to its strength in dealing with high-dimensional and unstructured data, a new data-driven approach, machine learning (ML), has been gradually applied in the research of ECs. This review described the fundamental principle, algorithms, and workflow of ML, and summarized advances of ML applications for typical ECs (per- and polyfluoroalkyl substances, nanoparticles, antibiotic resistance genes, endocrine-disrupting chemicals, microplastics, antibiotics, and pharmaceutical and personal care products). ML methods showed practicability, reliability, and effectiveness in predicting or analyzing the occurrence, distribution, bioeffects, and removal of ECs, and various algorithms and derived models were developed and optimized to obtain better performance. Moreover, the size and homogeneity of the data set strongly influence the application of ML, and choosing the appropriate ML models with different characteristics is crucial for addressing specific problems related to the data sets. Future efforts should focus on improving the quality of data sets and adopting more advanced algorithms, developing the potential of quantitative structure-activity relationships, and promoting the applicability domains and interpretability of models. In addition, the development of codeless ML tools will benefit the accessibility of ML models.
KW - Bioeffects
KW - Frederic Coulon and Lena Q. Ma
KW - emerging contaminants
KW - environmental behavior
KW - identification
KW - machine learning
KW - removal technologies
UR - https://www.scopus.com/pages/publications/85150945391
U2 - 10.1080/10643389.2023.2190313
DO - 10.1080/10643389.2023.2190313
M3 - Review article
AN - SCOPUS:85150945391
SN - 1064-3389
VL - 53
SP - 1817
EP - 1835
JO - Critical Reviews in Environmental Science and Technology
JF - Critical Reviews in Environmental Science and Technology
IS - 20
ER -