Prediction of ammonia and total nitrogen in large freshwater lake watershed based on small sample data and analysis of their spatiotemporal variation and driving mechanism

  • Chengming Luo
  • Xihua Wang*
  • Y. Jun Xu
  • Shunqing Jia
  • Zejun Liu
  • Boyang Mao
  • Qinya Lv
  • Xuming Ji
  • Yanxin Rong
  • Yan Dai

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Ammonia nitrogen (NH₃-N) and total nitrogen (TN) pollution pose serious threats to freshwater lake ecosystems, making accurate prediction essential for watershed management. However, scarce data of variable quality limit the performance of existing prediction models. This study proposed an integrated prediction framework that combines sample enhancement, adaptive feature selection, and multiple machine learning methods to improve NH₃-N and TN prediction in the Poyang Lake watershed. A Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP) was used to generate high-quality virtual samples and thereby enlarge the available data. Recursive Feature Elimination (RFE) was then applied to identify key variables and remove redundant ones, improving model efficiency. Four models, Long Short-Term Memory (LSTM), Bidirectional LSTM (Bi-LSTM), Gated Recurrent Unit (GRU), and Extreme Learning Machine, were built and compared. In addition, Spearman correlation analysis and principal component analysis were used to identify the main sources of TN and NH₃-N pollution. Results showed clear spatiotemporal heterogeneity in NH₃-N and TN levels, with the Fuhe River sub-basin being the most polluted. Agricultural runoff, domestic sewage, and industrial discharge were identified as key pollution sources. WGAN-GP and RFE significantly improved model performance: the R² of the best prediction model for TN (GRU) rose from 0.515 to 0.709, and that of the best prediction model for NH₃-N (Bi-LSTM) rose from 0.369 to 0.909. The deep learning models demonstrated similar predictive capabilities and could be integrated to enhance accuracy and stability. This study offers an effective, data-efficient approach for water quality prediction under small-sample conditions and provides scientific guidance for watershed environmental management.
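
The abstract reports model skill as R² and uses Spearman correlation for source analysis. As a reading aid, the sketch below shows the standard definitions of both quantities in plain Python. It is not the authors' code: the function names `r_squared`, `average_ranks`, and `spearman` are hypothetical, and the abstract does not state how ties or missing values were handled, so average ranks for ties are an assumption here.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def average_ranks(values):
    """Rank values from 1 to n; tied values share their average rank (assumed convention)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

Under these definitions, a model that always predicts the observed mean scores R² = 0 and a perfect model scores R² = 1, which gives a scale for the reported values.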

Original language: English
Article number: 107887
Journal: Process Safety and Environmental Protection
Volume: 203
State: Published - Nov 2025
Externally published: Yes

Keywords

  • Adaptive feature selection
  • Ammonia nitrogen
  • Multiple machine learning
  • Poyang Lake watershed
  • Sample enhancement
  • Total nitrogen
