Knowledge embedding and interpretable machine learning optimize comprehensive benefits for water treatment


Bibliographic Details
Main Authors: Yu-Qi Wang, Wenchong Tian, Hao-Lin Yang, Yun-Peng Song, Jia-Ji Chen, Qiong-Ying Xu, Wan-Xin Yin, Le-Qi Ding, Xi-Qi Li, Han-Tao Wang, Ai-Jie Wang, Hong-Cheng Wang
Format: Article
Language: English
Published: Nature Portfolio 2025-08-01
Series: npj Clean Water
Online Access: https://doi.org/10.1038/s41545-025-00510-1
collection DOAJ
description Abstract: Perikinetic and orthokinetic flocculation are the first steps in a drinking water treatment plant (DWTP) and affect all subsequent processes. Leveraging multi-stage water quality parameters, we developed a machine learning (ML) framework for coagulation control that incorporates knowledge embedding (KE) through hyper-parametric constraints on threshold water quality, energy consumption, and economic costs. Random forest (RF) performed best among the eight methods compared, with a percentage error of 2.53% and a coefficient of determination of 0.9922. The interpretability analysis shows that the model accurately identifies coagulation demand and balances removal performance against energy consumption and economic cost. In experimental validation and simulation extrapolation, the RF-KE model can reduce turbidity by 16.36% and dosing cost by 9.64%. Through KE and interpretability analyses, the framework reduces economic costs while optimizing water quality, providing evidence for the safe and reliable application of future models.
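The abstract reports a percentage error and a coefficient of determination without defining them in this record. As a minimal sketch, assuming the percentage error is a mean absolute percentage error and the coefficient of determination is the usual 1 - SS_res/SS_tot, the two metrics could be computed as follows. The function names and the dose values are hypothetical illustrations, not the authors' code or data.

```python
from statistics import fmean


def mean_absolute_percentage_error(observed, predicted):
    """Mean of |observed - predicted| / |observed|, in percent.

    Assumes no observed value is zero (division by zero otherwise).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return 100.0 * fmean(abs(o - p) / abs(o) for o, p in zip(observed, predicted))


def coefficient_of_determination(observed, predicted):
    """R^2 = 1 - SS_res / SS_tot, with SS_tot taken around the observed mean."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical coagulant doses (mg/L); NOT data from the study.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 19.0, 30.0, 42.0]

mape = mean_absolute_percentage_error(observed, predicted)
r2 = coefficient_of_determination(observed, predicted)
print(f"percentage error: {mape:.2f}%")
print(f"coefficient of determination: {r2:.4f}")
```

Under these definitions, a lower percentage error and a coefficient of determination closer to 1 both indicate a closer fit; whether the article uses exactly these definitions is not stated in this record.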
id doaj-art-d771f362f6a84cf1b86836aae486fe6c
institution Kabale University
issn 2059-7037
spelling doaj-art-d771f362f6a84cf1b86836aae486fe6c, 2025-08-24T11:05:45Z
npj Clean Water, vol. 8, no. 1, pp. 1-10
Author affiliations:
Yu-Qi Wang, Hao-Lin Yang, Yun-Peng Song, Jia-Ji Chen, Qiong-Ying Xu, Wan-Xin Yin, Le-Qi Ding, Xi-Qi Li, Ai-Jie Wang, Hong-Cheng Wang: State Key Laboratory of Urban-rural Water Resource and Environment, School of Eco-Environment, Harbin Institute of Technology
Wenchong Tian: School of Energy and Environment, City University of Hong Kong
Han-Tao Wang: PowerChina Eco-environmental Group Co., Ltd