Interpretation of COVID-19 Epidemiological Trends in Mexico Through Wastewater Surveillance Using Simple Machine Learning Algorithms for Rapid Decision-Making

Bibliographic Details
Main Authors: Arnoldo Armenta-Castro, Orlando de la Rosa, Alberto Aguayo-Acosta, Mariel Araceli Oyervides-Muñoz, Antonio Flores-Tlacuahuac, Roberto Parra-Saldívar, Juan Eduardo Sosa-Hernández
Format: Article
Language: English
Published: MDPI AG 2025-01-01
Series: Viruses
Online Access: https://www.mdpi.com/1999-4915/17/1/109
Description
Summary: Detection and quantification of disease-related biomarkers in wastewater samples, known as Wastewater-based Surveillance (WBS), has proven a valuable strategy for studying the prevalence of infectious diseases within populations in a time- and resource-efficient manner, as wastewater samples are representative of all cases within the catchment area, whether they are clinically reported or not. However, analysis and interpretation of WBS datasets for decision-making during public health emergencies, such as the COVID-19 pandemic, remain an area of opportunity. In this article, a database obtained from wastewater sampling at wastewater treatment plants (WWTPs) and university campuses in Monterrey and Mexico City between 2021 and 2022 was used to train simple clustering- and regression-based risk assessment models to allow for informed prevention and control measures in high-affluence facilities, even when working with low-dimensionality datasets and a limited number of observations. When weekly data points were divided based on whether the seven-day average of daily new COVID-19 cases was above a certain threshold, the resulting clustering model could differentiate between weeks with surges in clinical reports and the periods between them with 87.9% accuracy. Moreover, the clustering model provided satisfactory forecasts one week (80.4% accuracy) and two weeks (81.8% accuracy) into the future. However, the prediction of the weekly average of new daily cases was limited (R² = 0.80, MAPE = 72.6%), likely because of insufficient dimensionality in the database. Overall, while simple, WBS-supported models can provide relevant insights for decision-makers during epidemiological outbreaks, regression algorithms for prediction using low-dimensionality datasets can still be improved.
ISSN: 1999-4915
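Note on the metrics named in the summary: the sketch below shows, in plain Python, how weeks might be labeled against a case threshold and how accuracy, R², and MAPE are conventionally computed. It is not the authors' code. The function names, the threshold of 100, and all numbers are made up for illustration and are not taken from the article's dataset. On these invented values R² comes out high while MAPE is large, because MAPE weights each error relative to the observed value and so penalizes misses on low-case weeks heavily. This is only a general property of the two metrics and is not offered as an explanation of the article's results.

```python
from statistics import mean


def label_weeks(avg_daily_cases, threshold):
    """Return 1 for weeks whose seven-day average of new daily cases
    exceeds the threshold, else 0 (hypothetical helper)."""
    return [1 if c > threshold else 0 for c in avg_daily_cases]


def accuracy(y_true, y_pred):
    """Fraction of weeks whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    m = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (requires nonzero y_true)."""
    return 100 * mean(abs((t - p) / t) for t, p in zip(y_true, y_pred))


# Invented weekly averages of new daily cases (NOT data from the article).
observed = [10, 20, 400, 800, 30]
predicted = [25, 45, 380, 700, 60]

# Assumed threshold separating surge weeks from non-surge weeks.
THRESHOLD = 100

true_labels = label_weeks(observed, THRESHOLD)
pred_labels = label_weeks(predicted, THRESHOLD)

print("label accuracy:", accuracy(true_labels, pred_labels))
print("R^2:", round(r_squared(observed, predicted), 3))
print("MAPE (%):", round(mape(observed, predicted), 1))
```

In this toy case the surge/non-surge labels are all recovered even though the percentage error on the case counts is large, which mirrors in spirit the summary's contrast between classification and regression performance.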