iMESc – an interactive machine learning app for environmental sciences

As environmental sciences increasingly rely on complex datasets, machine learning (ML) has become crucial for identifying patterns and relationships. However, the integration of ML into workflows can pose challenges due to technical barriers or the time-intensive nature of coding. To address these i...

Bibliographic Details
Main Authors: Danilo Cândido Vieira, Fabiana S. Paula, Luciana Erika Yaginuma, Gustavo Fonseca
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-01-01
Series: Frontiers in Environmental Science
Subjects: shiny, machine-learning, supervised, unsupervised, environmental sciences, analytical workflow
Online Access: https://www.frontiersin.org/articles/10.3389/fenvs.2025.1533292/full
author Danilo Cândido Vieira
Fabiana S. Paula
Luciana Erika Yaginuma
Gustavo Fonseca
collection DOAJ
description As environmental sciences increasingly rely on complex datasets, machine learning (ML) has become crucial for identifying patterns and relationships. However, integrating ML into workflows can be challenging because of technical barriers and the time-intensive nature of coding. To address these issues, we developed iMESc, an interactive ML app designed to streamline and simplify ML workflows for environmental data. Developed in R and built on the Shiny platform, iMESc integrates supervised and unsupervised ML methods with tools for data preprocessing, visualization, descriptive statistics, and spatial analysis. The Datalist system ensures seamless transitions between analytical workflows, while the “savepoints” feature enhances reproducibility by preserving the analysis state. We demonstrate iMESc’s flexibility with four workflows applied to a case study predicting nematode community structure from environmental data. The classical statistical approaches, Redundancy Analysis (RDA) and Piecewise RDA (pwRDA), explained 30.7% and 53% of the variance, respectively. The SuperSOM model achieved an R² of 0.60 for training and 0.291 for testing, identifying spatial patterns across depth zones. Finally, a hybrid model combining an unsupervised SOM followed by a supervised Random Forest model returned an accuracy of 83.47% for training and 80.77% for testing, with Bathymetry, Chlorophyll, and Coarse Sand as key predictive variables. iMESc permits the customization of plots and the saving of workflows as “savepoints”, ensuring reproducibility. iMESc bridges the gap between the complexity of machine learning algorithms and the need for user-friendly interfaces in environmental research. By reducing the technical burden of coding, iMESc allows researchers to focus on scientific inquiry, improving both the efficiency and depth of their analyses.
format Article
id doaj-art-44e1cc5e9b9e44fd9459fb1bcc25d3bf
institution Kabale University
issn 2296-665X
language English
publishDate 2025-01-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Environmental Science
spelling doaj-art-44e1cc5e9b9e44fd9459fb1bcc25d3bf
2025-01-31T06:39:55Z
eng
Frontiers Media S.A.
Frontiers in Environmental Science
2296-665X
2025-01-01
13
10.3389/fenvs.2025.1533292
1533292
iMESc – an interactive machine learning app for environmental sciences
Danilo Cândido Vieira (Instituto do Mar, Campus Baixada Santista, Universidade Federal de São Paulo, Santos, Brazil; Instituto Oceanográfico, Universidade de São Paulo, São Paulo, Brazil)
Fabiana S. Paula (Instituto do Mar, Campus Baixada Santista, Universidade Federal de São Paulo, Santos, Brazil)
Luciana Erika Yaginuma (Instituto Oceanográfico, Universidade de São Paulo, São Paulo, Brazil)
Gustavo Fonseca (Instituto do Mar, Campus Baixada Santista, Universidade Federal de São Paulo, Santos, Brazil)
title iMESc – an interactive machine learning app for environmental sciences
topic shiny
machine-learning
supervised
unsupervised
environmental sciences
analytical workflow
url https://www.frontiersin.org/articles/10.3389/fenvs.2025.1533292/full