Enabling data‐driven collaborative and reproducible environmental synthesis science

Bibliographic Details
Main Authors: Julien Brun, Nicholas J. Lyon, Angel Chen, Ingrid Slette, Gabriel De La Rosa, Jennifer E. Caselle, Frank W. Davis, Martha R. Downs
Format: Article
Language: English
Published: Wiley 2025-06-01
Series: Methods in Ecology and Evolution
Online Access: https://doi.org/10.1111/2041-210X.70036
collection DOAJ
description Abstract This manuscript shares the lessons learned from providing scientific computing support to over 600 researchers and discipline experts, helping them develop reproducible and scalable analytical workflows to process large amounts of heterogeneous data. When providing scientific computing support, focus is first placed on how to foster the collaborative aspects of multidisciplinary projects on the technological side by providing virtual spaces to communicate and share documents. Then insights on data management planning and how to implement a centralized data management workflow for data‐driven projects are provided. Developing reproducible workflows requires the development of code. We describe tools and practices that have been successful in fostering collaborative coding and scaling on remote servers, enabling teams to iterate more efficiently. We have found short training sessions combined with on‐demand specialized support to be the most impactful combination in helping scientists develop their technical skills. Here we share our experiences in enabling researchers to do science more collaboratively and more reproducibly beyond any specific project, with long‐lasting effects on the way researchers conduct science. We hope that other groups supporting team‐ and data‐driven science (in environmental science and beyond) will benefit from the lessons we have learned over the years through trial and error.
id doaj-art-4e9998fc79c042f99b615bbe7d4e6a59
institution Kabale University
issn 2041-210X
spelling doaj-art-4e9998fc79c042f99b615bbe7d4e6a59 2025-08-20T03:50:02Z
Citation: Methods in Ecology and Evolution (Wiley, ISSN 2041-210X), 2025-06-01, vol. 16, no. 6, pp. 1061-1074, https://doi.org/10.1111/2041-210X.70036
Author affiliations:
Julien Brun: Research Data Services, Library, University of California Santa Barbara, Santa Barbara, California, USA
Nicholas J. Lyon: National Center for Ecological Analysis and Synthesis, University of California, Santa Barbara, California, USA
Angel Chen: National Center for Ecological Analysis and Synthesis, University of California, Santa Barbara, California, USA
Ingrid Slette: Department of Ecology, Evolution, and Behavior, University of Minnesota, St Paul, Minnesota, USA
Gabriel De La Rosa: National Center for Ecological Analysis and Synthesis, University of California, Santa Barbara, California, USA
Jennifer E. Caselle: Marine Science Institute, University of California, Santa Barbara, California, USA
Frank W. Davis: Bren School of Environmental Science and Management, University of California, Santa Barbara, California, USA
Martha R. Downs: National Center for Ecological Analysis and Synthesis, University of California, Santa Barbara, California, USA
topic collaborative science
data management
environmental data science
reproducible science
synthesis science