Compositional data analysis enables statistical rigor in comparative glycomics
Abstract Comparative glycomics data are compositional data, where measured glycans are parts of a whole, indicated by relative abundances. Applying traditional statistical analyses to these data often results in misleading conclusions, such as spurious “decreases” of glycans when other structures in...
Main Authors: | Alexander R. Bennett, Jon Lundstrøm, Sayantani Chatterjee, Morten Thaysen-Andersen, Daniel Bojar |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2025-01-01 |
Series: | Nature Communications |
Online Access: | https://doi.org/10.1038/s41467-025-56249-3 |
_version_ | 1832594615517052928 |
---|---|
author | Alexander R. Bennett Jon Lundstrøm Sayantani Chatterjee Morten Thaysen-Andersen Daniel Bojar |
author_facet | Alexander R. Bennett Jon Lundstrøm Sayantani Chatterjee Morten Thaysen-Andersen Daniel Bojar |
author_sort | Alexander R. Bennett |
collection | DOAJ |
description | Abstract Comparative glycomics data are compositional data, where measured glycans are parts of a whole, indicated by relative abundances. Applying traditional statistical analyses to these data often results in misleading conclusions, such as spurious “decreases” of glycans when other structures increase in abundance, or high false-positive rates for differential abundance. Our work introduces a compositional data analysis framework, tailored to comparative glycomics, to account for these data dependencies. We employ center log-ratio and additive log-ratio transformations, augmented with a scale uncertainty/information model, to introduce a statistically robust and sensitive data analysis pipeline. Applied to comparative glycomics datasets, including known glycan concentrations in defined mixtures, this approach controls false-positive rates and results in reproducible biological findings. Additionally, we present specialized analysis modalities: alpha- and beta-diversity analyze glycan distributions within and between samples, while cross-class glycan correlations shed light on previously undetected interdependencies. These approaches reveal insights into glycome variations that are critical to understanding roles of glycans in health and disease. |
format | Article |
id | doaj-art-456154471ba749b8a91fd10e01b07a69 |
institution | Kabale University |
issn | 2041-1723 |
language | English |
publishDate | 2025-01-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Nature Communications |
spelling | doaj-art-456154471ba749b8a91fd10e01b07a69; 2025-01-19T12:30:14Z; eng; Nature Portfolio; Nature Communications; 2041-1723; 2025-01-01; 16; 1; 1; 15; 10.1038/s41467-025-56249-3; Compositional data analysis enables statistical rigor in comparative glycomics; Alexander R. Bennett (0), Jon Lundstrøm (1), Sayantani Chatterjee (2), Morten Thaysen-Andersen (3), Daniel Bojar (4); (0) Department of Medical Biochemistry, Institute of Biomedicine, University of Gothenburg; (1) Department of Chemistry and Molecular Biology, University of Gothenburg; (2) School of Natural Sciences, Faculty of Science and Engineering, Macquarie University; (3) School of Natural Sciences, Faculty of Science and Engineering, Macquarie University; (4) Department of Chemistry and Molecular Biology, University of Gothenburg; Abstract Comparative glycomics data are compositional data, where measured glycans are parts of a whole, indicated by relative abundances. Applying traditional statistical analyses to these data often results in misleading conclusions, such as spurious “decreases” of glycans when other structures increase in abundance, or high false-positive rates for differential abundance. Our work introduces a compositional data analysis framework, tailored to comparative glycomics, to account for these data dependencies. We employ center log-ratio and additive log-ratio transformations, augmented with a scale uncertainty/information model, to introduce a statistically robust and sensitive data analysis pipeline. Applied to comparative glycomics datasets, including known glycan concentrations in defined mixtures, this approach controls false-positive rates and results in reproducible biological findings. Additionally, we present specialized analysis modalities: alpha- and beta-diversity analyze glycan distributions within and between samples, while cross-class glycan correlations shed light on previously undetected interdependencies. These approaches reveal insights into glycome variations that are critical to understanding roles of glycans in health and disease.; https://doi.org/10.1038/s41467-025-56249-3 |
spellingShingle | Alexander R. Bennett Jon Lundstrøm Sayantani Chatterjee Morten Thaysen-Andersen Daniel Bojar Compositional data analysis enables statistical rigor in comparative glycomics Nature Communications |
title | Compositional data analysis enables statistical rigor in comparative glycomics |
title_full | Compositional data analysis enables statistical rigor in comparative glycomics |
title_fullStr | Compositional data analysis enables statistical rigor in comparative glycomics |
title_full_unstemmed | Compositional data analysis enables statistical rigor in comparative glycomics |
title_short | Compositional data analysis enables statistical rigor in comparative glycomics |
title_sort | compositional data analysis enables statistical rigor in comparative glycomics |
url | https://doi.org/10.1038/s41467-025-56249-3 |
work_keys_str_mv | AT alexanderrbennett compositionaldataanalysisenablesstatisticalrigorincomparativeglycomics AT jonlundstrøm compositionaldataanalysisenablesstatisticalrigorincomparativeglycomics AT sayantanichatterjee compositionaldataanalysisenablesstatisticalrigorincomparativeglycomics AT mortenthaysenandersen compositionaldataanalysisenablesstatisticalrigorincomparativeglycomics AT danielbojar compositionaldataanalysisenablesstatisticalrigorincomparativeglycomics |
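
The abstract names center log-ratio (CLR) and additive log-ratio (ALR) transformations as the core of the framework. The following is a minimal sketch of the textbook definitions of those two transformations, not the authors' pipeline: the function names (`closure`, `clr`, `alr`), the three-glycan toy data, and the choice of the last glycan as ALR reference are all assumptions made for illustration, and the scale uncertainty/information model mentioned in the abstract is not included. It also shows how a spurious "decrease" can arise in relative abundances when only one glycan actually changes.

```python
import numpy as np

def closure(x):
    """Rescale each sample (row) so its parts sum to 1 (relative abundances)."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)

def clr(x):
    """Center log-ratio: log of each part minus the mean log of its sample."""
    logx = np.log(closure(x))
    return logx - logx.mean(axis=-1, keepdims=True)

def alr(x, ref=-1):
    """Additive log-ratio: log of each part relative to a reference part.

    The reference column is dropped, so D parts yield D - 1 values.
    """
    logx = np.log(closure(x))
    ratios = logx - logx[..., [ref]]
    return np.delete(ratios, ref, axis=-1)

# Hypothetical absolute amounts of three glycans in two samples.
# Only the first glycan changes (10 -> 40); the other two are constant.
absolute = np.array([[10.0, 10.0, 10.0],
                     [40.0, 10.0, 10.0]])

# Relative abundances: the unchanged glycans appear to drop from 1/3 to 1/6.
relative = closure(absolute)

# ALR against the third glycan: the second glycan's log-ratio is 0 in both
# samples, so its apparent decrease disappears; the first shows log(4).
alr_values = alr(relative, ref=2)

# CLR values: each row sums to zero by construction.
clr_values = clr(relative)

print(relative)
print(alr_values)
print(clr_values)
```

Both functions take logarithms, so zero abundances would need to be handled (for example by imputation) before use; that step is left out of this sketch. An ALR reference is only informative if the reference glycan is itself stable across conditions, which the toy data assumes by construction.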