DisCo P-ad: Distance-Correlation-Based p-Value Adjustment Enhances Multiple Testing Corrections for Metabolomics

Bibliographic Details
Main Authors: Debmalya Nandy, Debashis Ghosh, Katerina Kechris
Format: Article
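Language: English
Published: MDPI AG 2025-01-01
Series: Metabolites
Subjects: multiple testing; effective number of tests; correlated tests; eigen-analysis; pointwise error rate; metabolome-wide association study
Online Access: https://www.mdpi.com/2218-1989/15/1/28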
_version_ 1832587975136903168
author Debmalya Nandy
Debashis Ghosh
Katerina Kechris
collection DOAJ
description Background: Owing to advances in high-throughput data production technologies, omics studies such as genomics and metabolomics often yield numerous measurements per sample or subject, including several noisy variables that can obscure the true signals relevant to the desired study outcome(s). Correcting for multiple testing is therefore critical when performing statistical tests of significance, to minimize the chances of false or missed discoveries. Such correction is commonplace in genome-wide association studies (GWAS) and is becoming increasingly relevant to metabolome-wide association studies (MWAS). However, many existing procedures may be too conservative or too lenient, assume only a linear association between the features, or have not been evaluated on metabolomics data. Methods: One multiple testing correction strategy estimates the number of statistically independent tests, called the effective number of tests, from an eigen-analysis of the correlation matrix between the features. This effective number is then used in a single-step adjustment to obtain the pointwise significance level. We propose a modification to this p-value adjustment based on a more general measure of association between two predictors, the distance correlation, with a specific focus on MWAS. Results: We assessed common GWAS p-value adjustment procedures, and one tailored for MWAS, all of which rely on eigen-analysis of the Pearson correlation matrix. Our study, which varied sample size-to-feature ratios, response types, and metabolite groupings, highlights the superior performance of the distance correlation. Conclusion: We propose the distance-correlation-based p-value adjustment (DisCo P-ad) as a novel modification that can enhance existing eigen-analysis-based multiple testing correction procedures by increasing power or reducing false positives. While our focus is on metabolomics, DisCo P-ad can also readily be applied to other high-dimensional omics studies.
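As a rough illustration of the strategy the description outlines, the Python sketch below builds a pairwise distance-correlation matrix between features, estimates an effective number of tests from its eigenvalues, and converts that number into a pointwise significance level. The description does not say which eigenvalue-based estimator or single-step formula is used, so the Li-and-Ji-style estimator and the Šidák-type adjustment here are assumptions, as are all function names. It is a sketch under those assumptions, not the authors' implementation.

```python
import numpy as np


def _double_centered_distances(x):
    """Pairwise absolute distances of a 1-D sample, double-centered."""
    d = np.abs(x[:, None] - x[None, :])
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())


def distance_correlation_matrix(X):
    """Sample distance correlation between every pair of columns of X (n x p)."""
    X = np.asarray(X, dtype=float)
    p = X.shape[1]
    centered = [_double_centered_distances(X[:, j]) for j in range(p)]
    dvar = np.array([np.mean(c * c) for c in centered])  # squared distance variances
    R = np.eye(p)
    for j in range(p):
        for k in range(j + 1, p):
            dcov2 = np.mean(centered[j] * centered[k])  # squared distance covariance
            denom = np.sqrt(dvar[j] * dvar[k])
            R[j, k] = R[k, j] = np.sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0
    return R


def effective_number_of_tests(R):
    """Assumed Li-and-Ji-style estimate: sum of I(lam >= 1) + (lam - floor(lam))."""
    # Rounding guards against floating-point noise at integer eigenvalues.
    lam = np.round(np.abs(np.linalg.eigvalsh(R)), 10)
    return float(np.sum((lam >= 1.0) + (lam - np.floor(lam))))


def pointwise_alpha(alpha, m_eff):
    """Assumed Sidak-type single-step adjustment of a family-wise level alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m_eff)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical data: 50 samples, 6 features, the last two nonlinearly related.
    X = rng.normal(size=(50, 6))
    X[:, 5] = X[:, 4] ** 2
    for name, R in [("Pearson", np.corrcoef(X, rowvar=False)),
                    ("distance", distance_correlation_matrix(X))]:
        m_eff = effective_number_of_tests(R)
        print(name, m_eff, pointwise_alpha(0.05, m_eff))
```

Swapping the Pearson matrix for the distance-correlation matrix is the only change between the two runs in the usage example. The double loop costs O(p^2 n^2) time, so it is meant for small illustrative inputs rather than full metabolome-wide data.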
format Article
id doaj-art-1dcb18f53f4d4b10b2844a030af46279
institution Kabale University
issn 2218-1989
language English
publishDate 2025-01-01
publisher MDPI AG
record_format Article
series Metabolites
spelling doaj-art-1dcb18f53f4d4b10b2844a030af46279
2025-01-24T13:41:13Z
eng
MDPI AG
Metabolites
2218-1989
2025-01-01
15
1
28
10.3390/metabo15010028
DisCo P-ad: Distance-Correlation-Based p-Value Adjustment Enhances Multiple Testing Corrections for Metabolomics
Debmalya Nandy, Department of Biostatistics & Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
Debashis Ghosh, Department of Biostatistics & Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
Katerina Kechris, Department of Biostatistics & Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, Aurora, CO 80045, USA
https://www.mdpi.com/2218-1989/15/1/28
multiple testing
effective number of tests
correlated tests
eigen-analysis
pointwise error rate
metabolome-wide association study
title DisCo P-ad: Distance-Correlation-Based p-Value Adjustment Enhances Multiple Testing Corrections for Metabolomics
topic multiple testing
effective number of tests
correlated tests
eigen-analysis
pointwise error rate
metabolome-wide association study
url https://www.mdpi.com/2218-1989/15/1/28