Risk factors affecting polygenic score performance across diverse cohorts
Apart from ancestry, personal or environmental covariates may contribute to differences in polygenic score (PGS) performance. We analyzed the effects of covariate stratification and interaction on body mass index (BMI) PGS (PGSBMI) across four cohorts of European (N = 491,111) and African (N = 21,61...
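The stratification and interaction analyses named in this summary can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are simulated, and the function names (`r2_by_quintile`, `interaction_fit`), effect sizes, and sample size are assumptions chosen only for illustration. It computes the R² of BMI on PGS within quintiles of a continuous covariate, then fits a linear model with a PGS × covariate interaction term.

```python
import numpy as np


def r_squared(x, y):
    """R² of a simple linear regression of y on x (squared Pearson correlation)."""
    r = np.corrcoef(x, y)[0, 1]
    return r * r


def r2_by_quintile(pgs, bmi, covariate):
    """Split samples into covariate quintiles; return R² of bmi ~ pgs in each."""
    edges = np.quantile(covariate, [0.2, 0.4, 0.6, 0.8])
    bins = np.digitize(covariate, edges)  # quintile index 0..4 per sample
    return [r_squared(pgs[bins == q], bmi[bins == q]) for q in range(5)]


def interaction_fit(pgs, bmi, covariate):
    """Least-squares fit of bmi ~ 1 + pgs + covariate + pgs*covariate."""
    X = np.column_stack([np.ones_like(pgs), pgs, covariate, pgs * covariate])
    beta, *_ = np.linalg.lstsq(X, bmi, rcond=None)
    return beta  # [intercept, pgs, covariate, interaction]


# Simulated example (all values hypothetical): the PGS effect grows with
# the covariate, so R² should rise from the lowest to the highest quintile.
rng = np.random.default_rng(0)
n = 20_000
pgs = rng.standard_normal(n)
cov = rng.standard_normal(n)
bmi = 27 + 1.0 * pgs + 0.5 * cov + 0.2 * pgs * cov + rng.normal(0, 3, n)

r2 = r2_by_quintile(pgs, bmi, cov)
beta = interaction_fit(pgs, bmi, cov)
```

In this toy setup the interaction coefficient `beta[3]` estimates how much the PGS effect changes per unit of the standardized covariate. A real analysis would also adjust for covariates such as ancestry principal components, which this sketch omits.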
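Main Authors: | Daniel Hui, Scott Dudek, Krzysztof Kiryluk, Theresa L Walunas, Iftikhar J Kullo, Wei-Qi Wei, Hemant Tiwari, Josh F Peterson, Wendy K Chung, Brittney H Davis, Atlas Khan, Leah C Kottyan, Nita A Limdi, Qiping Feng, Megan J Puckelwartz, Chunhua Weng, Johanna L Smith, Elizabeth W Karlson, Regeneron Genetics Center, Penn Medicine BioBank, Gail P Jarvik, Marylyn D Ritchie |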
---|---|
Format: | Article |
Language: | English |
Published: | eLife Sciences Publications Ltd, 2025-01-01 |
Series: | eLife |
Subjects: | polygenic scores; body mass index; gene–environment; GWAS |
Online Access: | https://elifesciences.org/articles/88149 |
_version_ | 1832584823247470592 |
---|---|
author | Daniel Hui; Scott Dudek; Krzysztof Kiryluk; Theresa L Walunas; Iftikhar J Kullo; Wei-Qi Wei; Hemant Tiwari; Josh F Peterson; Wendy K Chung; Brittney H Davis; Atlas Khan; Leah C Kottyan; Nita A Limdi; Qiping Feng; Megan J Puckelwartz; Chunhua Weng; Johanna L Smith; Elizabeth W Karlson; Regeneron Genetics Center; Penn Medicine BioBank; Gail P Jarvik; Marylyn D Ritchie |
author_sort | Daniel Hui |
collection | DOAJ |
description | Apart from ancestry, personal or environmental covariates may contribute to differences in polygenic score (PGS) performance. We analyzed the effects of covariate stratification and interaction on body mass index (BMI) PGS (PGSBMI) across four cohorts of European (N = 491,111) and African (N = 21,612) ancestry. Stratifying on binary covariates and quintiles for continuous covariates, 18/62 covariates had significant and replicable R² differences among strata. Covariates with the largest differences included age, sex, blood lipids, physical activity, and alcohol consumption, with R² being nearly double between best- and worst-performing quintiles for certain covariates. Twenty-eight covariates had significant PGSBMI–covariate interaction effects, modifying PGSBMI effects by nearly 20% per standard deviation change. We observed overlap between covariates that had significant R² differences among strata and interaction effects – across all covariates, their main effects on BMI were correlated with their maximum R² differences and interaction effects (0.56 and 0.58, respectively), suggesting high-PGSBMI individuals have highest R² and increase in PGS effect. Using quantile regression, we show the effect of PGSBMI increases as BMI itself increases, and that these differences in effects are directly related to differences in R² when stratifying by different covariates. Given significant and replicable evidence for context-specific PGSBMI performance and effects, we investigated ways to increase model performance taking into account nonlinear effects. Machine learning models (neural networks) increased relative model R² (mean 23%) across datasets. Finally, creating PGSBMI directly from GxAge genome-wide association studies effects increased relative R² by 7.8%. These results demonstrate that certain covariates, especially those most associated with BMI, significantly affect both PGSBMI performance and effects across diverse cohorts and ancestries, and we provide avenues to improve model performance that consider these effects. |
format | Article |
id | doaj-art-d4b463779d25453d8cd615c74bcb529a |
institution | Kabale University |
issn | 2050-084X |
language | English |
publishDate | 2025-01-01 |
publisher | eLife Sciences Publications Ltd |
record_format | Article |
series | eLife |
spelling | doaj-art-d4b463779d25453d8cd615c74bcb529a; 2025-01-27T10:58:55Z; eng; eLife Sciences Publications Ltd; eLife; 2050-084X; 2025-01-01; 12; 10.7554/eLife.88149; Risk factors affecting polygenic score performance across diverse cohorts; Authors: Daniel Hui [0] (https://orcid.org/0000-0002-8023-7352), Scott Dudek [1], Krzysztof Kiryluk [2], Theresa L Walunas [3], Iftikhar J Kullo [4], Wei-Qi Wei [5], Hemant Tiwari [6], Josh F Peterson [7], Wendy K Chung [8], Brittney H Davis [9], Atlas Khan [10], Leah C Kottyan [11], Nita A Limdi [12], Qiping Feng [13] (https://orcid.org/0000-0002-6213-793X), Megan J Puckelwartz [14], Chunhua Weng [15], Johanna L Smith [16], Elizabeth W Karlson [17], Regeneron Genetics Center, Penn Medicine BioBank, Gail P Jarvik [18], Marylyn D Ritchie [19] (https://orcid.org/0000-0002-1208-1720); Affiliations: [0] Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States; [1] Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States; [2] Division of Nephrology, Department of Medicine, Columbia University, New York, United States; [3] Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, United States; [4] Department of Cardiovascular Medicine, Mayo Clinic, Rochester, United States; [5] Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, United States; [6] Department of Pediatrics, University of Alabama at Birmingham, Birmingham, United States; [7] Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, United States; [8] Departments of Pediatrics and Medicine, Columbia University Irving Medical Center, Columbia University, New York, United States; [9] Department of Neurology, School of Medicine, University of Alabama at Birmingham, Birmingham, United States; [10] Division of Nephrology, Department of Medicine, Columbia University, New York, United States; [11] The Center for Autoimmune Genomics and Etiology, Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, United States; [12] Department of Neurology, School of Medicine, University of Alabama at Birmingham, Birmingham, United States; [13] Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, United States; [14] Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, United States; [15] Department of Biomedical Informatics, Vagelos College of Physicians & Surgeons, Columbia University, New York, United States; [16] Department of Cardiovascular Medicine, Mayo Clinic, Rochester, United States; [17] Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, United States; [18] Departments of Medicine (Medical Genetics) and Genome Sciences, University of Washington Medical Center, Seattle, United States; [19] Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States; https://elifesciences.org/articles/88149; Keywords: polygenic scores, body mass index, gene–environment, GWAS |
title | Risk factors affecting polygenic score performance across diverse cohorts |
topic | polygenic scores body mass index gene–environment GWAS |
url | https://elifesciences.org/articles/88149 |