Risk factors affecting polygenic score performance across diverse cohorts

Apart from ancestry, personal or environmental covariates may contribute to differences in polygenic score (PGS) performance. We analyzed the effects of covariate stratification and interaction on body mass index (BMI) PGS (PGSBMI) across four cohorts of European (N = 491,111) and African (N = 21,61...

Bibliographic Details
Main Authors: Daniel Hui, Scott Dudek, Krzysztof Kiryluk, Theresa L Walunas, Iftikhar J Kullo, Wei-Qi Wei, Hemant Tiwari, Josh F Peterson, Wendy K Chung, Brittney H Davis, Atlas Khan, Leah C Kottyan, Nita A Limdi, Qiping Feng, Megan J Puckelwartz, Chunhua Weng, Johanna L Smith, Elizabeth W Karlson, Regeneron Genetics Center, Penn Medicine BioBank, Gail P Jarvik, Marylyn D Ritchie
Format: Article
Language: English
Published: eLife Sciences Publications Ltd 2025-01-01
Series: eLife
Subjects: polygenic scores; body mass index; gene–environment; GWAS
Online Access: https://elifesciences.org/articles/88149
collection DOAJ
description Apart from ancestry, personal or environmental covariates may contribute to differences in polygenic score (PGS) performance. We analyzed the effects of covariate stratification and interaction on body mass index (BMI) PGS (PGSBMI) across four cohorts of European (N = 491,111) and African (N = 21,612) ancestry. When we stratified on binary covariates and on quintiles of continuous covariates, 18 of 62 covariates showed significant and replicable R² differences among strata. Covariates with the largest differences included age, sex, blood lipids, physical activity, and alcohol consumption, with R² nearly doubling between the worst- and best-performing quintiles for certain covariates. Twenty-eight covariates had significant PGSBMI–covariate interaction effects, modifying PGSBMI effects by nearly 20% per standard deviation change. The covariates with significant R² differences among strata overlapped with those showing interaction effects: across all covariates, main effects on BMI were correlated with maximum R² differences and with interaction effects (0.56 and 0.58, respectively), suggesting that individuals with high PGSBMI show both the highest R² and the largest increase in PGS effect. Using quantile regression, we show that the effect of PGSBMI increases as BMI itself increases, and that these differences in effect are directly related to the differences in R² seen when stratifying by different covariates. Given significant and replicable evidence for context-specific PGSBMI performance and effects, we investigated ways to increase model performance by taking nonlinear effects into account. Machine learning models (neural networks) increased relative model R² (mean 23%) across datasets. Finally, creating PGSBMI directly from GxAge genome-wide association study (GWAS) effects increased relative R² by 7.8%. These results demonstrate that certain covariates, especially those most associated with BMI, significantly affect both PGSBMI performance and effects across diverse cohorts and ancestries, and we provide avenues to improve model performance that consider these effects.
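The stratified comparison described in the abstract (R² of the PGS within each quintile of a continuous covariate) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are simulated, the helper names `r_squared` and `r2_by_quintile` are hypothetical, and the assumed PGS-by-age interaction (effect shrinking with age) is chosen only to make the strata differ.

```python
import numpy as np


def r_squared(x, y):
    """Squared Pearson correlation, i.e. R^2 of a one-predictor linear fit."""
    r = np.corrcoef(x, y)[0, 1]
    return r * r


def r2_by_quintile(pgs, phenotype, covariate):
    """R^2 of phenotype on pgs within each quintile of a continuous covariate."""
    # Four interior cut points split the covariate into five equal-sized strata.
    edges = np.quantile(covariate, [0.2, 0.4, 0.6, 0.8])
    strata = np.digitize(covariate, edges)  # stratum labels 0..4
    return [r_squared(pgs[strata == q], phenotype[strata == q]) for q in range(5)]


# Simulated data (assumption): standardized PGS, age in years, and a phenotype
# whose PGS effect decreases with age. None of these numbers come from the study.
rng = np.random.default_rng(0)
n = 50_000
pgs = rng.standard_normal(n)
age = rng.uniform(40, 70, n)
age_z = (age - age.mean()) / age.std()
pgs_effect = 0.30 - 0.10 * age_z  # hypothetical PGS-by-age interaction
bmi = pgs_effect * pgs + 0.2 * age_z + rng.standard_normal(n)

r2 = r2_by_quintile(pgs, bmi, age)
for q, value in enumerate(r2, start=1):
    print(f"age quintile {q}: R^2 = {value:.3f}")
```

Under these simulated assumptions the youngest quintile shows a clearly higher R² than the oldest. A real analysis would more likely report incremental R² over a covariate-only baseline model and test differences between strata for significance.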
id doaj-art-d4b463779d25453d8cd615c74bcb529a
institution Kabale University
issn 2050-084X
spelling doaj-art-d4b463779d25453d8cd615c74bcb529a
timestamp 2025-01-27T10:58:55Z
language eng
publisher eLife Sciences Publications Ltd
journal eLife
issn 2050-084X
date 2025-01-01
volume 12
doi 10.7554/eLife.88149
title Risk factors affecting polygenic score performance across diverse cohorts
authors and affiliations
0 Daniel Hui (https://orcid.org/0000-0002-8023-7352): Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
1 Scott Dudek: Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
2 Krzysztof Kiryluk: Division of Nephrology, Department of Medicine, Columbia University, New York, United States
3 Theresa L Walunas: Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, United States
4 Iftikhar J Kullo: Department of Cardiovascular Medicine, Mayo Clinic, Rochester, United States
5 Wei-Qi Wei: Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, United States
6 Hemant Tiwari: Department of Pediatrics, University of Alabama at Birmingham, Birmingham, United States
7 Josh F Peterson: Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, United States
8 Wendy K Chung: Departments of Pediatrics and Medicine, Columbia University Irving Medical Center, Columbia University, New York, United States
9 Brittney H Davis: Department of Neurology, School of Medicine, University of Alabama at Birmingham, Birmingham, United States
10 Atlas Khan: Division of Nephrology, Department of Medicine, Columbia University, New York, United States
11 Leah C Kottyan: The Center for Autoimmune Genomics and Etiology, Division of Human Genetics, Cincinnati Children's Hospital Medical Center, Cincinnati, United States
12 Nita A Limdi: Department of Neurology, School of Medicine, University of Alabama at Birmingham, Birmingham, United States
13 Qiping Feng (https://orcid.org/0000-0002-6213-793X): Division of Clinical Pharmacology, Department of Medicine, Vanderbilt University Medical Center, Nashville, United States
14 Megan J Puckelwartz: Center for Genetic Medicine, Northwestern University Feinberg School of Medicine, Chicago, United States
15 Chunhua Weng: Department of Biomedical Informatics, Vagelos College of Physicians & Surgeons, Columbia University, New York, United States
16 Johanna L Smith: Department of Cardiovascular Medicine, Mayo Clinic, Rochester, United States
17 Elizabeth W Karlson: Division of Rheumatology, Inflammation, and Immunity, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, United States
Regeneron Genetics Center
Penn Medicine BioBank
18 Gail P Jarvik: Departments of Medicine (Medical Genetics) and Genome Sciences, University of Washington Medical Center, Seattle, United States
19 Marylyn D Ritchie (https://orcid.org/0000-0002-1208-1720): Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, United States
url https://elifesciences.org/articles/88149
keywords polygenic scores; body mass index; gene–environment; GWAS
title Risk factors affecting polygenic score performance across diverse cohorts
topic polygenic scores
body mass index
gene–environment
GWAS
url https://elifesciences.org/articles/88149