Pure additive contribution of genetic variants to a risk prediction model using propensity score matching: application to type 2 diabetes

The achievements of genome-wide association studies have suggested ways to predict diseases, such as type 2 diabetes (T2D), using single-nucleotide polymorphisms (SNPs). Most T2D risk prediction models have used SNPs in combination with demographic variables. However, it is difficult to evaluate the...

Bibliographic Details
Main Authors: Chanwoo Park, Nan Jiang, Taesung Park
Format: Article
Language: English
Published: BioMed Central 2019-12-01
Series: Genomics & Informatics
Subjects: genome-wide association study; penalized regression model; propensity score; type 2 diabetes
Online Access: http://genominfo.org/upload/pdf/gi-2019-17-4-e47.pdf
author Chanwoo Park
Nan Jiang
Taesung Park
collection DOAJ
description The achievements of genome-wide association studies have suggested ways to predict diseases, such as type 2 diabetes (T2D), using single-nucleotide polymorphisms (SNPs). Most T2D risk prediction models have used SNPs in combination with demographic variables. However, it is difficult to evaluate the pure additive contribution of genetic variants to classically used demographic models. Since prediction models include some heritable traits, such as body mass index, the contribution of SNPs using unmatched case-control samples may be underestimated. In this article, we propose a method that uses propensity score matching to avoid underestimation by matching case and control samples, thereby determining the pure additive contribution of SNPs. To illustrate the proposed propensity score matching method, we used SNP data from the Korea Association Resources project and reported SNPs from the genome-wide association study catalog. We selected various SNP sets via stepwise logistic regression (SLR), least absolute shrinkage and selection operator (LASSO), and the elastic-net (EN) algorithm. Using these SNP sets, we made predictions using SLR, LASSO, and EN as logistic regression modeling techniques. The accuracy of the predictions was compared in terms of area under the receiver operating characteristic curve (AUC). The contribution of SNPs to T2D was evaluated by the difference in the AUC between models using only demographic variables and models that included the SNPs. The largest difference among our models showed that the AUC of the model using genetic variants with demographic variables could be 0.107 higher than that of the corresponding model using only demographic variables.
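The evaluation described in the abstract (match cases to controls on a propensity score, then compare the AUC of a demographic-only model with the AUC of a model that also includes SNPs) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state the matching algorithm, so greedy 1:1 nearest-neighbour matching with a caliper is an assumption, and the function names, the caliper value, and all numbers below are hypothetical toy values. The propensity scores and model scores are assumed to have been estimated beforehand.

```python
from typing import List, Sequence, Tuple


def match_cases_to_controls(
    case_ps: Sequence[float],
    control_ps: Sequence[float],
    caliper: float = 0.05,  # hypothetical tolerance, not taken from the article
) -> List[Tuple[int, int]]:
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Returns (case_index, control_index) pairs whose propensity scores
    differ by at most `caliper`. Cases with no close control are dropped.
    """
    available = set(range(len(control_ps)))
    pairs: List[Tuple[int, int]] = []
    for i, p in enumerate(case_ps):
        if not available:
            break
        # Closest remaining control; ties are broken by the lower index.
        j = min(available, key=lambda k: (abs(control_ps[k] - p), k))
        if abs(control_ps[j] - p) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Ties count as one half (the Mann-Whitney formulation).
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for d in controls:
            if c > d:
                wins += 1.0
            elif c == d:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy propensity scores (hypothetical): probability of being a case
# given demographic variables only.
case_ps = [0.80, 0.55, 0.30]
control_ps = [0.78, 0.10, 0.52, 0.33]
pairs = match_cases_to_controls(case_ps, control_ps)

# Toy predicted risks on a matched sample (hypothetical numbers).
labels = [1, 1, 1, 0, 0, 0]
demo_scores = [0.6, 0.4, 0.5, 0.5, 0.3, 0.6]   # demographic-only model
full_scores = [0.8, 0.6, 0.7, 0.5, 0.2, 0.65]  # demographic + SNP model

auc_demo = auc(demo_scores, labels)
auc_full = auc(full_scores, labels)
auc_gain = auc_full - auc_demo  # the quantity the abstract reports as the SNP contribution

print(pairs, round(auc_demo, 3), round(auc_full, 3), round(auc_gain, 3))
```

In practice the two score vectors would come from fitted models (for example SLR, LASSO, or EN logistic regression, as named in the abstract), and the AUC difference would be computed on held-out matched samples rather than on hand-written values.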
format Article
id doaj-art-4302f0aa30bf4fc3b23c2741b9dc489b
institution Kabale University
issn 2234-0742
language English
publishDate 2019-12-01
publisher BioMed Central
record_format Article
series Genomics & Informatics
spelling doaj-art-4302f0aa30bf4fc3b23c2741b9dc489b; 2025-02-02T00:38:35Z; eng; BioMed Central; Genomics & Informatics; ISSN 2234-0742; 2019-12-01; vol. 17, no. 4; doi:10.5808/GI.2019.17.4.e47
Chanwoo Park (Department of Statistics, Seoul National University, Seoul 08826, Korea)
Nan Jiang (Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea)
Taesung Park (Department of Statistics, Seoul National University, Seoul 08826, Korea)
title Pure additive contribution of genetic variants to a risk prediction model using propensity score matching: application to type 2 diabetes
topic genome-wide association study
penalized regression model
propensity score
type 2 diabetes
url http://genominfo.org/upload/pdf/gi-2019-17-4-e47.pdf