Pure additive contribution of genetic variants to a risk prediction model using propensity score matching: application to type 2 diabetes

The achievements of genome-wide association studies have suggested ways to predict diseases, such as type 2 diabetes (T2D), using single-nucleotide polymorphisms (SNPs). Most T2D risk prediction models have used SNPs in combination with demographic variables. However, it is difficult to evaluate the...

Bibliographic Details
Main Authors:	Chanwoo Park, Nan Jiang, Taesung Park
Format:	Article
Language:	English
Published:	BioMed Central 2019-12-01
Series:	Genomics & Informatics
Subjects:	genome-wide association study; penalized regression model; propensity score; type 2 diabetes
Online Access:	http://genominfo.org/upload/pdf/gi-2019-17-4-e47.pdf

_version_	1832574158092894208
author	Chanwoo Park; Nan Jiang; Taesung Park
author_facet	Chanwoo Park; Nan Jiang; Taesung Park
author_sort	Chanwoo Park
collection	DOAJ
description	The achievements of genome-wide association studies have suggested ways to predict diseases, such as type 2 diabetes (T2D), using single-nucleotide polymorphisms (SNPs). Most T2D risk prediction models have used SNPs in combination with demographic variables. However, it is difficult to evaluate the pure additive contribution of genetic variants to classically used demographic models. Since prediction models include some heritable traits, such as body mass index, the contribution of SNPs using unmatched case-control samples may be underestimated. In this article, we propose a method that uses propensity score matching to avoid underestimation by matching case and control samples, thereby determining the pure additive contribution of SNPs. To illustrate the proposed propensity score matching method, we used SNP data from the Korea Association Resources project and reported SNPs from the genome-wide association study catalog. We selected various SNP sets via stepwise logistic regression (SLR), least absolute shrinkage and selection operator (LASSO), and the elastic-net (EN) algorithm. Using these SNP sets, we made predictions using SLR, LASSO, and EN as logistic regression modeling techniques. The accuracy of the predictions was compared in terms of area under the receiver operating characteristic curve (AUC). The contribution of SNPs to T2D was evaluated by the difference in the AUC between models using only demographic variables and models that included the SNPs. The largest difference among our models showed that the AUC of the model using genetic variants with demographic variables could be 0.107 higher than that of the corresponding model using only demographic variables.
format	Article
id	doaj-art-4302f0aa30bf4fc3b23c2741b9dc489b
institution	Kabale University
issn	2234-0742
language	English
publishDate	2019-12-01
publisher	BioMed Central
record_format	Article
series	Genomics & Informatics
spelling	doaj-art-4302f0aa30bf4fc3b23c2741b9dc489b; 2025-02-02T00:38:35Z; eng; BioMed Central; Genomics & Informatics; 2234-0742; 2019-12-01; 17; 4; 10.5808/GI.2019.17.4.e47; 591; Pure additive contribution of genetic variants to a risk prediction model using propensity score matching: application to type 2 diabetes; Chanwoo Park (Department of Statistics, Seoul National University, Seoul 08826, Korea); Nan Jiang (Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea); Taesung Park (Department of Statistics, Seoul National University, Seoul 08826, Korea); http://genominfo.org/upload/pdf/gi-2019-17-4-e47.pdf; genome-wide association study; penalized regression model; propensity score; type 2 diabetes
title	Pure additive contribution of genetic variants to a risk prediction model using propensity score matching: application to type 2 diabetes
topic	genome-wide association study; penalized regression model; propensity score; type 2 diabetes
url	http://genominfo.org/upload/pdf/gi-2019-17-4-e47.pdf
work_keys_str_mv	AT chanwoopark pureadditivecontributionofgeneticvariantstoariskpredictionmodelusingpropensityscorematchingapplicationtotype2diabetes AT nanjiang pureadditivecontributionofgeneticvariantstoariskpredictionmodelusingpropensityscorematchingapplicationtotype2diabetes AT taesungpark pureadditivecontributionofgeneticvariantstoariskpredictionmodelusingpropensityscorematchingapplicationtotype2diabetes
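
The evaluation described in the abstract (match cases to controls on a propensity score, then compare the AUC of a demographic-only model with that of a model that also includes SNPs) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes the propensity scores and both sets of model risk scores have already been computed, and it uses greedy 1:1 nearest-neighbour matching with a caliper, which this record does not specify. All function names and the caliper value are hypothetical.

```python
from typing import List, Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via the rank (Mann-Whitney) formulation."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    # Count case/control pairs ranked correctly; ties count one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def match_cases_to_controls(
    propensity: Sequence[float],
    labels: Sequence[int],
    caliper: float = 0.05,  # assumed value, not taken from the article
) -> List[Tuple[int, int]]:
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Returns (case_index, control_index) pairs whose propensity scores
    differ by at most `caliper`. Unmatched cases are dropped.
    """
    cases = [i for i, y in enumerate(labels) if y == 1]
    free = {i for i, y in enumerate(labels) if y == 0}
    pairs: List[Tuple[int, int]] = []
    for c in cases:
        if not free:
            break
        # Closest remaining control; the index breaks ties deterministically.
        best = min(free, key=lambda j: (abs(propensity[j] - propensity[c]), j))
        if abs(propensity[best] - propensity[c]) <= caliper:
            pairs.append((c, best))
            free.remove(best)
    return pairs


def auc_gain(
    demo_scores: Sequence[float],
    demo_snp_scores: Sequence[float],
    labels: Sequence[int],
    pairs: Sequence[Tuple[int, int]],
) -> float:
    """AUC(demographic + SNP model) minus AUC(demographic-only model),
    both evaluated on the matched samples only."""
    keep = [i for pair in pairs for i in pair]
    y = [labels[i] for i in keep]
    base = auc([demo_scores[i] for i in keep], y)
    full = auc([demo_snp_scores[i] for i in keep], y)
    return full - base


# Toy data for illustration only (not from the study).
labels = [1, 1, 0, 0, 0]
propensity = [0.80, 0.30, 0.78, 0.33, 0.10]
demo_scores = [0.60, 0.40, 0.50, 0.45, 0.10]
demo_snp_scores = [0.90, 0.70, 0.50, 0.30, 0.10]

pairs = match_cases_to_controls(propensity, labels)
gain = auc_gain(demo_scores, demo_snp_scores, labels, pairs)
```

Because the matched cases and controls have similar propensity scores, a demographic-only model separates them poorly by construction, so the AUC difference reflects what the SNPs add beyond the matched variables.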

Similar Items