Prioritization of causal genes from genome-wide association studies by Bayesian data integration across loci.
Motivation: Genome-wide association studies (GWAS) have identified genetic variants, usually single-nucleotide polymorphisms (SNPs), associated with human traits, including disease and disease risk. These variants (or causal variants in linkage disequilibrium with them) usually af...
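Main Authors: | Zeinab Mousavi, Marios Arvanitis, ThuyVy Duong, Jennifer A Brody, Alexis Battle, Nona Sotoodehnia, Ali Shojaie, Dan E Arking, Joel S Bader |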
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2025-01-01 |
Series: | PLoS Computational Biology |
Online Access: | https://doi.org/10.1371/journal.pcbi.1012725 |
_version_ | 1832540326190907392 |
---|---|
author | Zeinab Mousavi Marios Arvanitis ThuyVy Duong Jennifer A Brody Alexis Battle Nona Sotoodehnia Ali Shojaie Dan E Arking Joel S Bader |
collection | DOAJ |
description | Motivation: Genome-wide association studies (GWAS) have identified genetic variants, usually single-nucleotide polymorphisms (SNPs), associated with human traits, including disease and disease risk. These variants (or causal variants in linkage disequilibrium with them) usually affect the regulation or function of a nearby gene. A GWAS locus can span many genes, however, and prioritizing which gene or genes in a locus are most likely to be causal remains a challenge. Better prioritization and prediction of causal genes could reveal disease mechanisms and suggest interventions. Results: We describe a new Bayesian method, termed SigNet for significance networks, that combines information both within and across loci to identify the most likely causal gene at each locus. The SigNet method builds on existing methods that focus on individual loci with evidence from gene distance and expression quantitative trait loci (eQTL) by sharing information across loci using protein-protein and gene regulatory interaction network data. In an application to cardiac electrophysiology with 226 GWAS loci, only 46 (20%) have within-locus evidence from Mendelian genes, protein-coding changes, or colocalization with eQTL signals. At the remaining 180 loci lacking functional information, SigNet selects 56 genes other than the minimum distance gene, equal to 31% of the information-poor loci and 25% of the GWAS loci overall. Assessment by pathway enrichment demonstrates improved performance by SigNet. Review of individual loci shows literature evidence for genes selected by SigNet, including PMP22 as a novel causal gene candidate. |
format | Article |
id | doaj-art-6e5afd4d2f5646d698b250a336f288fd |
institution | Kabale University |
issn | 1553-734X 1553-7358 |
language | English |
publishDate | 2025-01-01 |
publisher | Public Library of Science (PLoS) |
record_format | Article |
series | PLoS Computational Biology |
title | Prioritization of causal genes from genome-wide association studies by Bayesian data integration across loci. |
url | https://doi.org/10.1371/journal.pcbi.1012725 |