A flexible Bayesian variable selection model was developed to account for functional annotations of genetic variants when identifying risk signals.
A group of researchers has developed a statistical method and tool, a flexible Bayesian variable selection model, that accounts for functional annotations of genetic variants when identifying risk signals.
To develop the tool, called the Bayesian functional genome-wide association study (bfGWAS) method, which is meant to help identify and prioritize true associations in genome-wide association studies, the researchers used information on patients with age-related macular degeneration (AMD) that the Michigan Genomics Initiative (MGI) gathered on the day the patients had elective surgery or other procedures at the University of Michigan Health System.
The model, according to the researchers, pairs a “flexible Bayesian method with an efficient computational algorithm” to integrate functional information with association mapping. The method also accounts for linkage disequilibrium (LD) and shares information across the entire genome to increase association-mapping power.
A total of 33,976 unrelated samples were used: 16,144 from patients with advanced AMD and 17,832 from control subjects. Additionally, there were 12.02 million variants “genotyped on a customized Exome-Chip and imputed against the 1000 Genomes Project phase I reference panel,” reported the researchers.
When researchers perform a genome-wide association study (GWAS), they typically identify a series of associated loci and then examine the associated variants within each locus independently. Using bfGWAS, however, the researchers found that incorporating functional information made it easier to identify multiple signals in a locus.
Jingjing Yang, PhD, a research fellow at the Center for Statistical Genetics in the Biostatistics Department at the University of Michigan and the study's lead author, told MD Magazine that the model proposed in the current study “uses known annotation information, learns the importance of each type of annotations from the GWAS results, and then uses this learned importance information to prioritize association signals.”
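The idea Yang describes, learning how much each annotation matters and then feeding that back into the ranking of variants, can be illustrated with a toy sketch. The Python below is not the bfGWAS algorithm and does not model LD or multiple signals per locus; it is a simplified empirical-Bayes loop under stated assumptions. The function names, the two annotation labels, and the Bayes factors are all hypothetical and are not values from the study.

```python
def posterior_inclusion(bayes_factor, prior):
    """Posterior probability of association from a Bayes factor and a prior."""
    odds = bayes_factor * prior / (1.0 - prior)
    return odds / (1.0 + odds)


def learn_annotation_priors(variants, n_iter=50, init_prior=0.01):
    """Alternate between scoring variants and re-estimating one prior per annotation.

    Simplifying assumption: every variant is treated as independent,
    which a real fine-mapping method would not do.
    """
    annotations = sorted({v["annotation"] for v in variants})
    priors = {a: init_prior for a in annotations}
    for _ in range(n_iter):
        pips = [posterior_inclusion(v["bf"], priors[v["annotation"]]) for v in variants]
        for a in annotations:
            group = [p for p, v in zip(pips, variants) if v["annotation"] == a]
            # Clamp so a prior never reaches exactly 0 or 1.
            priors[a] = min(max(sum(group) / len(group), 1e-6), 1.0 - 1e-6)
    pips = [posterior_inclusion(v["bf"], priors[v["annotation"]]) for v in variants]
    return priors, pips


# Hypothetical toy data: made-up Bayes factors, not results from the study.
variants = [
    {"id": "v1", "annotation": "coding", "bf": 200.0},
    {"id": "v2", "annotation": "coding", "bf": 50.0},
    {"id": "v3", "annotation": "coding", "bf": 0.5},
    {"id": "v4", "annotation": "coding", "bf": 0.2},
    {"id": "v5", "annotation": "coding", "bf": 0.2},
    {"id": "v6", "annotation": "intergenic", "bf": 50.0},
    {"id": "v7", "annotation": "intergenic", "bf": 0.5},
    {"id": "v8", "annotation": "intergenic", "bf": 0.5},
    {"id": "v9", "annotation": "intergenic", "bf": 0.2},
    {"id": "v10", "annotation": "intergenic", "bf": 0.2},
    {"id": "v11", "annotation": "intergenic", "bf": 0.1},
    {"id": "v12", "annotation": "intergenic", "bf": 0.1},
    {"id": "v13", "annotation": "intergenic", "bf": 0.1},
]

priors, pips = learn_annotation_priors(variants)

# Rank variants by their annotation-aware posterior probability.
for variant, pip in sorted(zip(variants, pips), key=lambda pair: -pair[1]):
    print(variant["id"], variant["annotation"], round(pip, 3))
```

In this toy data, v2 and v6 carry the same evidence on their own, but v2 sits in the annotation class where more strong signals were found, so it ends up ranked higher. That is the sense in which learned annotation importance can help prioritize signals.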
The difference between bfGWAS and previously developed methods, Yang said, is that “the bfGWAS method makes more realistic assumptions for general GWAS, such as not restricting there is at most 1 independent signal per risk locus.” These more realistic assumptions make bfGWAS more accurate when the actual genetic architecture features multiple signals per locus.
“Our method is scalable up to genome-wide analysis with thousands of samples and millions of variants,” Yang said. “The rich number of loci makes the AMD data a good example to show the benefits of our method, e.g., identifying the most likely causal annotations for AMD, and using the association enrichment information to fine-map risk loci, especially outperforming the existing method of fGWAS for loci with multiple signals.”
AMD has 32 known risk loci. In the current study, the researchers identified 5 additional risk loci.
Within the clinical setting, this new method “can account for known functional information of the genome-wide genetic variants, and produce fine-mapped associated genetic variants for the studied complex traits of diseases,” according to Yang.
She added that the method will “prioritize variants with annotations that are estimated with higher causal probability.” The researchers said they believe that an improved method of dividing the genome into LD blocks, a subject of other recent studies, will increase association-mapping power.