Nonparametric genetic prediction of complex traits with latent Dirichlet process regression models
University of Michigan School of Public Health
3755 SPH I, 1415 Washington Heights Ann Arbor, MI 48109-2029
There has been growing interest in using genotype data to perform genetic prediction of complex traits. Accurate genetic prediction can facilitate genomic selection in animal and plant breeding programs and can aid in the development of personalized medicine in humans. Because most complex traits have a polygenic architecture and are each influenced by many genetic variants with small effects, accurate genetic prediction requires polygenic methods that can model all genetic variants jointly. Many recently developed polygenic methods make parametric assumptions about the effect size distribution, and the methods differ in which distribution they assume. However, depending on how well the assumed effect size distribution matches the unknown truth, existing polygenic methods can perform well for some traits but poorly for others. To enable robust prediction performance across a range of phenotypes, we develop a novel polygenic model with a flexible assumption on the effect size distribution. We refer to our model as the latent Dirichlet Process Regression (DPR). DPR relies on the Dirichlet process to assign a prior on the effect size distribution itself, is nonparametric in nature, and is capable of inferring the effect size distribution from the data at hand. Because of this flexible modeling assumption, DPR is able to adapt to a broad spectrum of genetic architectures and achieves robust predictive performance for a variety of complex traits. We compare the predictive performance of DPR with several commonly used polygenic methods in simulations. We further illustrate the benefits of DPR by applying it to predict gene expression using cis-SNPs, to conduct a PrediXcan-based gene set test, to perform genomic selection of four traits in two species, and to predict five complex traits in a human cohort. Our method is implemented in the DPR software, freely available at www.xzlab.org/software.html.
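To make the idea of a Dirichlet process prior on the effect size distribution more concrete, the following is a minimal sketch, not the DPR software itself. It assumes a truncated stick-breaking construction and a scale mixture of zero-mean normals for the effect sizes; the abstract does not spell out either choice. The function names, the concentration parameter alpha, the truncation level n_components, and the gamma draw for the component variances are all hypothetical and chosen only for illustration.

```python
import numpy as np


def stick_breaking_weights(alpha, n_components, rng):
    """Draw mixture weights from a truncated stick-breaking process."""
    # Each v_k is the fraction of the remaining stick given to component k.
    v = rng.beta(1.0, alpha, size=n_components)
    # Truncation (an assumption of this sketch): the last component
    # takes whatever is left, so the weights sum to one.
    v[-1] = 1.0
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


def sample_effect_sizes(n_snps, alpha=1.0, n_components=10, seed=0):
    """Sample SNP effect sizes from a normal scale mixture with DP weights."""
    rng = np.random.default_rng(seed)
    weights = stick_breaking_weights(alpha, n_components, rng)
    # Hypothetical per-component variances; a real model would infer
    # these from the data rather than fix a prior draw.
    variances = rng.gamma(shape=1.0, scale=0.01, size=n_components)
    # Assign each SNP to a component, then draw its effect size.
    labels = rng.choice(n_components, size=n_snps, p=weights)
    effects = rng.normal(0.0, np.sqrt(variances[labels]))
    return weights, variances, labels, effects


if __name__ == "__main__":
    w, var, lab, beta = sample_effect_sizes(n_snps=1000, alpha=2.0)
    print("mixture weights:", np.round(w, 3))
    print("effect size std:", beta.std())
```

Under these assumptions, a small alpha concentrates most SNPs in a few variance components, while a larger alpha spreads them across many, which is one way such a prior could cover a range of genetic architectures.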
Department of Biostatistics


Xiang Zhou, Ph.D. - Biostatistics, University of Michigan

September 13, 2018
3:30 pm - 5:00 pm
3755 SPH I
1415 Washington Heights
Ann Arbor, MI 48109-2029
Sponsored by: Department of Biostatistics
Contact Information: Zhenke Wu (zhenkewu@umich.edu) and Peisong Han (peisong@umich.edu)
