Genome Science Training Program Student Handbook

Driven by large-scale initiatives such as the Human Genome, ENCODE, Genotype-Tissue Expression (GTEx), Cancer Genome Atlas (TCGA), Centers for Common Disease Genomics (CCDG), Centers for Mendelian Genomics (CMG), TOPMed, and Human Microbiome Projects, genomics has taken a central role in the biomedical sciences. At the same time, advances in computation are driving the mathematical sciences forward. These factors, together with the increasingly quantitative nature of biomedical research, the explosive growth of genomic data, and a growing appreciation of the importance of big data and data science, have resulted in rapidly increasing demand for individuals trained at the interface of genomics and the mathematical sciences. The successful translation of genomic data to address questions of human health and disease requires such individuals, yet there is a severe shortage of people with this training. The focus on big data and data science in biomedicine is expanding this need further, and strong, quantitatively focused potential trainees are in extraordinary demand in competing areas, notably business and finance.

The UM GSTP is one of the first NHGRI-funded T32s. Now in its 25th year of continuous funding, the GSTP has trained 122 individuals. The fundamental premise of the GSTP is that graduates should have substantial training in the mathematical sciences, the biological sciences, and at their interface. This training facilitates communication between disciplines, identification of important problems, and identification of the mathematical and computational tools required to solve those problems. Graduates are well-trained statistical genomicists, bioinformaticians, data scientists, molecular genomicists, or genomic epidemiologists well versed in statistics and computation who take positions in academia, government, or industry.

The University of Michigan is one of the nation's great public institutions of higher education and has for many years been a leader in genetics, genomics, and the mathematical sciences. Faculty members in the Departments of Biostatistics, Epidemiology, Human Genetics, and Bioinformatics are involved in research and training in statistical, computational, and molecular genetics. Faculty in the Departments of Biological Chemistry, Ecology and Evolutionary Biology, Environmental Health Sciences, Microbiology and Immunology, Nutritional Sciences, and Statistics are also involved in this activity (see faculty). Most teaching at the University is done during a two-semester academic year running from early September through late April. Graduate programs at the University are administered through the Horace H. Rackham School of Graduate Studies.

The Department of Biostatistics was established in 1959 in the School of Public Health and grew to prominence starting in 1971 under the leadership of Richard Cornell. Department research emphasizes the development of statistical methods and computational tools and their application to biomedicine. The Department has a strong research focus in genetics and genomics. Other methodological strengths of the Department include computational statistics and big data; survival and longitudinal data; clinical trials; non- and semiparametric modeling; Bayesian methods; analysis of sample surveys; and analysis with missing data. Other substantive research areas include epidemiology, cancer, gerontology, organ failure and transplantation, diabetes, ophthalmology, and imaging. The Department is home to the UM Center for Statistical Genetics, directed by GSTP Director Michael Boehnke.

The Department of Epidemiology in the School of Public Health was established in 1941 by Thomas Francis Jr. The Department practices epidemiology as a broad scientific discipline addressing the causes of health and disease in populations, integrating causal concepts at the molecular, cellular, environmental, medical, and social levels. The Department has a strong commitment to research and training in genetic epidemiology. Other department research foci include infectious disease, chronic disease, molecular epidemiology, social epidemiology, psychiatric epidemiology, global health, environmental and occupational health, reproductive, perinatal, and pediatric health, and epidemiologic modeling and methods.

The Department of Human Genetics in the Medical School, the first of its kind in this country, was established in 1956 under the leadership of James Neel. Faculty explore three broad areas of human genetics: molecular genetics, genetic disease, and evolutionary/population genetics. Within molecular genetics, they study DNA repair and recombination, genome instability, gene function and regulation, epigenetics, RNA modification and control, and genomic systems. Within human genetic disease, they study the genetics of development, neurogenetics, stem cell biology, medical genetics, reproductive sciences, and genetics of cancer. In evolutionary and population genetics, they develop and apply statistical tools for genetics, genetic epidemiology, and genetic mapping of complex traits.

The Bioinformatics Graduate Program was established in 1998 and is the educational arm of the Department of Computational Medicine and Bioinformatics. The mission of the Department is to create novel and impactful informatics and computationally based methods, algorithms, tools, and resources to extend research capabilities and results. The Department has strong commitments to genomics and other “-omics”; systems and network biology; complex genetic diseases; computational biology methods; and biomedical data science.

Predoctoral trainees are expected to complete a dissertation with substantial content in statistical genomics. Doctoral committees include at least two GSTP faculty: one as chair and one from another department as an outside member. Consistent interaction between faculty and trainees helps ensure that GSTP requirements are met and graduates emerge as well-trained genome scientists: statistical genomicists, bioinformaticians, genomic epidemiologists, data scientists, or molecular genomicists well versed in statistics and computation. Predoctoral trainees complete all PhD requirements of their home department, including coursework, qualifying exams, lab rotations (if required), and a dissertation, as well as core courses in statistical and human genetics. In addition to coursework, trainees are expected to become involved in research no later than the summer of their first academic year, and usually (much) sooner. Trainees are also encouraged to undertake a teaching preceptorship in which they work with a faculty member to teach a course.

Postdoctoral training varies depending on trainee background, but includes research involving statistical genetics and genomics, and usually includes auditing courses in molecular, statistical, and/or computational genomics. Postdoctoral research begins on arrival.

Predoctoral trainees are required to undertake 9-10 core courses totaling 26-29 credit hours and at least one elective core course of ≥3 credit hours. Core courses provide trainees with the fundamental knowledge needed to work at the interface of genetics and genomics and the mathematical sciences. Core courses include: genetics or human genetics; molecular biology or biological chemistry; a one-year sequence in human genetics emphasizing human molecular genetics and genetic disease; a year of probability theory and statistical inference; a course in statistical genetics; a course in computing; course(s) in the responsible conduct of research; and one elective course. Each of the required courses is offered every year, most taught by GSTP faculty. Tutoring and study groups for GSTP trainees are available for these courses. Postdoctoral trainees generally audit some core courses depending on background and interests. Incoming students with limited biology or computing backgrounds are encouraged to undertake “Biology Boot Camp” or “Biocomputing Boot Camp” prior to starting classes.

Core courses may be waived by the Program Advisory Committee based on prior coursework. For example, Human Genetics trainees usually waive introductory genetics and molecular biology/biochemistry. Trainees are encouraged to discuss with their advisors and fellow trainees which of the alternative courses (Biostatistics 601/602 or Statistics 425/426; Molecular Biology or Biochemistry) are most appropriate given their backgrounds and interests (generally, Biostatistics 601/602 is more challenging mathematically than Statistics 425/426). There are also other particularly recommended courses.