Genetics

apigenome

  • Libraries and command-line utilities for big data genomic analysis.
  • Faculty: Hyun Min Kang. Download: Github.

Asthma/eQTL

  • mRNA by SNP Browser provides overviews of whole-genome association studies.
  • Download: Website.

Bama

  • Mediation analysis in the presence of high-dimensional mediators based on the potential outcome framework. Bayesian Mediation Analysis (BAMA), developed by Song et al (2018) <doi:10.1101/467399>.
  • Faculty: Bhramar Mukherjee, Min Zhang, Xiang Zhou. Download: CRAN.
  • Song, Y., Zhou, X., Zhang, M., Zhao, W., Liu, Y., Kardia, S., Roux, A.D., Needham, B., Smith, J.A. and Mukherjee, B., 2018. Bayesian Shrinkage Estimation of High Dimensional Causal Mediation Effects in Omics Studies. bioRxiv, p.467399.

bamUtil

  • Repository containing programs that perform operations on SAM/BAM files.
  • Download: Website.

BestRepeat

  • Variance components linkage analysis with repeated measurements.
  • Download: Website.

CaTS

  • Power Calculator for Two Stage Association Studies.
  • Faculty: Michael Boehnke, Goncalo Abecasis. Download: Website.
  • Reference: Skol, A.D., Scott, L.J., Abecasis, G.R. and Boehnke, M., 2006. Joint analysis is more efficient than replication-based analysis for two-stage genome-wide association studies. Nature Genetics, 38(2), p.209.

chipenrich

  • Performs gene set enrichment for ChIP-seq peak data.
  • Faculty: Laura Scott. Download: Bioconductor.
  • Reference: Welch RP, Lee C, Cavalcante RG, Smith RA, Imbriano P, Scott LJ, Sartor MA (2014). “ChIP-Enrich: Gene set enrichment testing for ChIP-Seq data.” Nucleic Acids Research.

CisGenome Browser

  • A flexible stand-alone tool for genomic data visualization.
  • Faculty: Hui Jiang. Download: Website.
  • Reference: Jiang, H., Wang, F., Dyer, N.P., Wong, W.H. (2010) CisGenome Browser: A Flexible Tool For Genomic Data Visualization, Bioinformatics, 26 (14).

CisGenome

  • An integrated tool for tiling array, genome and cis-regulatory element analysis, working together with CisGenome Browser. 
  • Faculty: Hui Jiang. Download: Website.
  • Reference: Hongkai Ji, Hui Jiang, Wenxiu Ma, David S. Johnson, Richard M. Myers and Wing H. Wong (2008) An integrated software system for analyzing ChIP-chip and ChIP-seq data. Nature Biotechnology, 26: 1293-1300. doi:10.1038/nbt.1505.

cleancall

  • Correction for DNA contamination in genotype calling.
  • Faculty: Hyun Min Kang. Download: Github.

CNVEM

  • Infer carrier status of CNVs in large samples of SNP genotyping data.
  • Download: Website.

CoaCC

  • Simulate case-control study using a coalescent framework.
  • Download: Website.

CopyMap

  • CopyMap is based on a hidden Markov Model (HMM), predicting the location of CNVs and their allele frequencies using data from a set of CGH experiments.
  • Faculty: Sebastian Zöllner. Download: Website.

DAP

  • Integrative genetic association analysis using deterministic approximation of posteriors.
  • Faculty: Xiaoquan William Wen. Download: Github.
  • Reference: Wen, X., Lee, Y., Luca, F., Pique-Regi, R. Efficient Integrative Multi-SNP Association Analysis using Deterministic Approximation of Posteriors. The American Journal of Human Genetics, 98(6), 1114-1129.

DPR

  • DPR is a software package implementing the latent Dirichlet process regression method for genetic prediction of complex traits.
  • Faculty: Xiang Zhou. Download: Github, Website.
  • Reference: Ping Zeng and Xiang Zhou (2017). Non-parametric genetic prediction of complex traits with latent Dirichlet process regression models. Nature Communications. 8: 456.

EMMAX

  • Statistical test for large scale human or model organism association mapping accounting for the sample structure.
  • Faculty: Hyun Min Kang. Download: Wiki.
  • Reference: Kang HM, Sul JH, Service SK, Zaitlen NA, Kong SY, Freimer NB, Sabatti C, Eskin E. (2010) Variance component model to account for sample structure in genome-wide association studies. Nat. Genet. 42:348-54.

EPACTS

  • Efficient and Parallelizable Association Container Toolbox. Perform various statistical tests for identifying genome-wide association from sequence data through a user-friendly interface.
  • Faculty: Hyun Min Kang. Download: Github, Website.

FastQValidator

  • Validate format of fastq files.
  • Download: Website.

fmeqtl


FTEC

  • A coalescent simulator capable of modeling faster than exponential population growth.
  • Faculty: Sebastian Zöllner, Michael Boehnke. Download: Github.
  • References: Reppell, M., Boehnke, M. and Zöllner, S., 2012. FTEC: a coalescent simulator for modeling faster than exponential growth. Bioinformatics, 28(9), pp.1282-1283.

FUGUE

  • Construct haplotypes for the chromosome 22 and 19 linkage disequilibrium maps.
  • Faculty: Goncalo Abecasis. Download: Website.

GAS

  • Genetic Association Study (GAS) Power Calculator interface that can be used to compute statistical power for large one-stage genetic association studies.
  • Faculty: Goncalo Abecasis. Download: Website.
  • Reference: Johnson, J.L. and Abecasis, G.R., 2017. GAS Power Calculator: web-based power calculator for genetic association studies. bioRxiv, p.164343.

GEMMA

  • GEMMA is the software implementing the Genome-wide Efficient Mixed Model Association algorithm for a standard linear mixed model and some of its close relatives for genome-wide association studies (GWAS).
  • Faculty: Xiang Zhou. Download: Github, Website.
  • Reference: Xiang Zhou and Matthew Stephens (2012). Genome-wide efficient mixed-model analysis for association studies. Nature Genetics. 44: 821–824.

GeneNetwork

  • Gene sub-network analysis via Bayesian nonparametric methods.
  • Faculty: Jian Kang. Download: Website.

GeneZoom

  • Show frequency of variants in a predefined region for groups of individuals.
  • Download: Website.

GENOME

  • Rapid coalescent-based whole genome simulator.
  • Download: Website.

GDEP

  • Gene network construction based on time course microarray data.
  • Faculty: Peter X.K. Song. Download: Website.
  • Reference: Gao, X., Pu, DQ., & Song, P.X.K. (2009). Transition dependency: a gene-gene interaction measure for times series microarray data. EURASIP Journal on Bioinformatics and Systems Biology, 2009, 2.

GOLD

  • Graphical Overview of Linkage Disequilibrium.
  • Faculty: Goncalo Abecasis. Download: Website.
  • Reference: Abecasis, G.R. and Cookson, W.O.C., 2000. GOLD—graphical overview of linkage disequilibrium. Bioinformatics, 16(2), pp.182-183.

GotCloud

  • Incorporate alignment and variant calling pipelines into one easy to use tool.
  • Download: Website.

GREGOR

  • Test for enrichment of list of index SNPs in experimentally annotated regulatory domains.
  • Download: Website.

GRR

  • GRR is a Windows-based application for detecting pedigree errors via graphically inspecting the distribution for marker allele sharing among pairs of family members or all pairs of individuals in a study.
  • Faculty: Goncalo Abecasis. Download: Website.

iECAT

  • iECAT is an R-package to test for single variant and gene/region-based associations using external control samples.
  • Faculty: Seunggeun Shawn Lee. Download: Website, Github, CRAN.

IMAGE

  • IMAGE is a method that performs methylation quantitative trait locus (mQTL) mapping in bisulfite sequencing studies.
  • Faculty: Xiang Zhou. Download: CRAN, Github, Website.
  • Reference: Yue Fan, Tauras P. Vilgalys, Shiquan Sun, Qinke Peng, Jenny Tung and Xiang Zhou (2019). High-powered detection of genetic effects on DNA methylation using integrated methylation QTL mapping and allele-specific analysis. bioRxiv.

iMAP

  • iMAP is a method which performs integrative mapping of pleiotropic association and functional annotations using penalized Gaussian mixture models. 
  • Faculty: Xiang Zhou. Download: Github, Website.
  • Reference: Ping Zeng, Xingjie Hao and Xiang Zhou. Pleiotropic Mapping and Annotation Selection in Genome-wide Association Studies with Penalized Gaussian Mixture Models. bioRxiv 2018. Doi: 10.1101/256461.

integrative

  • Enrichment estimation aided colocalization analysis.
  • Faculty: Xiaoquan William Wen. Download: Github.
  • Reference: Wen, X., Pique-Regi, R., Luca, F. Integrating Molecular QTL Data into Genome-wide Genetic Association Analysis: Probabilistic Assessment of Enrichment and Colocalization. PLOS Genetics. 2017 Mar 13(3): e1006646.

LAMP

  • LAMP is our software for Linkage and Association Modeling in Pedigrees.
  • Faculty: Goncalo Abecasis. Download: Website.
  • Reference: Li, M., Boehnke, M. and Abecasis, G.R., 2005. Joint modeling of linkage and association: identifying SNPs responsible for a linkage signal. The American Journal of Human Genetics, 76(6), pp.934-949.

LASER

  • Estimate genetic ancestry on reference maps of diverse populations.
  • Download: Website.

LGEWIS

  • Functions for genome-wide association studies (GWAS)/gene-environment-wide interaction studies (GEWIS) with longitudinal outcomes and exposures.
  • Faculty: Seunggeun Shawn Lee, Bhramar Mukherjee, Min Zhang. Download: CRAN.
  • References: He et al. (2017) "Set-Based Tests for Gene-Environment Interaction in Longitudinal Studies" and He et al. (2017) "Rare-variant association tests in longitudinal studies, with an application to the Multi-Ethnic Study of Atherosclerosis (MESA)".

LocusZoom

  • Plot regional association results from genome-wide association scans.
  • Download: Website.

lodi

  • R package for imputing observed values below the limit of detection in single pollutant models via censored likelihood multiple imputation.
  • Download: CRAN.
  • Reference: Boss et al (2019) <doi:10.1097/EDE.0000000000001052>.

MACH 1.0

  • MACH 1.0 is a Markov Chain based haplotyper that can resolve long haplotypes or infer missing genotypes in samples of unrelated individuals.
  • Faculty: Goncalo Abecasis. Download: Website.

Mendel

  • Perform likelihood-based statistical analysis.
  • Download: Website.

Merlin

  • Fast pedigree analyses, including non-parametric linkage, error detection and haplotyping.
  • Faculty: Goncalo Abecasis. Download: Website.
  • Reference: Abecasis, G.R., Cherny, S.S., Cookson, W.O. and Cardon, L.R., 2001. Merlin—rapid analysis of dense genetic maps using sparse gene flow trees. Nature Genetics, 30(1), p.97.

Metal

  • METAL software is designed to facilitate meta-analysis of large datasets (such as several whole genome scans) in a convenient, rapid and memory efficient manner.
  • Faculty: Goncalo Abecasis. Download: Website.
  • Reference: Willer, C.J., Li, Y. and Abecasis, G.R., 2010. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics, 26(17), pp.2190-2191.

MetaSKAT

  • MetaSKAT is an R package for gene-based meta-analysis across studies. It can carry out a meta-analysis of SKAT, SKAT-O and burden tests with individual-level genotype data or gene-level summary statistics.
  • Faculty: Seunggeun Shawn Lee. Download: Website, Github, CRAN.

Minimac3

  • Computationally efficient implementation of MaCH algorithm for genotype imputation.
  • Download: Website.

mseq

  • An R package for modeling non-uniformity in short-read rates in RNA-Seq data.
  • Faculty: Hui Jiang. Download: CRAN Archive.

Ordered Subset Analysis

  • Evaluate evidence for linkage even when heterogeneity is present.
  • Download: Website.

PEDSTATS

  • PEDSTATS is a handy tool for quick validation and summary of any pair of pedigree (.ped) and data (.dat) files.
  • Faculty: Goncalo Abecasis. Download: Website.

PMR-Egger

  • PMR-Egger is a method that fits probabilistic Mendelian randomization with an Egger regression assumption on horizontal pleiotropy for transcriptome-wide association studies (TWASs). 
  • Faculty: Xiang Zhou. Download: Github, Website.
  • Reference: Zhongshang Yuan, Huanhuan Zhu, Ping Zeng, Sheng Yang, Shiquan Sun, Can Yang, Jin Liu and Xiang Zhou (2019). Testing and controlling for horizontal pleiotropy with the probabilistic Mendelian randomization in transcriptome-wide association studies.

PRSweb


PQLseq

  • PQLseq is a method that fits generalized linear mixed models for analyzing RNA sequencing and bisulfite sequencing data.
  • Faculty: Xiang Zhou. Download: Github, Website.
  • Reference: Shiquan Sun*, Jiaqiang Zhu*, Sahar Mozaffari, Carole Ober, Mengjie Chen and Xiang Zhou (2018). Heritability estimation and differential analysis with generalized linear mixed models in genomic sequencing studies. Bioinformatics. in press.

PSEUDO


QPLOT

  • Calculates summary statistics to assess the sequencing quality of sequence reads.
  • Download: Website.

QTDT

  • Linkage Disequilibrium Analyses for Quantitative and Discrete Traits.
  • Faculty: Goncalo Abecasis. Download: Website.
  • Reference: Abecasis, G.R., Cardon, L.R. and Cookson, W.O.C., 2000. A general test of association for quantitative traits in nuclear families. The American Journal of Human Genetics, 66(1), pp.279-292.

RAREMETAL

  • Meta-analysis of rare variants from genotype arrays or sequencing.
  • Download: Website.

RAREMETAL WORKER

  • Generate summary statistics for gene level meta analyses in RAREMETAL.
  • Download: Website.

RELPAIR

  • RELPAIR 2.0.1 is a FORTRAN 77 program that infers the relationships of pairs of individuals based on genetic marker data, either within families or across an entire sample.
  • Faculty: Michael Boehnke. Download: Website.
  • Reference: Epstein MP, Duren WL and Boehnke M (2000) Improved inference of relationships for pairs of individuals. American Journal of Human Genetics 67:1219-1231.

RHMAP

  • RHMAP 3.0 (updated September 1996) is a statistical package for radiation hybrid mapping.
  • Faculty: Michael Boehnke. Download: Website.
  • Reference: Boehnke M, Lunetta K, Hauser E, Lange K, Uro J, and VanderStoep J. RHMAP: Statistical Package for Multipoint Radiation Hybrid Mapping, Version 3.0, September 1996.

rSeqNP

  • A non-parametric approach for detecting differential expression and splicing from RNA-Seq data.
  • Faculty: Hui Jiang. Download: Website.
  • Reference: Shi, Y., Chinnaiyan, A. M., Jiang, H. (2015) rSeqNP: A non-parametric approach for detecting differential expression and splicing from RNA-Seq data. Bioinformatics, in press.

rSeqDiff

  • Detecting differential isoform expression from RNA-seq data.
  • Faculty: Hui Jiang. Download: Website.
  • Reference: Shi, Y., Jiang, H. (2013). rSeqDiff: Detecting differential isoform expression from RNA-Seq data using hierarchical likelihood ratio test, PLoS One, 8 (11): e79448.

rSeq

  • rSeq is a set of tools for RNA-Seq data analysis. It consists of programs that deal with many aspects of RNA-Seq data analysis, such as read quality assessment, reference sequence generation, sequence mapping, gene and isoform expressions (RPKMs) estimation, etc.
  • Faculty: Hui Jiang. Download: Website.
  • References: [1] Jiang, H., Wong, W.H. (2009) Statistical Inferences for Isoform Expression in RNA-Seq, Bioinformatics, 25(8), 1026–1032. [2] Salzman, J., Jiang, H., Wong, W. H. (2011) Statistical Modeling of RNA-Seq Data, Statistical Science, 26 (1): 62-83.

SAIGE

  • SAIGE is an R package for testing associations between genetic variants and binary phenotypes while adjusting for sample relatedness and case-control imbalance.
  • Faculty: Seunggeun Shawn Lee. Download: Website, Github.

SAMBA-EHR

  • Explore sampling and misclassification biases in association analyses from GWAS/PheWAS using EHR.
  • Download: Website.

SeqAlto

  • Fast and accurate read alignment for resequencing.
  • Faculty: Hui Jiang. Download: Website.
  • References: John C. Mu, Hui Jiang, Amirhossein Kiani, Marghoob Mohiyuddin, Narges Bani Asadi and Wing H. Wong, Fast and Accurate Read Alignment for Resequencing, Bioinformatics, 2012.

SeqMap

  • A tool for mapping millions of short sequences to the genome.
  • Faculty: Hui Jiang. Download: Website.
  • References: Jiang, H., Wong, W.H. (2008) SeqMap: Mapping Massive Amount of Oligonucleotides to the Genome, Bioinformatics, 24(20).

SIBMED

  • SIBMED 1.0 is a FORTRAN 77 program that identifies likely genotyping errors and mutations for a sib pair in the context of multipoint mapping.
  • Faculty: Michael Boehnke. Download: Website.
  • Reference: Douglas J.A. and Boehnke M. SIBMED: A Program that Identifies Likely Genotyping Errors and Mutations for a Sib Pair in the Context of Multipoint Mapping Version 1.0, April 18, 2000.

SIMLINK

  • SIMLINK 4.12 (updated April 1997) is a program for estimating the power of a proposed linkage study by computer simulation.
  • Faculty: Michael Boehnke. Download: Website.

SKAT

  • SKAT is an R package for rare variant association analysis. It can carry out the burden test, SKAT, SKAT-O, and a combined test of common and rare variants while adjusting for covariates and kinship. For binary traits, it can calculate p-values using resampling and asymptotic-based adjustment methods. It also has functions for sample size and power calculations.
  • Faculty: Seunggeun Shawn Lee. Download: Website, Github, CRAN.

SNIPPER

  • Look up information on genes near SNPs of interest from public databases.
  • Download: Website.

SNP-HWE

  • Fast exact Hardy-Weinberg Equilibrium test for SNPs as described in Wigginton, et al. (2005).
  • Faculty: Goncalo Abecasis. Download: Website.
  • Reference: Wigginton, J.E., Cutler, D.J. and Abecasis, G.R., 2005. A note on exact tests of Hardy-Weinberg equilibrium. The American Journal of Human Genetics, 76(5), pp.887-893.
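
As a rough illustration of the kind of exact test this entry refers to, the following Python sketch computes a two-sided exact Hardy-Weinberg p-value from genotype counts using the recurrence over heterozygote counts described in the reference. It is not the SNP-HWE code itself; the function name hwe_exact_pvalue and its argument names are assumptions made for this example.

```python
def hwe_exact_pvalue(obs_hets, obs_hom1, obs_hom2):
    """Two-sided exact HWE p-value from genotype counts (illustrative sketch)."""
    if min(obs_hets, obs_hom1, obs_hom2) < 0:
        raise ValueError("genotype counts must be non-negative")
    n = obs_hets + obs_hom1 + obs_hom2          # number of individuals
    if n == 0:
        return 1.0

    obs_homr = min(obs_hom1, obs_hom2)          # rare homozygotes
    rare = 2 * obs_homr + obs_hets              # rare allele copies

    # Unnormalised probability of each possible heterozygote count.
    probs = [0.0] * (rare + 1)

    # Start near the expected heterozygote count, matching the parity of `rare`.
    mid = rare * (2 * n - rare) // (2 * n)
    if mid % 2 != rare % 2:
        mid += 1
    probs[mid] = 1.0

    # Recurse downwards: two fewer heterozygotes, one more of each homozygote.
    hets, homr = mid, (rare - mid) // 2
    homc = n - hets - homr
    while hets >= 2:
        probs[hets - 2] = probs[hets] * hets * (hets - 1) / (
            4.0 * (homr + 1) * (homc + 1))
        hets, homr, homc = hets - 2, homr + 1, homc + 1

    # Recurse upwards: two more heterozygotes, one fewer of each homozygote.
    hets, homr = mid, (rare - mid) // 2
    homc = n - hets - homr
    while hets <= rare - 2:
        probs[hets + 2] = probs[hets] * 4.0 * homr * homc / (
            (hets + 2) * (hets + 1))
        hets, homr, homc = hets + 2, homr - 1, homc - 1

    total = sum(probs)
    probs = [p / total for p in probs]

    # Sum the probabilities of all outcomes no more likely than the observed one.
    p_obs = probs[obs_hets]
    return min(1.0, sum(p for p in probs if p <= p_obs))
```

For example, hwe_exact_pvalue(0, 1, 1) considers the two possible configurations for two individuals carrying two copies of each allele (both heterozygous, or one of each homozygote) and returns the probability of the less likely one, 1/3.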

SPARK

  • SPARK is a method for detecting genes with spatial expression patterns in spatially resolved transcriptomic studies.
  • Faculty: Xiang Zhou. Download: Github, Website.
  • Reference: Shiquan Sun*, Jiaqiang Zhu* and Xiang Zhou (2019). Statistical analysis of spatial expression pattern for spatially resolved transcriptomic studies.

SPAtest

  • SPAtest is an R-package to perform score test for associations between genetic variants and binary traits using saddlepoint approximation. The methods implemented in the package (FastSPA) can accurately calculate p-values even when the case-control ratio is extremely unbalanced. 
  • Faculty: Seunggeun Shawn Lee. Download: Website, CRAN.

SpliceMap

  • SpliceMap is a de novo splice junction discovery and alignment tool. It offers high sensitivity and support for arbitrary RNA-seq read lengths.
  • Faculty: Hui Jiang. Download: Website.
  • Reference: Kin Fai Au, Hui Jiang, Lan Lin, Yi Xing, and Wing Hung Wong. Detection of splice junctions from paired-end RNA-seq data by SpliceMap. Nucleic Acids Research, Advance access published on April 5, 2010.

SPOTTER

  • Identify regions of interest (using LD and recombination) near associated variants.
  • Download: Website.

subgxe

  • R package that implements p-value assisted subset testing for association (pASTA), a method developed by Yu et al. (2019) <doi:10.1159/000496867>.
  • Faculty: Bhramar Mukherjee, Xiang Zhou, Seunggeun Shawn Lee. Download: CRAN.
  • References: Yu, Y., Xia, L., Lee, S., Zhou, X., Stringham, H.M., Boehnke, M. and Mukherjee, B., 2018. Subset-Based Analysis using Gene-Environment Interactions for Discovery of Genetic Associations across Multiple Studies or Phenotypes. Human Heredity, 83(6), pp.283-314.

Tnseq

  • Identification of conditionally essential genes using high-throughput sequencing data from transposon mutant libraries.
  • Faculty: Lili Zhao. Download: CRAN.
  • Reference: Zhao, L., Anderson, M.T., Wu, W., Mobley, H.L. and Bachman, M.A., 2017. TnseqDiff: identification of conditionally essential genes in transposon sequencing studies. BMC Bioinformatics, 18(1), p.326.

TORUS

  • QTL discovery utilizing genomic annotations. Computational procedure for discovering molecular QTLs incorporating genomic annotations.
  • Faculty: Xiaoquan William Wen. Download: Github.
  • Reference: Wen, X. Effective QTL Discovery Incorporating Genomic Annotations. bioRxiv doi:10.1101/032003.

TRAFIC

  • TRAFIC (Test for Rare-variant Association using Family-based Internal Controls) tests for rare variant associations in affected sibpairs by comparing the allele count of rare variants on chromosome regions shared identical by descent (IBD) to the allele count of rare variants on non-shared chromosome regions.
  • Faculty: Sebastian Zöllner. Download: Github, Website.
  • References: Lin, K.H. and Zöllner, S., 2015. Robust and powerful affected sibpair test for rare variant association. Genetic Epidemiology, 39(5), pp.325-333.

TreeLD

  • Infer the ancestry of a genomic region and analyze it for signals of disease mutations.
  • Download: Website.

TransMeta & TransMetaRare

  • TransMeta is an R package to compute single-SNP p-values for trans-ethnic meta-analysis using a kernel-based random effect model. This is an early version, and we will keep updating it. We have recently extended it to a gene-based rare-variant test (TransMetaRare). Both packages can be downloaded from Github.
  • Faculty: Seunggeun Shawn Lee. Download: Website, TransMetaRare Github.

VerifyBamID

  • Software that verifies whether the reads in a particular file match previously known genotypes for an individual (or group of individuals), and checks whether the reads are contaminated as a mixture of two samples.
  • Faculty: Michael Boehnke. Download: Website.
  • Reference: G. Jun, M. Flickinger, K. N. Hetrick, J. M. Romm, K. F. Doheny, G. Abecasis, M. Boehnke, and H. M. Kang, Detecting and Estimating Contamination of Human DNA Samples in Sequencing and Array-Based Genotype Data, American Journal of Human Genetics, doi:10.1016/j.ajhg.2012.09.004 (volume 91, issue 5, pp. 839-848).

VIPER

  • VIPER is a method that performs Variability Preserving ImPutation for Expression Recovery in single cell RNA sequencing studies.
  • Faculty: Xiang Zhou. Download: Github, Website.
  • Reference: Mengjie Chen and Xiang Zhou (2018). VIPER: variability-preserving imputation for accurate gene expression recovery in single-cell RNA sequencing studies. Genome Biology. 19:196.

WHODAD

  • WHODAD is a software package implementing the WHODAD method for paternity inference from low-coverage sequencing data. 
  • Faculty: Xiang Zhou. Download: Website.
  • Reference: Noah Snyder-Mackler, William H Majoros, Michael L Yuan, Amanda O Shaver, Jacob B Gordon, Gisela H Kopp, Stephen A Schlebusch, Jeffrey D Wall, Susan C Alberts, Sayan Mukherjee, Xiang Zhou and Jenny Tung (2016). Efficient genome-wide sequencing and low-coverage pedigree analysis from non-invasively collected samples. Genetics. 203: 699-714.

WINNER

  • WINNER 1.1 (updated Feb 2009) is a program for correcting the winner's curse effect in genetic association studies.
  • Faculty: Michael Boehnke. Download: Website.
  • Reference: Rui Xiao and Michael Boehnke 2009. Quantifying and Correcting for the Winner's Curse in Genetic Association Studies. Genetic Epidemiology 33:453-462.