LASER

Locating Ancestry from Sequence Reads

LASER is a program to estimate individual ancestry by directly analyzing shotgun sequence reads without calling genotypes. LASER uses principal components analysis (PCA) and Procrustes analysis to analyze sequence reads of each sample and place the sample into a reference PCA space constructed using genotypes of a set of reference individuals. With an appropriate reference panel, the estimated coordinates of the sequence samples reflect their ancestral background and can be used to correct for population stratification in association studies. LASER can accurately estimate ancestry even with modest amounts of data, such as the off-target sequence data generated by targeted sequencing experiments.
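
To make the Procrustes step more concrete, here is a minimal sketch in Python with NumPy. It is not LASER's implementation. It assumes that the same reference individuals have coordinates in two maps: a sample-specific PCA map (X_ref) and the reference PCA space (Y_ref). It fits the scaling, rotation/reflection, and shift that best align the first map to the second, then applies that transformation to a study sample. The function names fit_procrustes and apply_procrustes and all coordinates are hypothetical.

```python
import numpy as np

def fit_procrustes(X, Y):
    """Fit scale rho, orthogonal matrix A, and shift b that minimize
    the squared distance between rho * X @ A + b and Y."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    # The SVD of the cross-product matrix gives the optimal orthogonal map.
    U, s, Vt = np.linalg.svd(Xc.T @ Yc)
    A = U @ Vt
    rho = s.sum() / (Xc ** 2).sum()
    b = y_mean - rho * x_mean @ A
    return rho, A, b

def apply_procrustes(points, rho, A, b):
    """Map points from the sample-specific space into the reference space."""
    return rho * np.asarray(points, dtype=float) @ A + b

# Synthetic example: the sample-specific map is a rotated, scaled, and
# shifted copy of the reference map (assumed here for illustration only).
rng = np.random.default_rng(0)
Y_ref = rng.normal(size=(50, 2))
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
shift = np.array([1.0, -3.0])
X_ref = 2.5 * Y_ref @ R + shift

rho, A, b = fit_procrustes(X_ref, Y_ref)

# A study sample whose true reference-space position is (0.3, -0.2).
study_in_X = 2.5 * np.array([[0.3, -0.2]]) @ R + shift
study_in_Y = apply_procrustes(study_in_X, rho, A, b)
print(study_in_Y)
```

Because the synthetic maps differ by an exact similarity transformation, the study sample lands back at its true position. With real data the fit is only approximate.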

Starting with version 2.0, the software package includes TRACE, a program for tracing an individual's genetic ancestry from genotype data. TRACE follows the same analysis framework as LASER and can accurately place study samples into a reference ancestry space using a relatively small number of genotypes. When used with the same reference panel, LASER and TRACE place sequenced and genotyped samples into the same ancestry space.

LASER can also perform standard PCA on genotype data to explore population structure and to create the reference ancestry space. Different options to compute PC scores and PC loadings have been implemented in the LASER program (version 2.01 or later).
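
As an illustration of what standard PCA on genotype data involves, the sketch below computes PC scores and PC loadings from a small simulated genotype matrix using NumPy. It does not reproduce LASER's options. It assumes genotypes coded as 0/1/2 allele counts with no missing values, and standardizing each locus by its sample standard deviation is a choice made for this example. The function name genotype_pca is hypothetical.

```python
import numpy as np

def genotype_pca(G, k=2):
    """Return the top-k PC scores (individuals x k), PC loadings
    (retained loci x k), and a mask of the loci that were retained."""
    G = np.asarray(G, dtype=float)
    centered = G - G.mean(axis=0)
    sd = centered.std(axis=0)
    keep = sd > 0                      # drop monomorphic loci
    Z = centered[:, keep] / sd[keep]   # standardize each locus
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :k] * s[:k]
    loadings = Vt[:k].T
    return scores, loadings, keep

# Simulated data: two groups of 10 individuals with very different
# allele frequencies at 200 loci (values chosen only for illustration).
rng = np.random.default_rng(0)
group_a = rng.binomial(2, 0.1, size=(10, 200))
group_b = rng.binomial(2, 0.9, size=(10, 200))
G = np.vstack([group_a, group_b])

scores, loadings, keep = genotype_pca(G, k=2)
print(scores[:, 0])  # PC1 scores; the two groups fall on opposite sides of zero
```

Projecting the standardized genotypes onto the loadings reproduces the scores, which is the relationship that lets new individuals be placed into an existing PCA space.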

Comments and suggestions are welcome; please email Chaolong Wang or Gonçalo Abecasis.

If you use LASER, please take a minute to fill out the registration form. We will keep you updated when a new version is released.

References

LASER 1.0 algorithm:
C Wang, X Zhan, J Bragg-Gresham, HM Kang, D Stambolian, E Chew, K Branham, J Heckenlively, The FUSION Study, RS Fulton, RK Wilson, ER Mardis, X Lin, A Swaroop, S Zöllner, GR Abecasis (2014) Ancestry estimation and control of population stratification for sequence-based association studies. Nature Genetics 46:409-415.

LASER 2.0 algorithm:
C Wang, X Zhan, L Liang, GR Abecasis, X Lin (2015) Improved ancestry estimation for both genotyping and sequencing data using projection Procrustes analysis and genotype imputation. American Journal of Human Genetics 96:926-937.

Downloads

Documentation

Notes

The HGDP data in Downloads are based on the Illumina 650K SNP data published by Li et al. (2008, Science 319: 1100-1104). We processed the data as described in our paper (Wang et al. 2014, Nature Genetics 46: 409-415). The main steps were updating genomic coordinates to Build 37, removing tri-allelic SNPs, flipping alleles to the forward strand, and converting the data to the reference genotype format accepted by the LASER program. We post the processed data to assist LASER users. The original data can be downloaded from the Stanford HGDP website.
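
Two of the processing steps above, removing tri-allelic SNPs and flipping alleles to the forward strand, can be sketched in a few lines of Python. This is only an illustration, not the script used to prepare the HGDP panel. The record layout (id, strand, alleles), the SNP names, and the function name clean_snps are hypothetical. Coordinate updates and LASER file formatting are not covered.

```python
# Complementary bases used to flip minus-strand alleles to the forward strand.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def clean_snps(snps):
    """Drop SNPs with more than two alleles and report the remaining
    SNPs with their alleles on the forward (+) strand."""
    cleaned = []
    for snp in snps:
        alleles = sorted(set(snp["alleles"]))
        if len(alleles) > 2:
            continue  # tri-allelic (or worse): remove
        if snp["strand"] == "-":
            alleles = sorted(COMPLEMENT[a] for a in alleles)
        cleaned.append({"id": snp["id"], "strand": "+", "alleles": alleles})
    return cleaned

# Hypothetical input records.
example = [
    {"id": "snp1", "strand": "+", "alleles": ["A", "G"]},
    {"id": "snp2", "strand": "-", "alleles": ["A", "G"]},
    {"id": "snp3", "strand": "+", "alleles": ["A", "C", "G"]},
]
result = clean_snps(example)
for snp in result:
    print(snp["id"], snp["alleles"])
```

In this toy input, snp3 is dropped because it has three alleles, and snp2 is reported with the complementary alleles C and T.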

Software History

Details of the version changes are documented in Section 8 of the LASER manual.

  • April 28, 2015 - Upload version 2.02 software package
  • June 5, 2014 - Upload version 2.01 software package
  • May 19, 2014 - Upload version 2.0 software package
  • August 8, 2013 - Upload version 1.03 software package
  • June 19, 2013 - Upload version 1.02 software package
  • March 11, 2013 - Upload version 1.01 software package
  • February 1, 2013 - Upload version 1.0 software package and the HGDP reference panel