Backfitting for large scale crossed random effects regressions
Online

Large-scale genomic and electronic commerce data sets often have a crossed random effects structure, arising from genotypes × environments or customers × products. Regression models with crossed random effects in the error term can be very expensive to fit. The cost of both generalized least squares and Gibbs sampling can easily grow as N^(3/2) (or worse) for N observations. Papaspiliopoulos, Roberts and Zanella (2020) present a collapsed Gibbs sampler that costs O(N), but under an extremely stringent sampling model. We propose a backfitting algorithm to compute a generalized least squares estimate and prove that it costs O(N) under greatly relaxed, though still strict, sampling assumptions. Empirically, the backfitting algorithm costs O(N) under further relaxed assumptions. We illustrate the new algorithm on a ratings data set from Stitch Fix.
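For readers unfamiliar with backfitting, the idea can be sketched on the simplest crossed model, y = mu + a[row] + b[col] + e, with an intercept only. This is an illustrative sketch, not the speaker's algorithm: it assumes the variance ratios lam_a = var(e)/var(a) and lam_b = var(e)/var(b) are known, and the function name backfit_crossed and all of its parameters are hypothetical. Each sweep touches every observation a fixed number of times, so one sweep costs O(N); the number of sweeps needed is what the talk's sampling assumptions address.

```python
import numpy as np

def backfit_crossed(y, row, col, lam_a, lam_b, n_iter=200, tol=1e-10):
    """Backfitting sketch for y = mu + a[row] + b[col] + e (intercept only).

    Assumes known variance ratios lam_a = var(e)/var(a), lam_b = var(e)/var(b).
    Alternately minimizes
        sum (y - mu - a[row] - b[col])^2 + lam_a*sum(a^2) + lam_b*sum(b^2)
    over mu, a and b; the minimizing mu is the GLS intercept.
    """
    n_rows, n_cols = row.max() + 1, col.max() + 1
    cnt_row = np.bincount(row, minlength=n_rows)  # observations per row level
    cnt_col = np.bincount(col, minlength=n_cols)  # observations per column level
    a, b = np.zeros(n_rows), np.zeros(n_cols)
    mu = y.mean()
    for _ in range(n_iter):
        a_old, b_old = a, b
        # Intercept: plain mean of the partial residuals.
        mu = np.mean(y - a[row] - b[col])
        # Row effects: shrunken means of residuals with b held fixed.
        a = np.bincount(row, weights=y - mu - b[col], minlength=n_rows) / (cnt_row + lam_a)
        # Column effects: shrunken means of residuals with a held fixed.
        b = np.bincount(col, weights=y - mu - a[row], minlength=n_cols) / (cnt_col + lam_b)
        if max(np.abs(a - a_old).max(), np.abs(b - b_old).max()) < tol:
            break
    return mu, a, b

# Usage on simulated data (all values below are made up for illustration).
rng = np.random.default_rng(0)
row = rng.integers(0, 50, size=2000)
col = rng.integers(0, 40, size=2000)
y = 1.0 + rng.normal(size=50)[row] + rng.normal(size=40)[col] + rng.normal(size=2000)
mu_hat, a_hat, b_hat = backfit_crossed(y, row, col, lam_a=1.0, lam_b=1.0)
```

The sketch never forms the N x N covariance matrix, which is where the N^(3/2) or worse cost of direct generalized least squares comes from.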

Art Owen, Ph.D. - Stanford University

September 17, 2020
3:30 pm - 5:00 pm
Online
Online URL: https://umich.zoom.us/j/95897108285
Contact Information: Irene Felicetti: ilf@umich.edu

