Prof Art Owen
Abstract
Large-scale genomic and electronic commerce data sets often have a crossed random effects structure, arising from genotypes × environments or customers × products. Naive methods of handling such data will produce inferences that do not generalize. Regression models that properly account for crossed random effects can be very expensive to compute. The cost of both generalized least squares and Gibbs sampling can easily grow as N^(3/2) (or worse) for N observations. Papaspiliopoulos, Roberts and Zanella (2020) present a collapsed Gibbs sampler that costs O(N), but under an extremely stringent sampling model. We propose a backfitting algorithm to compute a generalized least squares estimate and prove that it costs O(N) under greatly relaxed, though still strict, sampling assumptions. Empirically, the backfitting algorithm costs O(N) under further relaxed assumptions. We illustrate the new algorithm on a ratings data set from Stitch Fix.
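For readers unfamiliar with backfitting, the idea in the abstract can be illustrated with a minimal sketch. It is not the speakers' implementation, and it makes several assumptions not stated above: the model is y = Xβ + a[row] + b[col] + e, the variance ratios lam_a = σ_E²/σ_A² and lam_b = σ_E²/σ_B² are treated as known, and the function name `backfit_crossed` and its arguments are hypothetical. Each sweep touches every observation a fixed number of times, so one sweep costs O(N); how many sweeps are needed depends on the sampling pattern.

```python
import numpy as np

def backfit_crossed(y, X, row, col, lam_a, lam_b, n_iter=500, tol=1e-10):
    """Illustrative backfitting for y = X @ beta + a[row] + b[col] + noise.

    Assumes known variance ratios lam_a = var_e / var_a and
    lam_b = var_e / var_b (both > 0) and a full-rank X. It minimizes
    ||y - X beta - a[row] - b[col]||^2 + lam_a ||a||^2 + lam_b ||b||^2
    by cycling over beta, a and b; the minimizing beta is the GLS estimate.
    """
    n_rows, n_cols = row.max() + 1, col.max() + 1
    n_i = np.bincount(row, minlength=n_rows)   # observations per row level
    n_j = np.bincount(col, minlength=n_cols)   # observations per column level
    beta = np.zeros(X.shape[1])
    a = np.zeros(n_rows)
    b = np.zeros(n_cols)
    XtX = X.T @ X
    for _ in range(n_iter):
        beta_old = beta.copy()
        # Fixed effects: least squares on the partial residual.
        beta = np.linalg.solve(XtX, X.T @ (y - a[row] - b[col]))
        # Row effects: shrunken means of the partial residual.
        r = y - X @ beta - b[col]
        a = np.bincount(row, weights=r, minlength=n_rows) / (n_i + lam_a)
        # Column effects: shrunken means of the partial residual.
        r = y - X @ beta - a[row]
        b = np.bincount(col, weights=r, minlength=n_cols) / (n_j + lam_b)
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    return beta, a, b
```

On a small example the result can be checked against a dense generalized least squares solve, which is exactly the computation that becomes too expensive at scale.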
This is joint work with Swarnadip Ghosh and Trevor Hastie of Stanford University.
The talk's recording is available here. Due to an unforeseen internet outage during the talk, we were able to record only about two-thirds of it.
Statistics Across Campuses
Stanford University
Thursday, 2 February 2023, 2pm
RC-4082 and Zoom (link below with passcode: 017349)