Prosha Rahman
Abstract
Missing data is a pervasive issue in applied statistics, affecting nearly every aspect of data analysis. Common ad hoc strategies for restoring the rectangular structure of a data set include complete-case analysis and hot-deck imputation. More principled alternatives, such as likelihood-based methods, model-based imputation, and propensity score adjustments, have also been studied extensively. In this seminar, we examine regression analysis on data sets whose missing values have been non-parametrically imputed using weighted k-nearest neighbours (kNN). This contrasts with many existing methods, which impute the missing responses using estimates from models fitted to the complete cases. We present two bias-correction techniques, prove consistency of the resulting estimator, and identify conditions under which regression on the imputed data set outperforms inference restricted to complete cases.
This is joint work with Scott Sisson (UNSW), Boris Beranger (UNSW), and Siu-Ming Tam (ABS).
Statistics seminar
UNSW, Sydney
Friday, 11 April 2025, 4:00 pm
Microsoft Teams / Anita B. Lawrence 4082