Abstract

Missing data is a pervasive issue in applied statistics, impacting nearly every aspect of data analysis. Common ad hoc strategies for restoring the rectangular structure of data sets include complete-case analysis and hot-deck imputation. More principled alternatives, such as likelihood-based methods, model-based imputations, and propensity score adjustments, have also been extensively studied. In this seminar, we examine regression analysis on data sets whose missing values have been imputed non-parametrically using weighted k-nearest neighbours (kNN). This approach contrasts with many existing methods, which impute the missing responses using model estimates trained on complete cases. We present two bias-correction techniques, prove consistency of the resulting estimator, and identify conditions under which regression on the imputed data set outperforms inference restricted to complete cases.

This is joint work with Scott Sisson (UNSW), Boris Beranger (UNSW), and Siu-Ming Tam (ABS).

Speaker

Prosha Rahman

Research Area

Statistics seminar

Affiliation

UNSW, Sydney

Date

Friday, 11 April 2025, 4:00 pm

Venue

Microsoft Teams / Anita B. Lawrence 4082