Robust estimation of scale and covariance with Pn and its application to Principal Component Analysis

Abstract:

This talk outlines an intuitive, robust and highly efficient scale estimator, Pn, derived from the difference of two U-quantile statistics based on the same kernel as the Hodges-Lehmann estimate of location, h(x, y) = (x+y)/2 (Tarr, Müller and Weber, 2012). The asymptotic results for Pn in the iid setting follow from nesting Pn inside the class of generalised L-statistics (GL-statistics; Serfling, 1984), which encompasses U-statistics, U-quantile statistics and L-statistics and as such contains numerous well-established scale estimators. The interquartile range, the difference of two quantiles, fits into the family of GL-statistics, as does the standard deviation, which can be written as the square root of a U-statistic with kernel h(x, y) = (x-y)^2/2. The trimmed and Winsorised variances also fall into the class of GL-statistics. The robust scale estimator Qn, introduced by Rousseeuw and Croux (1993), can be represented in terms of a U-quantile statistic corresponding to the kernel h(x, y) = |x-y|. The primary advantage of Pn is its high efficiency at the Gaussian distribution whilst maintaining desirable robustness and efficiency properties at heavy-tailed and contaminated distributions. It will be shown how all of the aforementioned scale estimators can be transformed to covariance estimators through the device proposed by Gnanadesikan and Kettenring (1972). An application involving a simple and intuitive way to robustify principal components based on the Orthogonalised Gnanadesikan-Kettenring procedure (Maronna and Zamar, 2002) will also be presented.
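As a rough illustration of the two constructions described in the abstract, the sketch below computes a Pn-style scale estimate as the scaled interquartile range of the pairwise means (x_i + x_j)/2, and then turns any scale estimator into a covariance estimate via the Gnanadesikan-Kettenring identity cov(X, Y) = [s(X+Y)^2 - s(X-Y)^2]/4. The function names `pn_scale` and `gk_covariance`, the constant name `GAUSS_FACTOR`, and the choice of sample-quantile convention are assumptions made for this example, not taken from the talk; the published estimator may use a different quantile definition or finite-sample corrections. The consistency factor is derived here from the fact that pairwise means of N(0, sigma^2) data are N(0, sigma^2/2).

```python
from itertools import combinations
from statistics import NormalDist, quantiles

# Assumed Gaussian consistency factor: the pairwise means of N(0, sigma^2)
# data are N(0, sigma^2 / 2), so their interquartile range is
# sqrt(2) * z_{0.75} * sigma. Dividing by sqrt(2) * z_{0.75} recovers sigma.
GAUSS_FACTOR = 1.0 / (2 ** 0.5 * NormalDist().inv_cdf(0.75))


def pn_scale(x):
    """Pn-style scale: scaled IQR of the pairwise means (x_i + x_j) / 2, i < j.

    Naive O(n^2) sketch; needs at least three observations.
    """
    pairwise_means = [(a + b) / 2.0 for a, b in combinations(x, 2)]
    # Quartiles of the pairwise means (illustrative quantile convention).
    q1, _, q3 = quantiles(pairwise_means, n=4, method="inclusive")
    return GAUSS_FACTOR * (q3 - q1)


def gk_covariance(x, y, scale=pn_scale):
    """Gnanadesikan-Kettenring device: covariance from any scale estimator."""
    s_plus = scale([a + b for a, b in zip(x, y)])
    s_minus = scale([a - b for a, b in zip(x, y)])
    return (s_plus ** 2 - s_minus ** 2) / 4.0
```

Passing a different function as `scale` (for example a standard deviation or an interquartile-range-based estimator) gives the corresponding covariance estimate through the same identity. Enumerating all pairs costs O(n^2) time and memory, so this form is only suitable for small samples.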

Speaker

Garth Tarr

Research Area

Statistics Seminar

Affiliation

The University of Sydney

Date

Fri, 26/04/2013 - 4:00pm to 5:00pm

Venue

OMB-145, Old Main Building, UNSW Kensington Campus
