Symbolic Data Analysis: Distributions are the Numbers of the Future

Abstract:

Massively large data sets are routine and ubiquitous given modern computer capabilities. What is not so routine is how to analyse these data. One approach is to aggregate the data sets according to some scientific criteria. The resultant data are perforce symbolic data, i.e., lists, intervals, histograms, and so on. Applications abound, especially in the medical and social sciences. Other data sets (small or large in size) are naturally symbolic valued, such as species data, data with measurement uncertainties, confidential data, and the like. Unlike classical data, which are points in p-dimensional space, symbolic data are hypercubes or Cartesian products of distributions in p-dimensional space. We describe such data and how they arise. Through illustrations, we look briefly at some of the differences between classical and symbolic data and their respective methodologies.

Speaker

Prof Lynne Billard

Research Area

Statistics Seminar

Affiliation

University of Georgia (USA)

Date

Fri, 27/07/2012 - 4:00pm to 5:00pm

Venue

OMB-145, Old Main Building, UNSW Kensington Campus
