Massively large data sets are routine and ubiquitous given modern computer capabilities. What is not so routine is how to analyse these data. One approach is to aggregate the data sets according to some scientific criteria. The resultant data are perforce symbolic data, e.g., lists, intervals, histograms, and so on. Applications abound, especially in the medical and social sciences. Other data sets (small or large in size) are naturally symbolic valued, such as species data, data with measurement uncertainties, confidential data, and the like. Unlike classical data, which are points in p-dimensional space, symbolic data are hypercubes or Cartesian products of distributions in p-dimensional space. We describe such data and how they arise. Through illustrations, we look briefly at some of the differences between classical and symbolic data and their respective methodologies.
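
As an informal illustration of the aggregation step described in the abstract, the following minimal Python sketch collapses a handful of classical records into one symbolic observation per group. The records, the grouping by clinic, the variable names and the `aggregate` helper are all hypothetical and are not taken from the talk; interval-valued variables are formed here simply as (min, max) pairs and the list-valued variable as the set of observed categories, which is only one of several possible aggregation choices.

```python
from collections import defaultdict

# Hypothetical classical records: (group, age, pulse, diagnosis).
# Each record is a single point in the classical sense.
records = [
    ("clinic_A", 34, 72, "flu"),
    ("clinic_A", 51, 80, "asthma"),
    ("clinic_A", 45, 66, "flu"),
    ("clinic_B", 23, 90, "asthma"),
    ("clinic_B", 29, 84, "asthma"),
]


def aggregate(rows):
    """Collapse classical rows into one symbolic observation per group."""
    groups = defaultdict(list)
    for group, age, pulse, diagnosis in rows:
        groups[group].append((age, pulse, diagnosis))

    symbolic = {}
    for group, members in groups.items():
        ages = [m[0] for m in members]
        pulses = [m[1] for m in members]
        symbolic[group] = {
            # Interval-valued variables: (lower bound, upper bound).
            "age": (min(ages), max(ages)),
            "pulse": (min(pulses), max(pulses)),
            # List-valued variable: the distinct categories observed.
            "diagnosis": sorted({m[2] for m in members}),
        }
    return symbolic


symbolic_data = aggregate(records)
for group, description in symbolic_data.items():
    print(group, description)
```

In this sketch each group is described by the rectangle age x pulse rather than by a single point, which is the sense in which the aggregated observation is a hypercube in p-dimensional space.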


Prof Lynne Billard

Research Area

Statistics Seminar


University of Georgia (USA)


Fri, 27/07/2012 - 4:00pm to 5:00pm


OMB-145, Old Main Building, UNSW Kensington Campus