The spatial median and the spatial sign covariance matrix (SSCM) are widely used robust alternatives for estimating the location vector and scatter matrix when outliers are present or when the data are believed to arise from a distribution that is not multivariate normal. When the underlying distribution is elliptical, it has been observed that these estimators perform better under certain scatter structures. This dissertation is a detailed study of the efficiencies of the spatial median and the SSCM under the elliptical model, in particular the dependence of their efficiencies on the population scatter matrix. For the spatial median, it is shown that its asymptotic efficiency relative to the MLE of the location vector is greatest when the population scatter matrix is proportional to the identity matrix. Furthermore, it is possible to construct an affinely equivariant version of the spatial median that is asymptotically more efficient than the spatial median. Asymptotic relative efficiencies of these two estimators are calculated to demonstrate how inefficient the spatial median can become as the underlying scatter structure moves farther from spherical. A simulation experiment is carried out to provide evidence of an analogous result for finite samples. When the goal is to estimate eigenprojection matrices, it is proven that under the elliptical model the eigenprojection estimates obtained from the Tyler matrix are asymptotically more efficient than those obtained from the SSCM. Calculations of asymptotic relative efficiencies are presented to demonstrate the loss of efficiency incurred by using the eigenprojection estimates of the SSCM rather than those of the Tyler matrix, particularly when the scatter structure of the data is far from spherical. To assess the performance of these estimators in the finite-sample setting, the notion of principal angles is used to define a means of comparing eigenprojection estimators. Using this concept, simulations are carried out whose results support finite-sample conclusions similar to those for the asymptotic case. The implications of the above results are discussed, particularly for the application of principal component analysis. Future research directions are then proposed.
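As an illustration of the three objects the abstract refers to, the following is a minimal sketch, not the dissertation's own code. It assumes a Weiszfeld-type iteration for the spatial median, the usual definition of the SSCM as the average outer product of centered, unit-normalized observations, and principal angles computed from the singular values of the product of orthonormal bases. The function names, tolerances, and simulated scatter matrix are hypothetical choices made for this example.

```python
import numpy as np

def spatial_median(X, tol=1e-8, max_iter=500):
    """Weiszfeld-type iteration for the spatial median of the rows of X (assumed algorithm)."""
    mu = X.mean(axis=0)
    for _ in range(max_iter):
        d = np.linalg.norm(X - mu, axis=1)
        d = np.maximum(d, 1e-12)  # guard against division by zero at a data point
        w = 1.0 / d
        new_mu = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu

def sscm(X, center):
    """Spatial sign covariance matrix: mean outer product of unit-length centered rows."""
    Z = X - center
    norms = np.linalg.norm(Z, axis=1)
    keep = norms > 0  # rows equal to the center have no defined sign
    S = Z[keep] / norms[keep, None]
    return S.T @ S / S.shape[0]

def principal_angles(A, B):
    """Principal angles (radians) between the column spaces of A and B."""
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(s, -1.0, 1.0))

if __name__ == "__main__":
    # Hypothetical example: normal data with a far-from-spherical diagonal scatter.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((500, 3)) * np.sqrt([10.0, 1.0, 1.0])
    center = spatial_median(X)
    S = sscm(X, center)
    _, eigvecs = np.linalg.eigh(S)   # eigenvalues in ascending order
    top = eigvecs[:, [-1]]           # estimated leading eigenvector
    truth = np.eye(3)[:, [0]]        # true leading eigenvector of the scatter
    print("principal angle to truth (radians):", principal_angles(top, truth)[0])
```

The principal angle between the estimated and true leading eigenspaces gives a single number per replication, which is one way eigenprojection estimators could be compared across simulated samples.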
Subject (authority = RUETD)
Topic
Statistics and Biostatistics
RelatedItem (type = host)
TitleInfo
Title
Rutgers University Electronic Theses and Dissertations
Rutgers University. Graduate School - New Brunswick