Yao, Yisha. Estimation and inference in high-dimensional models and algorithms for statistical learning. Retrieved from https://doi.org/doi:10.7282/t3-ebmk-7k70
This thesis presents three projects: adaptive estimation in high-dimensional additive models with multi-resolution group Lasso, construction of confidence intervals for the signals in sparse phase retrieval, and extension of the AMP algorithm to general Gaussian designs. The first two projects address estimation and inference in high-dimensional models, respectively; the third concerns algorithm design for the compressed sensing problem.

In the first project, we propose a multi-resolution group Lasso method as a unified approach to regularized estimation in high-dimensional additive models. The method simultaneously achieves or improves existing error bounds and provides new ones, without knowledge of the level of sparsity or the degree of smoothness of the unknown functions. Another contribution of this project is that we establish the empirical groupwise compatibility condition from its theoretical version under a nearly optimal sample size requirement; this empirical condition facilitates theoretical analysis in the random design setting.

In the second project, we provide a general methodology for drawing statistical inferences on target parameters in sparse phase retrieval. The target parameters can be individual signal coordinates (the βk's) or linear combinations of them. Given an initial estimator of the target parameter, we modify it with a bias-correction procedure to achieve the parametric convergence rate and asymptotic normality. The initial estimator can be generated by any of several existing algorithms for sparse phase retrieval.
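The one-step bias-correction idea can be sketched in code. The sketch below is schematic: the toy measurement model, the choice of score direction (a Gauss-Newton-style projection for the quadratic model), and the scaling are illustrative assumptions, not necessarily the exact procedure developed in the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sparse phase retrieval data: y_i = (a_i @ beta)^2 + noise.
n, p = 400, 50
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]
A = rng.standard_normal((n, p))
y = (A @ beta) ** 2 + 0.1 * rng.standard_normal(n)

def debias_coordinate(beta_init, A, y, k):
    """One-step bias correction for coordinate k (schematic sketch).

    Projects the residual of the quadratic model onto a score
    direction for coordinate k; the actual direction and scaling
    used in the thesis may differ.
    """
    fitted = A @ beta_init
    residual = y - fitted ** 2
    # Score for coordinate k of the quadratic model (illustrative choice).
    score = 2.0 * fitted * A[:, k]
    return beta_init[k] + np.sum(residual * score) / np.sum(score ** 2)

# Starting from an initial estimate, the correction recenters coordinate k.
beta_db_0 = debias_coordinate(beta, A, y, 0)
```

Because the correction is a linear functional of the residuals, its asymptotic variance can be estimated from the same quantities, which is what makes interval construction possible.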
Consequently, confidence intervals and hypothesis-testing procedures can be constructed from this de-biased estimator. We focus on constructing confidence intervals in this work, while the idea extends naturally to hypothesis testing.
Under mild assumptions on the signal and sample size, theoretical guarantees are established for the proposed method. These assumptions are mild in the sense that they allow the dimension to exceed the sample size and permit many small non-zero coordinates. Furthermore, the theoretical analysis reveals that the de-biased estimators for individual coordinates have uniformly bounded variance, so simultaneous interval estimation is justified. Numerical simulations in a wide range of settings support our theoretical results.

In the third project, we develop an AMP algorithm for compressed sensing with correlated Gaussian designs. The Approximate Message Passing (AMP) algorithm was originally designed to solve the compressed sensing (sparse signal recovery) problem with standard Gaussian designs. It is much faster than existing convex procedures while achieving a comparable sparsity-undersampling tradeoff. However, the original AMP algorithm relies heavily on independence among the covariates. Moreover, implementing it requires knowledge of the empirical distribution of the true signal, which is unrealistic in most applications. We modify the original AMP algorithm so that it applies to general Gaussian designs under mild regularity conditions on the population covariance matrix. Furthermore, the modified AMP does not require knowledge of the empirical distribution of the true signal. Heuristic theoretical analysis is provided, and numerical simulations exhibit desirable properties of the modified algorithm.
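For context, the classical AMP iteration for standard Gaussian designs, which the third project generalizes, can be sketched as follows. The soft-thresholding denoiser and the threshold rule alpha * tau are common tuning choices assumed here for illustration; the thesis's modified algorithm for correlated designs is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)

def soft_threshold(v, t):
    """Componentwise soft thresholding: shrink toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def amp(A, y, n_iter=50, alpha=1.5):
    """Classical AMP with soft thresholding for y = A x + noise.

    Assumes A has i.i.d. N(0, 1/n) entries. The Onsager term
    (onsager * z below) is what distinguishes AMP from plain
    iterative soft thresholding and keeps the effective noise
    approximately Gaussian across iterations.
    """
    n, p = A.shape
    delta = n / p                              # undersampling ratio
    x = np.zeros(p)
    z = y.copy()
    for _ in range(n_iter):
        tau = np.sqrt(np.mean(z ** 2))         # empirical noise scale
        pseudo = x + A.T @ z                   # approx x + N(0, tau^2) noise
        x_new = soft_threshold(pseudo, alpha * tau)
        onsager = np.mean(np.abs(x_new) > 0) / delta
        z = y - A @ x_new + onsager * z        # residual with Onsager correction
        x = x_new
    return x

# Toy noiseless problem: k-sparse signal, undersampled Gaussian design.
n, p, k = 200, 400, 10
A = rng.standard_normal((n, p)) / np.sqrt(n)
x0 = np.zeros(p)
x0[rng.choice(p, k, replace=False)] = rng.choice([-1.0, 1.0], size=k)
y = A @ x0
xhat = amp(A, y)
```

Replacing the i.i.d. design with a correlated one breaks the Gaussianity of the pseudo-data in this sketch, which is precisely the difficulty the modified algorithm addresses.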