Nowadays, the explosive growth of data provides an unprecedented opportunity to apply machine learning methods in various application domains. However, high-dimensional data representations pose the curse-of-dimensionality challenge to machine learning models. Sparse models offer a tool that alleviates this challenge by learning low-dimensional feature and model representations. Traditionally, a sparse model is learned by penalizing the $\ell_1$-norm of the model parameter during optimization. Recent research on sparse model learning studies more accurate ways of encoding the sparsity level as prior knowledge; representative work includes $k$-support-norm regularized minimization and $\ell_0$-constrained minimization.
If the training loss is convex, minimizing it with a $k$-support-norm regularizer on the model parameter remains a convex optimization problem. In Chapter 2 we introduce a fully corrective Frank-Wolfe type algorithm, called $k$-FCFW, for $k$-support-norm regularized sparse model learning.
We reformulate the regularized minimization as a constrained minimization task, and then apply the proposed algorithm to solve the reformulated problem. We compare the per-iteration complexity of the proposed $k$-FCFW algorithm with that of the proximal gradient algorithm, which is conventionally used to solve the original problem. One theoretical contribution is a linear convergence rate established for the proposed algorithm under standard assumptions.
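To illustrate the Frank-Wolfe machinery involved, the sketch below implements the linear minimization oracle over a $k$-support-norm ball (whose unit ball is the convex hull of $k$-sparse unit-$\ell_2$ vectors) inside a plain Frank-Wolfe loop with the classic $2/(t+2)$ step size. This is a simplified sketch, not the proposed algorithm itself: the fully corrective $k$-FCFW variant would re-optimize over all atoms collected so far instead of taking a fixed convex-combination step, and the least-squares setup below is an illustrative assumption.

```python
import numpy as np

def ksupp_lmo(grad, k, tau):
    """Linear minimization oracle over the k-support-norm ball of radius tau.

    The unit ball of the k-support norm is the convex hull of vectors with
    at most k nonzeros and unit l2 norm, so the oracle keeps the k largest-
    magnitude gradient entries, l2-normalizes them, and flips the sign.
    """
    s = np.zeros_like(grad)
    top = np.argsort(np.abs(grad))[-k:]      # indices of the top-k entries
    nrm = np.linalg.norm(grad[top])
    if nrm == 0.0:                           # zero gradient: any feasible point works
        return s
    s[top] = -grad[top] / nrm
    return tau * s

def frank_wolfe(grad_f, w0, k, tau, n_iter=200):
    """Plain Frank-Wolfe with the k-support LMO and step size 2/(t+2).

    Simplified sketch: the fully corrective variant would instead re-solve
    the subproblem over the convex hull of all atoms found so far.
    """
    w = w0.copy()
    for t in range(n_iter):
        s = ksupp_lmo(grad_f(w), k, tau)
        gamma = 2.0 / (t + 2.0)
        w = (1.0 - gamma) * w + gamma * s    # convex-combination update
    return w
```

Each iterate is a convex combination of $k$-sparse atoms, which is what makes the per-iteration cost of the oracle (a top-$k$ selection) cheap compared with a full proximal step.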
Alternatively, the $\ell_0$-norm of the model parameter can be used directly as a constraint in the learning objective.
However, in this case the problem is non-convex and NP-hard because of the $\ell_0$-norm constraint, even when the training loss is convex. To trade off between solution accuracy and efficiency, several primal-domain greedy algorithms have been proposed, and their properties, such as upper bounds on the model estimation error and support recovery guarantees, have been analyzed in the literature. In Chapter 3 we introduce the proposed dual-space algorithm for the sparsity-constrained $\ell_2$-norm regularized finite-sum loss minimization problem. The sparse duality theory established in this work sets up sufficient and necessary conditions under which the original non-convex problem can be equivalently solved in a concave dual formulation. The dual iterative hard thresholding (DIHT) algorithm and its stochastic variant are proved to recover the parameter support without the Restricted Isometry Property (RIP) condition.
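For intuition, here is a minimal primal analogue: plain iterative hard thresholding (IHT) for an $\ell_0$-constrained least-squares loss. DIHT itself runs hard-thresholding-style updates in the dual space; this primal sketch only illustrates the hard-thresholding projection that both share, and the least-squares loss and step-size choice are illustrative assumptions.

```python
import numpy as np

def hard_threshold(x, k):
    """Project x onto the set of k-sparse vectors: keep the k largest-magnitude entries."""
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    out[idx] = x[idx]
    return out

def iht_least_squares(A, y, k, step=None, n_iter=500):
    """Primal IHT for min_w 0.5*||Aw - y||^2  s.t.  ||w||_0 <= k.

    Each iteration takes a gradient step and projects the iterate onto
    the (non-convex) set of k-sparse vectors via hard thresholding.
    """
    if step is None:
        step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        w = hard_threshold(w - step * A.T @ (A @ w - y), k)
    return w
```

The projection step is what makes the iteration non-convex; the RIP-style conditions usually assumed for such primal schemes are exactly what the dual analysis in this chapter avoids.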
Distributed optimization algorithms are used to learn a globally optimal model when training samples reside on different machines. Communication efficiency is an important concern in the design of distributed model training algorithms. Chapter 4 elaborates the proposed Newton-type inexact pursuit algorithm for the $\ell_0$-constrained empirical risk minimization problem. The proposed algorithm alternates between inexactly solving a local sparse learning problem with existing single-machine algorithms, and communicating gradients and model parameters between the master machine and the worker machines.
Our analysis shows that the upper bound on the model estimation error contracts at a linear rate across communication rounds.
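The master/worker loop can be mimicked in a single process. The sketch below simulates the communication pattern only: workers send local gradients, the master aggregates them, updates, and broadcasts a $k$-sparse iterate. It substitutes a plain gradient step for the Newton-type inexact local solve, and the least-squares shards, step size, and function names are illustrative assumptions.

```python
import numpy as np

def hard_threshold(x, k):
    """Project x onto the set of k-sparse vectors: keep the k largest-magnitude entries."""
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    out[idx] = x[idx]
    return out

def distributed_sparse_fit(parts, k, step=0.5, n_rounds=100):
    """Single-process simulation of the master/worker communication pattern.

    `parts` holds one (A_i, y_i) least-squares shard per worker. Each round,
    every worker computes the gradient of its local loss 0.5*||A_i w - y_i||^2,
    the master averages the gradients, takes a step, and broadcasts the
    hard-thresholded (k-sparse) iterate back to the workers.
    """
    w = np.zeros(parts[0][0].shape[1])
    for _ in range(n_rounds):
        grads = [A.T @ (A @ w - y) for A, y in parts]             # one message per worker
        w = hard_threshold(w - step * np.mean(grads, axis=0), k)  # master update + broadcast
    return w
```

Per round, only one gradient per worker and one sparse iterate travel across the network, which is the communication pattern whose cost the chapter's algorithm is designed to keep low.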
In the last part, we conclude this dissertation and discuss future directions for sparse model learning research.