Choi, Jeongsub. Sparse machine learning methodology and its applications to semiconductor manufacturing processes. Retrieved from https://doi.org/doi:10.7282/t3-3816-bm72
In this dissertation, we present new sparse machine learning methodologies and their applications to semiconductor manufacturing processes. First, we present a new variant of the relevance vector machine (RVM), called the restricted relevance vector machine (RRVM), for incomplete data. Imputation is a common remedy for incomplete data, which otherwise hinders training an RVM model. Imputing in kernel space yields better RVM prediction performance than imputing in the original space, but it sacrifices model sparsity. RRVM restricts its basis to complete instances while still incorporating incomplete instances into training. The experimental results show that RRVM achieves competitive prediction accuracy while maintaining model sparsity.
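The restricted-basis idea can be sketched in a few lines. This is not the dissertation's Bayesian RVM; it is a minimal kernel-ridge stand-in, assuming mean imputation for kernel evaluation, that shows how the design matrix's columns (the basis) can be drawn only from complete instances while all instances, complete or not, contribute rows for training.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 30 instances, 3 features; make 10 instances incomplete.
X = rng.normal(size=(30, 3))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=30)
X_missing = X.copy()
X_missing[rng.choice(30, size=10, replace=False), 1] = np.nan

complete = ~np.isnan(X_missing).any(axis=1)

# Mean imputation here is only a simple stand-in for the dissertation's
# kernel-space treatment of incomplete instances.
col_means = np.nanmean(X_missing, axis=0)
X_imp = np.where(np.isnan(X_missing), col_means, X_missing)

def gaussian_kernel(A, B, gamma=1.0):
    """Gaussian kernel matrix between rows of A and rows of B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

# Restricted basis: columns come only from complete instances, so the
# fitted model is expressed solely in terms of those instances.
Phi = gaussian_kernel(X_imp, X_imp[complete])   # shape (30, n_complete)
w = np.linalg.solve(Phi.T @ Phi + 1e-2 * np.eye(int(complete.sum())),
                    Phi.T @ y)                  # ridge solve, not ARD
y_hat = Phi @ w
```

In the actual RVM, sparsity comes from automatic relevance determination pruning basis functions; the sketch only illustrates where the basis is allowed to come from.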
Next, we propose a new estimation method for Gaussian kernels with incomplete data. Gaussian kernels are used extensively in kernel methods. A recent study estimates the Gaussian kernel with incomplete data through a function of the squared Euclidean distance between incomplete instances, modeling that distance as a sum of independent squared unit-dimensional distances; this overlooks the correlations among the missing unit-dimensional distances. In the proposed method, we model the squared Euclidean distance between incomplete instances as a sum of correlated squared unit-dimensional distances and estimate the Gaussian kernel as the expected kernel value under the resulting distribution of the squared distance. The experimental results show that the proposed method improves the prediction performance of a kernel method when missing components are correlated.
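A Monte Carlo sketch makes the contrast concrete. The distribution below (Gaussian missing entries with correlation 0.8, and the `mu`, `sigma`, `gamma` values) is assumed for illustration, not taken from the dissertation, which derives the expectation analytically rather than by sampling; the point is that the expected kernel averages over correlated draws of the missing dimensions instead of plugging in a single imputed value.

```python
import numpy as np

rng = np.random.default_rng(1)
gamma = 0.5

# Two instances; x is missing dimensions 1 and 2.
x = np.array([0.3, np.nan, np.nan])
z = np.array([1.0, 0.2, -0.4])
mask = np.isnan(x)

# Plug-in estimate: impute the mean and evaluate the kernel once.
mu, sigma = 0.0, 1.0            # assumed marginal of the missing entries
x_plug = np.where(mask, mu, x)
k_plug = np.exp(-gamma * np.sum((x_plug - z) ** 2))

# Expected kernel: average exp(-gamma * D^2) over draws of the missing
# entries, keeping the correlation between the missing dimensions that an
# independence-based estimate ignores.
cov = (sigma ** 2) * np.array([[1.0, 0.8],
                               [0.8, 1.0]])
draws = rng.multivariate_normal([mu, mu], cov, size=20000)
x_draws = np.tile(x, (20000, 1))
x_draws[:, mask] = draws
k_expected = np.mean(np.exp(-gamma * np.sum((x_draws - z) ** 2, axis=1)))
```

Because the kernel is a convex function of the missing entries' contribution to the squared distance, the plug-in value and the expectation generally differ, and the gap grows with the correlation in `cov`.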
Furthermore, we present a new autoencoder for feature extraction from multistep process signals. An autoencoder is a neural network that reconstructs its input while representing it in a lower-dimensional space from which features are obtained, but a standard autoencoder neglects the structure of multistep process signals. The proposed autoencoder extracts features with smooth reconstructions by applying a fusion regularization to neighboring signals, while clipping the penalties caused by the transient changes of the signals between consecutive subprocesses. A case study on virtual metrology for an etching process shows that the proposed method yields features with superior prediction performance.
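The interplay of fusion and clipping can be illustrated on a toy reconstruction. The signal values, the fusion weight `lam`, and the clip level `clip` are all assumed for the sketch; the idea is that differences between neighboring time points are penalized (encouraging smoothness), but each penalty is capped so that a legitimate step change between consecutive subprocesses is not smoothed away.

```python
import numpy as np

# Toy reconstruction of a multistep process signal: one value per time
# point, with a genuine step change between subprocesses at t = 3.
recon = np.array([0.10, 0.12, 0.11, 0.90, 0.92, 0.91])
lam, clip = 1.0, 0.05   # fusion weight and penalty clip (assumed values)

diffs = np.abs(np.diff(recon))
# Fusion term: penalize neighboring differences, but clip each penalty so
# the one large transient contributes at most `clip` to the loss.
fusion_penalty = lam * np.minimum(diffs, clip).sum()
```

Without the clip, the single jump of 0.79 would dominate the penalty and push the autoencoder to blur the subprocess boundary; with it, the jump contributes no more than any other moderately rough neighbor pair.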
Finally, we propose a new regularization, the group-exclusive group lasso (GGL), for automatic exclusive feature group selection in deep neural networks. Through group-level sparsity, the group lasso facilitates selecting feature groups, but it struggles to avoid jointly selecting feature groups that are correlated at the group level and that share their predictive power for a response. GGL enforces exclusive sparsity at the inter-group level so that salient feature groups are selected. The experimental results show that GGL achieves higher feature-group sparsity while maintaining competitive prediction accuracy.
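The two penalties can be contrasted on a toy weight matrix. The abstract does not give GGL's exact form, so the exclusive inter-group term below (an exclusive-lasso-style squared sum of the group norms) is one plausible sketch, not the dissertation's definition; the group partition and weights are likewise assumed.

```python
import numpy as np

# First-layer weights of a network with 3 feature groups of 2 inputs each
# (rows 0-1, 2-3, 4-5) and 2 hidden units.
groups = [np.array([0, 1]), np.array([2, 3]), np.array([4, 5])]
W = np.array([[0.9, 0.0],
              [0.8, 0.1],
              [0.0, 0.0],
              [0.0, 0.0],
              [0.5, 0.4],
              [0.6, 0.3]])

# Group lasso: sum of L2 norms per group -> zeroes out whole groups.
group_norms = np.array([np.linalg.norm(W[g]) for g in groups])
group_lasso = group_norms.sum()

# Exclusive sparsity across groups (illustrative GGL-style term): squaring
# the L1 norm of the group norms penalizes co-activating several groups
# more than concentrating the same magnitude in one group.
ggl_penalty = group_norms.sum() ** 2
```

Under the squared term, keeping both of two correlated groups active costs more than letting one carry their shared predictability, which is the exclusivity the regularizer is after.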