Distribution-free fault identification and anomaly detection in high-dimensional data

Turkoz, Mehmet

doi:10.7282/T3XS5ZV3

RUcore: Rutgers University Community Repository

Simple citation

Turkoz, Mehmet. Distribution-free fault identification and anomaly detection in high-dimensional data. Retrieved from https://doi.org/doi:10.7282/T3XS5ZV3

Description

Title: Distribution-free fault identification and anomaly detection in high-dimensional data

Name: Turkoz, Mehmet (author); Jeong, Myong K. (chair); Elsayed, Elsayed A. (internal member); Pham, Hoang (internal member); Xie, Minge (outside member); Rutgers University; School of Graduate Studies

Date Created: 2018

Other Date: 2018-05 (degree)

Subject: Industrial and Systems Engineering, Manufacturing processes, Big data

Extent: 1 online resource (xviii, 183 p. : ill.)

Description: Quality engineering is an essential activity in production processes, and its objective is to ensure the quality of the products throughout the production stages. Many processes have several attributes that need to be continuously monitored to detect any changes in the variables of the production process. We refer to the monitoring of a process with several quality characteristics as multivariate statistical process control (MSPC). Most quality control procedures assume that the characteristics of the process follow normal distributions; however, this is a limiting assumption, since the underlying distribution of the process may not be normal. In this dissertation, we present procedures to identify the faulty variables and detect anomalies in MSPC with high-dimensional data when the underlying distribution of the process is unknown. We first propose a distribution-free adaptive step-down (DFASD) procedure, which is motivated by a well-known data description method called support vector data description (SVDD). SVDD uses the kernel concept to find the support vectors that define a hypersphere boundary around the available data. In a high-dimensional process, identifying the variable or subset of variables that causes an out-of-control (OC) signal is a challenging issue in quality engineering. The DFASD procedure uses conditional statistics to identify the faulty variables. At each step, it selects a variable that shows no significant evidence of a change, based on the variables selected in the previous steps. The procedure stops when no more variables can be classified into the unchanged set, and it concludes that the variables not in the unchanged set are the changed variables. We then present a new distribution-free fault identification procedure based on Bayesian inference, which is called Bayesian SVDD (BSVDD).
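The hypersphere idea behind SVDD can be illustrated with a minimal sketch. This is not the dissertation's DFASD or BSVDD procedure. It is a plain hard-margin SVDD (the smallest enclosing ball in kernel feature space), fitted with simple Frank-Wolfe style updates rather than a quadratic programming solver. The function names (`rbf`, `fit_svdd`, `is_out_of_control`), the Gaussian kernel width `gamma`, the iteration count, and the synthetic training data are all assumptions made for illustration.

```python
import math
import random

def rbf(x, y, gamma=0.5):
    """Gaussian (RBF) kernel between two points given as sequences of floats."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def fit_svdd(X, gamma=0.5, iters=200):
    """Hard-margin SVDD sketch: smallest enclosing ball in kernel feature space.

    The dual weights alpha stay non-negative and sum to one; each iteration
    shifts weight toward the training point farthest from the current centre.
    """
    n = len(X)
    K = [[rbf(X[i], X[j], gamma) for j in range(n)] for i in range(n)]
    alpha = [1.0 / n] * n
    for t in range(1, iters + 1):
        Ka = [sum(K[i][j] * alpha[j] for j in range(n)) for i in range(n)]
        # Squared distance to the centre, up to a term shared by all points.
        far = max(range(n), key=lambda i: K[i][i] - 2.0 * Ka[i])
        step = 1.0 / (t + 1)
        alpha = [(1.0 - step) * a for a in alpha]
        alpha[far] += step
    const = sum(alpha[i] * alpha[j] * K[i][j] for i in range(n) for j in range(n))

    def dist2(z):
        """Squared feature-space distance from z to the hypersphere centre."""
        cross = sum(a * rbf(x, z, gamma) for a, x in zip(alpha, X))
        return rbf(z, z, gamma) - 2.0 * cross + const

    # Hard margin: the radius is set by the farthest training point.
    radius2 = max(dist2(x) for x in X)
    return dist2, radius2, alpha

# Synthetic in-control data (assumed for illustration only).
random.seed(0)
train = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(40)]
dist2, radius2, alpha = fit_svdd(train)

def is_out_of_control(z):
    """Signal when z falls outside the fitted hypersphere."""
    return dist2(z) > radius2
```

Because the radius is taken from the farthest training point, every training observation lies inside the boundary in this sketch. A soft-margin SVDD would instead allow a fraction of training points outside it.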
While the traditional SVDD assumes that the process parameters are constants to be determined, the center of the hypersphere may be considered as a random vector with inherent randomness based on a given training dataset. We introduce a Bayesian approach for SVDD by assuming that the data transformed into the higher-dimensional space follow a normal distribution. In the proposed model, the distance from a point to the center of the hypersphere is inversely proportional to the likelihood. This is because SVDD is a special case of the proposed BSVDD model, which improves SVDD by utilizing the precise prior knowledge. Therefore, by combining the proposed BSVDD with an adaptive step-down procedure, we derive a new BSVDD-based fault identification procedure for MSPC. This is the first research to identify the faulty variables by using a distribution-free approach based on Bayesian inference. We also present an anomaly detection procedure that is easily applicable to detecting anomalies in multimode processes. Traditional quality control procedures assume that normal observations are obtained from a single distribution. However, due to the complexities of modern industrial processes, the observations might come from multiple operating modes. In other words, normal observations may be obtained from more than one distribution. In such cases, conventional quality control procedures might trigger false alarms while the process is in fact in another operating mode. We propose a generalized support vector-based anomaly detection procedure called n-class SVDD, which can be used to determine the anomalies in multimode processes. The proposed procedure constructs n hyperspheres by considering the relationship among modes. In addition, we introduce a generalized Bayesian framework by considering not only the prior information from each mode but also the relationships among the modes. Finally, we present a new Bayesian procedure for anomaly detection in multi-class data.
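The multimode problem described above can be made concrete with a small sketch. It is a deliberate simplification and not the n-class SVDD of the dissertation: each mode gets its own input-space sphere (centre at the mean, radius at the farthest training point), the spheres are fitted independently without modelling relationships among modes, and no kernel is used. The names `fit_sphere` and `is_anomaly` and the toy data are assumptions made for illustration.

```python
import math

def euclid(p, q):
    """Euclidean distance between two points given as sequences of floats."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def fit_sphere(points):
    """Describe one operating mode by a centre (mean) and a radius (max distance)."""
    dim = len(points[0])
    centre = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    radius = max(euclid(p, centre) for p in points)
    return centre, radius

def is_anomaly(z, spheres):
    """Flag z only if it lies outside the sphere of every known mode."""
    return all(euclid(z, c) > r for c, r in spheres)

# Two assumed operating modes of the same process (toy data).
mode_a = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
mode_b = [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0)]

per_mode = [fit_sphere(mode_a), fit_sphere(mode_b)]  # one sphere per mode
only_a = [fit_sphere(mode_a)]                        # single-mode assumption
pooled = [fit_sphere(mode_a + mode_b)]               # one sphere over all data

in_mode_b = (10.5, 10.5)      # normal observation from the second mode
between_modes = (5.5, 5.5)    # lies in the gap between the two modes
```

With these toy data, a description built from mode A alone flags `in_mode_b` as anomalous, which is the false alarm the abstract describes. A single sphere over the pooled data accepts `between_modes`, while the per-mode spheres reject it.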
Existing anomaly detection procedures mostly take only the information from normal observations into account. However, anomaly information is often available from engineering knowledge and the historical data of the process. The performance of anomaly detection procedures can be improved when the available anomaly data are used to obtain the data description. We propose a multi-class Bayesian SVDD model that takes anomaly data into consideration when the anomaly data are available and an appropriate prior distribution of the anomaly data is obtained.

Note: Ph.D.

Note: Includes bibliographical references

Note: by Mehmet Turkoz

Genre: theses, ETD doctoral

Persistent URL: https://doi.org/doi:10.7282/T3XS5ZV3

Language: eng

Collection: School of Graduate Studies Electronic Theses and Dissertations

Organization Name: Rutgers, The State University of New Jersey

Rights: The author owns the copyright to this work.
