Yi, Lan. Biomarker discovery for microarray data by enriched methods, stochastic approximation and mixed effect models. Retrieved from https://doi.org/doi:10.7282/T3TH8K17
Microarray technology enables scientists to monitor the expression levels of hundreds of thousands of genes simultaneously. Because such experiments are expensive, sample sizes are small, typically only a few dozen. In this thesis, we propose a new perspective on microarray data: we believe such data generally contain three types of signal, namely specific, non-specific, and spurious. We propose an enriched method for biomarker discovery that strengthens the specific signal (the biomarkers) and weakens the spurious signal. We show that our enriched version of principal component analysis highlights the specific signals in the data and can help separate the different signals. We also show that enriched principal component analysis combined with linear discriminant analysis improves the classification and prediction of microarray data compared with some other popular methods, and that its results are easy to interpret. We also prove that the stochastic approximation procedure used in the conditional t-test converges under some general assumptions. Finally, we discuss the analysis of data from a novel experiment to find groups of genes (biomarkers) by applying hierarchical clustering and nonlinear mixed effect models.
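
For readers unfamiliar with stochastic approximation, the following is a minimal sketch of the classical Robbins-Monro iteration, which seeks a root of a function that can only be observed with noise. It is not the procedure analyzed in the thesis, whose details the abstract does not give. The target function `noisy_g`, the step-size constant `a`, and all other names are illustrative assumptions.

```python
import random


def robbins_monro(noisy_g, theta0, n_steps=5000, a=1.0, seed=0):
    """Classical Robbins-Monro iteration for solving E[noisy_g(theta)] = 0.

    Illustrative only: this is the textbook scheme, not the thesis's procedure.
    """
    rng = random.Random(seed)
    theta = theta0
    for n in range(1, n_steps + 1):
        # Step sizes a/n satisfy sum(a_n) = infinity and sum(a_n^2) < infinity,
        # the usual conditions under which the iteration converges.
        step = a / n
        theta = theta - step * noisy_g(theta, rng)
    return theta


def noisy_g(theta, rng):
    # Hypothetical toy target: g(theta) = theta - 2, observed with Gaussian noise.
    return (theta - 2.0) + rng.gauss(0.0, 1.0)


estimate = robbins_monro(noisy_g, theta0=0.0)
print(f"estimated root: {estimate:.3f}")
```

In this toy setting the true root is 2, and the shrinking step sizes average out the observation noise so that the estimate settles near that value.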