TY - JOUR
TI - Classification and multiple testing for microarray data
DO - https://doi.org/doi:10.7282/T3V987V3
PY - 2010
AB - This thesis aims to provide solutions to classification and hypothesis testing problems and to create a tool that performs clustering, hypothesis testing, or classification tasks automatically via a simple menu-driven interface. Since their first appearance in 1995, microarrays have become a technique used worldwide for large-scale gene expression screening. The quantity of data generated by microarray experiments is enormous, requiring new, careful methods for analyzing these high-dimensional data. One of the problems encountered with this type of data is overfitting, which happens when the selected information is related to the condition of interest only by chance. This thesis consists of four major parts. The first part gives an overview of microarray methodology and of the techniques currently applied to analyze gene expression data. The second part uses an idea based on partial least squares to develop an algorithm that controls the false discovery rate (FDR) when extracting differentially expressed genes in the analysis of gene expression data. This procedure can be used either on its own or as part of a scheme in which it provides weights that are combined with another selection method or used as part of an ensemble. The third part deals with the problem of comparing several treatments to a control. For the setting in which one wants to find a ‘bump’ in the measurements of several groups, a test statistic based on the maximum and minimum of the group mean differences is considered. The derived distribution of the proposed test statistic can then be used to make inferences. The fourth part describes the software developed to provide a menu-driven computing environment for data manipulation and analysis. It includes methods for comparing the expression profiles of genes, as well as methods for gene clustering, visualization, and exploration.
KW - Statistics and Biostatistics
KW - DNA microarrays--Testing
KW - Statistical hypothesis testing
KW - Discriminant analysis
LA - eng
ER -