Reddy, Anupama Rajasekhara. Combinatorial pattern-based survival analysis with applications in biology and medicine. Retrieved from https://doi.org/doi:10.7282/T3PG1RXQ
Description: In the current era of targeted therapies and personalized medicine, survival analysis (predicting the survival time of patients) is an important problem. Survival analysis is similar to regression except for the presence of censored observations (observations with incomplete survival time information). We propose to use a combinatorial pattern-based methodology, Logical Analysis of Data (LAD), for survival analysis. LAD is a two-class classification method; in this thesis we extend it to survival analysis in several ways. Our first approach is to define high- and low-risk patients and thereby reduce the problem to two-class classification. This approach is particularly useful for datasets with a large number of samples and a small number of features. For datasets where the feature space is high-dimensional (for example, gene expression data), we first use an unsupervised clustering approach to identify robust clusters in the data, the hypothesis being that different clusters are associated with different survival profiles. We then present a linear programming model to predict survival. Finally, we develop a new method, Logical Analysis of Survival Data (LASD), and validate it on a kidney cancer dataset. Ensemble methods are presented to improve the robustness of LASD.
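
One way to picture the reduction of survival data to two-class classification is the minimal Python sketch below. It is an illustration under stated assumptions, not the procedure used in the thesis: the function name `risk_label`, the 60-month cutoff, and the rule for handling censored patients are all hypothetical choices made here for the example.

```python
from typing import List, Optional, Tuple


def risk_label(time: float, event_observed: bool, cutoff: float) -> Optional[str]:
    """Assign a hypothetical risk class from one (possibly censored) observation.

    Assumed rule (not taken from the thesis):
      - followed up to or beyond the cutoff  -> "low" risk
      - event observed before the cutoff     -> "high" risk
      - censored before the cutoff           -> None (class unknown)
    """
    if time >= cutoff:
        # The patient was still alive at the cutoff, whatever happened later.
        return "low"
    if event_observed:
        # The event occurred before the cutoff.
        return "high"
    # Censored before the cutoff: we cannot tell which class applies.
    return None


# Toy data, invented for illustration: (follow-up time in months, event observed?)
patients: List[Tuple[float, bool]] = [
    (12.0, True),
    (70.0, False),
    (30.0, False),
    (65.0, True),
    (5.0, True),
]
cutoff = 60.0  # hypothetical threshold separating the two classes

labels = [risk_label(t, e, cutoff) for t, e in patients]

# Keep only patients whose class is known; these could then be passed to
# any two-class classifier.
labeled = [(p, lab) for p, lab in zip(patients, labels) if lab is not None]
```

Under this rule, a patient censored before the cutoff carries no usable class label and is left out of the two-class problem, which shows why censoring makes survival analysis different from ordinary regression or classification.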