LanguageTerm (authority = ISO 639-3:2007); (type = text)
English
Abstract (type = abstract)
This dissertation presents work on three research projects in applied and computational statistics with a focus on clinical studies and medical data, including clinical trials and hospital databases. At the heart of these projects are a compression-based approach to data reduction in the supervised learning setting, a recursive-partitioning tree-based approach to exploratory subgroup discovery, and a comparative analysis of existing and proposed ranking-based methods for analyzing composite endpoints in right-censored time-to-event data. In the first project, we extend a method for reduction of big data in the unsupervised learning setting to the supervised learning setting for purposes of regression, model parameter estimation, and prediction. We use the example of linear regression to demonstrate the concept, examine theoretical asymptotic guarantees, and display finite-sample performance by example and simulation. We show that our method provides a substantial improvement over random sampling in the big data setting, even when the reduced dataset is small relative to the original dataset. In the second project, we extend a tree-based method for subgroup discovery to encompass new situations, including new endpoint classes and group comparisons, and provide an improved method for quantifying statistical significance. Our work includes an extension of the original method to time-to-event data, which are extremely common in clinical trials when evaluating treatment effectiveness. We demonstrate via simulation the performance of resampling methodology for determining the statistical significance of subgroup findings. In the third project, we present alterations to existing ordinal ranking methodology for evaluating treatment effectiveness in the composite survival endpoint setting. We present a simulation plan for comparing the performance of existing ranking-based composite outcome analysis methods and the proposed alterations, using realistic data generation models that account for competing risks.
Subject (authority = RUETD)
Topic
Statistics
Subject (authority = LCSH)
Topic
Medical statistics
RelatedItem (type = host)
TitleInfo
Title
Rutgers University Electronic Theses and Dissertations
I hereby grant to the Rutgers University Libraries and to my school the non-exclusive right to archive, reproduce and distribute my thesis or dissertation, in whole or in part, and/or my abstract, in whole or in part, in and from an electronic format, subject to the release date subsequently stipulated in this submittal form and approved by my school. I represent and stipulate that the thesis or dissertation and its abstract are my original work, that they do not infringe or violate any rights of others, and that I make these grants as the sole owner of the rights to my thesis or dissertation and its abstract. I represent that I have obtained written permissions, when necessary, from the owner(s) of each third party copyrighted matter to be included in my thesis or dissertation and will supply copies of such upon request by my school. I acknowledge that RU ETD and my school will not distribute my thesis or dissertation or its abstract if, in their reasonable judgment, they believe all such rights have not been secured. I acknowledge that I retain ownership rights to the copyright of my work. I also retain the right to use all or part of this thesis or dissertation in future works, such as articles or books.