TY - JOUR
TI - Methodology for analyzing preclinical experiments
DO - https://doi.org/doi:10.7282/t3-d6k8-2t13
PY - 2019
AB - This dissertation examines three distinct methodologies for analyzing data from preclinical experiments:
1) Big data has created new challenges for data analysis, particularly in forming meaningful groups or clusters of data and in classification. Clustering techniques such as K-means or hierarchical clustering, based on pairwise distances of N objects, are popular methods for exploratory analysis of large datasets. Unfortunately, these methods cannot always be applied to big data because calculations of order N^2 impose memory or time constraints. A workaround is to take a random sample of the large dataset and cluster the reduced dataset; however, this is not a foolproof solution, since the structure of the dataset, particularly at its edges, is not guaranteed to be preserved. This chapter proposes a new solution through the concept of “data nuggets”. Data nuggets reduce a large dataset to a small collection of nuggets, each containing a center, a weight, and a scale parameter. Once the data is re-expressed as data nuggets, we may apply algorithms for standard statistical methods such as principal components analysis (PCA), clustering, and classification. We apply the data nuggets methodology to a flow cytometry dataset from biopharmaceutical research, performing weighted PCA and weighted K-means clustering on a dataset containing millions of observations (B-cells). The objective was to find clusters that characterize cells according to which proteins are active on their surfaces. An R package was also developed to implement these methods.
2) In preclinical drug discovery, experiments are often repeated without precisely replicating the treatment arms. Further, full datasets are not always immediately accessible, leaving analysts to rely on summary measures such as the sample mean and standard error. Meta-analysis is a useful tool when only two treatment arms are of interest; however, it can compare only two of the potentially numerous treatment arms at a time, and information from experiments lacking those two arms is not used. Mixed treatment comparisons meta-analysis, also known as network meta-analysis, can be used instead to compare all available treatment arms at once. This chapter explains, explores, and compares two existing frequentist methods for network meta-analysis. We focus on sets of experiments with designs typically found in preclinical research. We also use simulations to compare network meta-analysis results with those given by mixed-effects linear models for these types of experiments. An R package was also developed to perform both methods of network meta-analysis.
3) Power calculations for hypothesis tests play a critical role in conducting both clinical and non-clinical trials. Many programs exist to calculate the power of popular hypothesis tests, such as Student's t-test for continuous data or the log-rank test for survival data. Calculating the power of hypothesis tests for ordinal categorical data can be much more complicated. In such data, observations are given as scores on a scale with a small range, typically between three and five points. The data is assumed to follow a multinomial distribution, which can depend on many parameters. We propose a simple yet effective method for defining alternative multinomial distributions and performing power calculations by creating and shifting quantiles of the standard normal distribution. We offer simulation results and apply the method to a dataset. An R package was also developed to implement this method.
KW - Statistics and Biostatistics
KW - Experiments -- Data processing
LA - English
ER -