Disease endotypes of type 1 diabetes: exploration through machine learning and topological data analysis
Simple citation
Young, Kenneth G. Disease endotypes of type 1 diabetes: exploration through machine learning and topological data analysis. Retrieved from https://doi.org/doi:10.7282/t3-p7vb-fj32
Description
Title: Disease endotypes of type 1 diabetes: exploration through machine learning and topological data analysis
Date Created: 2019
Other Date: 2019-05 (degree)
Extent: 1 online resource (xviii, 210 pages)
Description:
BACKGROUND: Type 1 diabetes (T1D) is a complex autoimmune disease that destroys β-cells and is characterized by a combination of genotypic and phenotypic etiologies. Given these etiological differences, the high-dimensional multi-omics data, and the stochastic components of T1D, a data-driven approach based on unsupervised machine learning and topology may identify new T1D endotypes that might go undetected with classical statistical approaches. Unsupervised machine learning and topology-based approaches have found subtypes of various other diseases and disorders. Clustering techniques play a pivotal role in many areas of data analysis: they can provide important clues to the structure of data sets, suggesting results and hypotheses about the underlying pathogenesis.
METHOD: This work builds upon published TEDDY results. TEDDY has shown that several single nucleotide polymorphisms (SNPs) can distinguish IAA-only from GADA-only as the first-appearing islet autoantibody (IA), and that early exposures (infectious episodes) influence both in different ways, depending on genetic factors. This strongly suggests that there are at least two different endotypes. What remains unknown are the specific biological pathways that would explain the observed diversity in IA phenotypes, in both first appearance and progression. Unsupervised clustering and topology-based analyses of the T1D cases may distinguish phenotypes and help generate hypotheses regarding the biological pathways. Hierarchical cluster analysis was the primary method. We first clustered the data with eight agglomerative hierarchical clustering methods: ward.D, ward.D2, average, complete, single, mcquitty, median, and centroid. We additionally performed k-means clustering, model-based clustering, and topological data analysis.
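The agglomerative step described in the METHOD paragraph can be illustrated with a minimal sketch. The abstract does not state which software or distance measure was used, although the eight method names match the options of R's `hclust` function. The sketch below is therefore not the thesis's implementation: it is a pure-Python illustration that assumes Euclidean distance, covers only four of the eight linkage rules (single, complete, average, mcquitty) through the Lance-Williams update formula, and runs on made-up toy points. The names `agglomerate` and `_coeffs` are hypothetical.

```python
import math
from itertools import combinations


def _coeffs(method, ni, nj):
    """Lance-Williams coefficients (alpha_i, alpha_j, gamma); beta is 0 here."""
    if method == "single":      # distance to the nearest member
        return 0.5, 0.5, -0.5
    if method == "complete":    # distance to the farthest member
        return 0.5, 0.5, 0.5
    if method == "average":     # size-weighted mean distance (UPGMA)
        return ni / (ni + nj), nj / (ni + nj), 0.0
    if method == "mcquitty":    # unweighted mean of the two distances (WPGMA)
        return 0.5, 0.5, 0.0
    raise ValueError(f"unsupported method: {method}")


def agglomerate(points, k, method="average"):
    """Merge clusters bottom-up until k remain; return one label per point."""
    members = {i: [i] for i in range(len(points))}
    # Pairwise Euclidean distances between the initial singleton clusters.
    dist = {frozenset((i, j)): math.dist(points[i], points[j])
            for i, j in combinations(range(len(points)), 2)}
    next_id = len(points)
    while len(members) > k:
        # Pick the closest pair of active clusters.
        i, j = min(combinations(sorted(members), 2),
                   key=lambda pair: dist[frozenset(pair)])
        ai, aj, gamma = _coeffs(method, len(members[i]), len(members[j]))
        # Distance from the merged cluster to every other active cluster.
        for m in members:
            if m in (i, j):
                continue
            dim = dist[frozenset((i, m))]
            djm = dist[frozenset((j, m))]
            dist[frozenset((next_id, m))] = (
                ai * dim + aj * djm + gamma * abs(dim - djm))
        members[next_id] = members.pop(i) + members.pop(j)
        next_id += 1
    labels = [0] * len(points)
    for label, cluster in enumerate(members.values()):
        for idx in cluster:
            labels[idx] = label
    return labels


# Toy data (invented for illustration): three well-separated groups.
toy_points = [(0, 0), (0, 1), (1, 0),
              (10, 10), (10, 11), (11, 10),
              (20, 0), (20, 1), (21, 0)]

for name in ("single", "complete", "average", "mcquitty"):
    print(name, agglomerate(toy_points, k=3, method=name))
```

On clean toy data like this, all four linkage rules recover the same three groups. On real, noisy data the rules can disagree, which is one reason to run several of them and compare the resulting partitions.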
RESULTS: This exploratory study investigated the classification of T1D patient populations into distinct endotypes using unsupervised machine learning (hierarchical clustering), with external validation through k-means clustering, model-based clustering, and topological data analysis (TDA). The research analyzed data from a case-control cohort of genetically at-risk participants from the TEDDY study to explore the possibility of T1D endotypes. These participants, enrolled from birth, carry HLA-susceptibility genotypes for development of islet autoantibodies (IA) and T1D. This study presents a novel exploratory approach to classifying disease endotypes. Based on hierarchical clustering and exploration, the results suggest that classification of T1D patient populations is plausible. The study identified three distinct clusters of patients diagnosed with T1D among these genetically at-risk individuals.
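One common way to check a hierarchical partition against a second method such as k-means is to score the agreement between the two label vectors. The abstract does not say which agreement measure, if any, was used, so the adjusted Rand index below is an assumption chosen for illustration. The function name `adjusted_rand_index` and the example label vectors are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label vectors must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: the partitions trivially agree
    # Pairs placed together in both partitions (contingency-table cells).
    together_both = sum(comb(c, 2)
                        for c in Counter(zip(labels_a, labels_b)).values())
    # Pairs placed together in each partition on its own.
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions are one cluster
    return (together_both - expected) / (maximum - expected)


# Hypothetical label vectors standing in for two clustering methods.
hierarchical_labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
kmeans_labels = [2, 2, 2, 0, 0, 0, 1, 1, 0]
print(adjusted_rand_index(hierarchical_labels, kmeans_labels))
```

The index does not depend on how clusters are numbered, so two methods that find the same groups under different labels still score 1. Values near 0 indicate agreement no better than chance.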
CONCLUSION: While this exploratory study has limitations, its novel methodological approach to identifying possible endotypes through clustering can be used to advance the understanding of T1D. Although the study found three distinct clusters, the results do not confirm different etiologies of diabetes, nor do they show that this clustering methodology is the optimal classification of diabetes endotypes. With additional data and a larger population, it might be possible to improve the classification further by adding more cluster variables.
Note: Ph.D.
Note: Includes bibliographical references
Genre: theses, ETD doctoral
Language: English
Collection: School of Health Professions ETD Collection
Organization Name: Rutgers, The State University of New Jersey
Rights: The author owns the copyright to this work.