Tunable biclustering algorithm for analyzing large gene expression data sets

Singh, Amartya

doi:10.7282/t3-ys8f-ct92

RUcore: Rutgers University Community Repository

Simple citation

Singh, Amartya. Tunable biclustering algorithm for analyzing large gene expression data sets. Retrieved from https://doi.org/doi:10.7282/t3-ys8f-ct92

Description

Title: Tunable biclustering algorithm for analyzing large gene expression data sets

Name: Singh, Amartya (author); Khiabanian, Hossein (chair); Bhanot, Gyan (internal member); Croft, Mark (internal member); Morozov, Alexandre (internal member); De, Subhajyoti (outside member); Rutgers University; School of Graduate Studies

Date Created: 2019

Other Date: 2019-10 (degree)

Subject: Physics and Astronomy, Biclustering, Gene expression -- Computer programs

Extent: 1 online resource (xiii, 103 pages) : illustrations

Description: Traditional clustering approaches for gene expression data are not well adapted to the complexity and heterogeneity of tumors, where small sets of genes may be aberrantly co-expressed in specific subsets of tumors. Biclustering algorithms, which perform local clustering on subsets of genes and conditions, help address this problem. We propose a graph-based Tunable Biclustering Algorithm (TuBA) (Chapter 2) built on a novel pairwise proximity measure that leverages the size of the data sets to identify subsets of tumor samples that co-express subsets of genes at their highest or lowest levels relative to other samples.
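The idea of a pairwise proximity measure over the samples in which genes are expressed at their highest levels can be illustrated with a small sketch. This is not the thesis's implementation: it assumes, purely for illustration, that the proximity between two genes is the number of samples shared by their top-expression sample sets. The function names `top_samples` and `proximity_graph`, the parameters `top_fraction` and `min_overlap`, and the toy data are all hypothetical.

```python
from itertools import combinations


def top_samples(expression, top_fraction):
    """Return the indices of the samples with the highest expression for one gene."""
    k = max(1, int(len(expression) * top_fraction))
    ranked = sorted(range(len(expression)), key=lambda i: expression[i], reverse=True)
    return set(ranked[:k])


def proximity_graph(data, top_fraction=0.25, min_overlap=2):
    """Link gene pairs whose top-sample sets share at least min_overlap samples.

    data maps a gene name to a list of expression values, one per sample.
    Returns {(gene_a, gene_b): sorted list of shared sample indices}.
    This overlap rule is an assumed stand-in for the proximity measure.
    """
    tops = {gene: top_samples(values, top_fraction) for gene, values in data.items()}
    edges = {}
    for a, b in combinations(sorted(tops), 2):
        shared = tops[a] & tops[b]
        if len(shared) >= min_overlap:
            edges[(a, b)] = sorted(shared)
    return edges


# Toy data (hypothetical): 3 genes measured in 8 samples.
toy = {
    "G1": [9.0, 8.5, 1.0, 1.2, 0.9, 1.1, 1.3, 0.8],
    "G2": [8.8, 9.1, 0.7, 1.0, 1.1, 0.9, 1.2, 1.0],
    "G3": [1.0, 1.1, 0.9, 1.2, 7.9, 8.3, 1.0, 0.8],
}
edges = proximity_graph(toy, top_fraction=0.25, min_overlap=2)
```

In this toy example, G1 and G2 are linked because both peak in samples 0 and 1, while G3 peaks in a different pair of samples and stays unconnected. Sorting in ascending order instead would give the analogous lowest-expression variant, and lowering `top_fraction` or raising `min_overlap` makes the resulting graph sparser.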

We applied TuBA to three large gene expression data sets encompassing a total of 3,940 breast invasive carcinoma (BRCA) patients (Chapter 3). We demonstrated significant agreement between the results obtained for each data set, and discovered that about 50% of the altered co-expression signatures were associated with a subtype of the disease that exhibits low expression of the estrogen hormone receptor 1 (ER) and human epidermal growth factor receptor 2 (HER2) genes. Tumors belonging to this subtype are labelled ER-/HER2-. Since only 15% of all BRCA patients are estimated to have tumors of this subtype, our algorithm highlighted the tremendous heterogeneity in alterations within these tumors. Notably, more than 50% of these signatures were associated with DNA alterations that result in amplification (or deletion) of gene copies, which in turn result in higher (or lower) levels of gene expression. Thus, TuBA was especially effective in identifying transcriptionally active copy number variations in tumor samples. Finally, TuBA identified biclusters associated with the tumor microenvironment, including biclusters associated with infiltrating immune and stromal cells. These can improve our understanding of the role the microenvironment plays in modulating tumor progression.

We showed that TuBA outperforms other algorithms in identifying co-expressed genes located in transcriptionally active copy number altered sites (Chapter 4). Moreover, from a differential co-expression perspective, TuBA offers an advantage over other methods because no prior specification of subsets of samples (conditions) is necessary; the nature of our proximity measure ensures that such differential co-expression signatures are preferentially identified.

In summary, our method identified a multitude of altered transcriptional profiles associated with the tremendous heterogeneity of diseased states in breast cancer. Exploring the diversity of these aberrant signatures can help identify potential biomarkers of clinical relevance that can further improve treatment outcomes, especially for ER-/HER2- breast cancers. Although transcriptomic alterations are not the ultimate determinants of disease progression, our algorithm holds promise for improving therapeutic selection and design by identifying significantly altered transcriptional patterns associated with tumors.

Note: Ph.D.

Note: Includes bibliographical references

Genre: theses, ETD doctoral

Persistent URL: https://doi.org/doi:10.7282/t3-ys8f-ct92

Language: English

Collection: School of Graduate Studies Electronic Theses and Dissertations

Organization Name: Rutgers, The State University of New Jersey

Rights: The author owns the copyright to this work.
