Abstract
Background: Our understanding of acute myeloid leukemia (AML) has been transformed in recent years, yet the high mortality rate among elderly patients with AML remains a major challenge. AML outcome usually depends on patient age, predisposing genomic variations (i.e., chromosomal abnormalities, mutations, gene expression profile, epigenomic patterns, and possibly aberrant mRNA splicing), infectious complications, severe bleeding, and complications after bone marrow/stem cell transplant. Available AML risk assessment systems rely mainly on well-established prognostic indicators in the form of chromosomal aberrations and a few driver mutations, often in patients with normal cytogenetics. Although these systems perform acceptably in separating favorable and adverse risk groups, they are limited in defining patients with intermediate risk status. Pre-mRNA splicing regulation is a tissue-dependent process that plays an important role in hematopoiesis, including proliferation and differentiation. Several studies of a select number of genes have reported expression of splice variants with clinical implications in AML. Yet a systematic approach to investigating the clinical relevance of alternatively spliced (AS) variants, and their capacity to predict disease outcome in AML, is lacking.
Objectives: (i) To identify genes with AS variants (signature events) capable of predicting disease outcome in adult de novo AML (defined as AML in patients without a history of antecedent hematologic disorder or treatment with cytotoxic agents) and of outperforming a standard model built on well-established AML prognostic risk factors; (ii) to evaluate the capacity of signature events to serve as prognostic indicators in adult AML; and (iii) to identify common cis-regulatory modules in genes with signature events.
Methods: We employed available bioinformatics, machine learning, and statistical techniques to build two models: (i) a standard Cox proportional hazards (PH) model (referred to as S-Cox) fit to the well-established AML prognostic risk factors, i.e., age, cytogenetic and molecular risk status, and total peripheral blood white blood cell (WBC) count at diagnosis; and (ii) a Cox PH model with the grouped lasso penalty (referred to as GL-Cox) built on age, cytogenetic and molecular risk status, WBC count, and the Percent Spliced In (PSI) values of alternative exons. Overall survival was the clinical endpoint, with death as the event. We validated the performance of these models by calculating the area under the time-dependent Receiver Operating Characteristic curve (AUROCt).
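For illustration only, two quantities that enter the GL-Cox model can be sketched in Python. The sketch assumes that PSI is the fraction of reads supporting exon inclusion among all reads covering the splicing event, and that the grouped lasso penalty takes the common form lambda * sum over groups of sqrt(group size) * Euclidean norm of the group's coefficients. The abstract does not state the read-counting scheme, the grouping of covariates, or the software used, so the function names and arguments below are hypothetical.

```python
import math


def percent_spliced_in(inclusion_reads, exclusion_reads):
    """Return PSI in [0, 1], or None when no reads cover the event.

    Assumed definition: inclusion / (inclusion + exclusion).
    """
    if inclusion_reads < 0 or exclusion_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None  # PSI is undefined without coverage
    return inclusion_reads / total


def group_lasso_penalty(coef_groups, lam):
    """Grouped lasso penalty added to the negative Cox partial log-likelihood.

    coef_groups: list of coefficient lists, one list per covariate group
                 (the grouping itself is an assumption of this sketch).
    lam: non-negative tuning parameter.
    """
    penalty = 0.0
    for group in coef_groups:
        l2_norm = math.sqrt(sum(b * b for b in group))
        penalty += math.sqrt(len(group)) * l2_norm
    return lam * penalty
```

Because the penalty uses an unsquared norm per group, a whole group of coefficients can shrink to exactly zero, which is how a model of this kind can select a small set of covariates from many candidate exons.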
Results: We developed our models on a training set (TS) of 54 adult de novo AML cases from the Cancer Genome Atlas (TCGA) study. Two non-overlapping validation sets (VSs) from the TCGA cohort (n=25 and n=44) were used to evaluate model performance. Patients in the TS and VS-1 received similar initial therapy. Patients in VS-2 had a history of prior treatment with Hydrea (to reduce WBC counts) and received different types of therapy. The GL-Cox model identified 19 signature events that improved predictive power relative to the S-Cox model. The time-dependent ROC curve at 1 to 5 years of survival for the GL-Cox model dominated that of the S-Cox model in both VSs. These signature events belonged to genes including CLK4 (exon 5), a splicing regulator; MCPH1 (exons 5:6), a tumor suppressor gene; RFWD2 (exons 10:8), which encodes a ubiquitin-protein ligase that targets various proteins, including TP53 and JUN, for degradation; and ABCB7 (exons 5:4), which is involved in iron homeostasis and heme transport, among others. Furthermore, of the 19 genes with a signature event, 12 had at least one CTCF-binding module, a regulatory element involved in alternative exon inclusion by pausing RNA polymerase II.
Conclusion: This study demonstrated, for the first time, the capacity of mis-spliced transcripts to predict disease outcome in adult AML and their potential to serve as prognostic indicators. The presence of alternatively spliced CLK4 among the signature events suggested a possible role for this trans-acting splicing factor in the global regulation of AS in adult AML. In addition, the presence of a CTCF-binding module in more than half of the genes with signature events, and very close to the NCDN and RFWD2 target exons, indicated a possible role for this regulatory element in mediating exon inclusion. Despite these promising results, our study had several limitations, including a small sample size and limited access to clinical data (e.g., time of transplant and cause of death). We also did not evaluate our model on an independent patient cohort. Therefore, further investigation is essential to draw more reliable conclusions.