Alawneh, Jafar. The use of microbial genome mining for in silico discovery of novel secondary metabolite gene clusters. Retrieved from https://doi.org/doi:10.7282/t3-sz99-qh48
Description
Secondary metabolites (SMs) are small organic molecules that have various biological functions and are produced by bacteria, fungi, archaea, and plants. Because of their diverse structures, different SMs have been shown to have antibacterial, antifungal, antiviral, and anticancer activities. Nonribosomal peptides (NRPs) and polyketides (PKs) are two diverse classes of SMs that are produced by nonribosomal peptide synthetases (NRPSs) and polyketide synthases (PKSs), respectively. One class of SMs, the epothilones, was discovered in the soil bacterium Sorangium cellulosum (S. cellulosum), and some epothilones have been shown to have antitumor activities similar to those of the taxanes. There is significant interest in expanding the available pool of structurally unique epothilones and other SMs as therapeutic candidates; however, their distribution and structural variation across the microbial genomic landscape are currently poorly understood.
In this study, genome mining was used to find epothilone-similar gene clusters (ESGCs) and other SM gene clusters that potentially encode epothilone-similar compounds and novel SMs. The sequences of the genes (epo A-F) forming the S. cellulosum So ce90 epothilone gene cluster were initially used to find epo A-F homologs (EAFHs) in other bacteria. These homologs were used to find ESGCs, and the newly discovered gene clusters were subsequently used to screen bacterial species and strains for currently unidentified ESGCs and hybrid PKS-NRPS gene clusters that potentially encode novel SMs. The gene clusters identified in this study can be divided into three groups: 1) ESGCs that are highly similar to the epothilone gene cluster and likely produce epothilone variants; 2) gene clusters that are highly similar to clusters known to produce other secondary metabolites; 3) gene clusters that show relatively low similarity to known secondary metabolite gene clusters. Many of these gene clusters are reported for the first time in this study. Further, a number of the EAFHs identified in this study were used for in silico design of ESGCs, which resulted in new gene clusters that could produce novel epothilone-similar compounds with predictable molecular structures. These results suggest that directed manipulation of modular ESGC components is a viable approach to producing a large number of new secondary metabolites for testing against pathogens and cancer cells.
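The abstract does not say how epo A-F homologs were assembled into candidate gene clusters. As a minimal sketch of one plausible approach, the Python below groups homolog hits that lie close together on the same contig and keeps groups containing several distinct epo genes. The `Hit` record, the `group_hits` function, the 10 kb gap, and the four-gene minimum are all illustrative assumptions, not values or methods taken from the study.

```python
from dataclasses import dataclass
from itertools import groupby


@dataclass(frozen=True)
class Hit:
    """One homolog hit (hypothetical record layout)."""
    contig: str   # contig or replicon identifier
    start: int    # hit start coordinate on the contig
    end: int      # hit end coordinate on the contig
    query: str    # which epo gene the hit matches, e.g. "epoA"


def group_hits(hits, max_gap=10_000, min_genes=4):
    """Group nearby hits into candidate clusters.

    max_gap and min_genes are assumed thresholds chosen for illustration.
    """
    clusters = []
    ordered = sorted(hits, key=lambda h: (h.contig, h.start))
    for _, contig_hits in groupby(ordered, key=lambda h: h.contig):
        current = []
        last_end = 0
        for hit in contig_hits:
            # Start a new group when the gap to the previous hit is too large.
            if current and hit.start - last_end > max_gap:
                clusters.append(current)
                current = []
            current.append(hit)
            last_end = hit.end if len(current) == 1 else max(last_end, hit.end)
        if current:
            clusters.append(current)
    # Keep only groups that contain enough distinct epo gene homologs.
    return [c for c in clusters if len({h.query for h in c}) >= min_genes]


# Made-up coordinates, for demonstration only.
example_hits = [
    Hit("c1", 0, 5_000, "epoA"),
    Hit("c1", 6_000, 9_000, "epoB"),
    Hit("c1", 9_500, 20_000, "epoC"),
    Hit("c1", 21_000, 30_000, "epoD"),
    Hit("c1", 500_000, 505_000, "epoA"),  # isolated hit, far from the others
    Hit("c2", 100, 4_000, "epoE"),        # lone hit on another contig
]
candidates = group_hits(example_hits)
for cluster in candidates:
    print(cluster[0].contig, [h.query for h in cluster])
```

In practice the hit list would come from a homology search, and dedicated tools for secondary metabolite cluster detection apply far richer rules than this proximity heuristic.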