Finding structural motifs in Natural Products

Natural Products (NPs) have an unparalleled track record in pharmacology: most anticancer and antimicrobial agents are natural products or their derivatives. NPs are usually grouped into classes by biosynthetic origin, such as Peptidic NPs (PNPs), Polyketides, and Alkaloids. The amide bond is a well-known structural motif of PNPs. To better understand the other classes of NPs, characteristic motifs should be identified for them as well. Such motifs can then be applied to automatic database annotation and to related problems such as PNP dereplication.

We have found structural motifs for three of the most interesting classes of NPs: Non-Ribosomal Peptides (NRPs), Ribosomally Synthesized and Post-translationally Modified Peptides (RiPPs), and Polyketides. Each motif is well represented in one class and rarely occurs in the others. In addition, our research showed that nitrogen atoms are less common in Polyketides than in the other two classes. We have developed a machine learning tool that separates the classes of Natural Products based on structural motifs and other features. The tool was validated on a mix of NRP, RiPP, and Polyketide structures and achieved an F1 score above 93% for each of the three classes.
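For readers unfamiliar with the metric, the per-class F1 score reported above can be computed as in the following minimal sketch. It is an illustration only, not the project's evaluation code: the function name `per_class_f1` is hypothetical, and the toy labels are invented and bear no relation to the actual validation set.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {class_label: F1}, treating each label one-vs-rest.

    F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


# Toy labels, invented for illustration only (not the project's data).
y_true = ["NRP", "NRP", "RiPP", "RiPP", "Polyketide", "Polyketide"]
y_pred = ["NRP", "RiPP", "RiPP", "RiPP", "Polyketide", "Polyketide"]
scores = per_class_f1(y_true, y_pred)
for label, f1 in scores.items():
    print(f"{label}: {f1:.3f}")
```

Reporting F1 separately for each class, rather than a single overall accuracy, shows whether the classifier performs well on all three classes and not only on the most populous one.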

   Denis Konoplev
   Alexey Gurevich
Project duration: Feb 2016 to May 2016