Tuesday, July 28 • 18:15 - 18:20
VarQuest+: modification-tolerant database search of secondary metabolites mass spectra

Secondary metabolites (SMs) are at the center of attention for a wide range of researchers, from biologists and ecologists to pharmacologists and biomedical scientists [1]. Modern mass spectrometry instruments allow rapid and low-cost scanning of thousands of metabolites, which results in huge amounts of high-resolution data. Although these data represent a gold mine for future discoveries, their interpretation remains a bottleneck and requires appropriate computational methods [2]. Current software is either limited to specific classes of SMs, for example peptidic natural products (VarQuest [3]), or performs only a standard database search, which identifies known SMs but fails to discover their novel variants (Dereplicator+ [4]).

Here we present VarQuest+, a database search tool capable of identifying novel variants of a wide range of known SMs, including polyketides, alkaloids, flavonoids, saponins, and many others. Algorithmic and software innovations make VarQuest+ much more efficient in running time and memory consumption than existing tools. This efficiency made it feasible to implement a modification-tolerant search mode in VarQuest+, which is a more challenging task than a regular database search.
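
To illustrate the difference between a standard and a modification-tolerant search, here is a minimal Python sketch of candidate selection by precursor mass only. It is not the VarQuest+ algorithm, which the abstract does not describe. The `Compound` class, the function names, the tolerance `tol`, the shift limit `max_shift`, and the toy database entries are all hypothetical, and the precursor mass is assumed to be already converted to a neutral mass. A real tool would also have to score fragment peaks against each candidate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Compound:
    name: str
    mass: float  # neutral monoisotopic mass in Da (toy values below)


def standard_candidates(precursor_mass, database, tol=0.02):
    """Standard search: keep compounds whose mass matches within tol Da."""
    return [c for c in database if abs(c.mass - precursor_mass) <= tol]


def variant_candidates(precursor_mass, database, max_shift=300.0, tol=0.02):
    """Modification-tolerant search: keep compounds whose mass differs from
    the precursor by a nonzero shift of at most max_shift Da, and report
    that shift as the putative modification mass."""
    hits = []
    for c in database:
        shift = precursor_mass - c.mass
        if tol < abs(shift) <= max_shift:
            hits.append((c, shift))
    return hits


# Hypothetical database; names and masses are made up for illustration.
database = [
    Compound("compound_A", 300.100),
    Compound("compound_B", 450.200),
    Compound("compound_C", 900.500),
]

# A spectrum whose precursor is 15.995 Da heavier than compound_A.
precursor = 316.095
exact = standard_candidates(precursor, database)
variants = variant_candidates(precursor, database)
```

In this toy case the standard search returns no candidates, while the modification-tolerant search proposes `compound_A` and `compound_B` as possible parents. The candidate set grows quickly with `max_shift`, which is one reason a modification-tolerant search is costlier than a regular one.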

We benchmarked VarQuest+ on a dataset of Korean medicinal plants (2.5 million mass spectra collected from 337 samples). The standard search of the KNApSAcK database (51,179 plant SMs [5]) identified 349 compounds. The VarQuest+ modification-tolerant search identified 4,253 SMs, an order of magnitude more than Dereplicator+. With the same search parameters, VarQuest+ runs twenty times faster than Dereplicator+ and is four times more memory efficient.

The reported study was funded by RFBR, project number 20-04-01096.

[1] Cragg, G. M., & Newman, D. J. (2013) Natural products: a continuing source of novel drug leads. Biochimica et Biophysica Acta (BBA)-General Subjects, 1830(6), 3670-3695.
[2] Wang, M. et al. (2016) Sharing and community curation of mass spectrometry data with Global Natural Products Social molecular networking. Nat. Biotechnol., 34, 828.
[3] Gurevich, A. et al. (2018) Increased diversity of peptidic natural products revealed by modification-tolerant database search of mass spectra. Nat. Microbiol., 3, 319.
[4] Mohimani, H. et al. (2018) Dereplication of microbial metabolites through database search of mass spectra. Nat. Commun., 9, 4035.
[5] Afendi, F. M. et al. (2012) KNApSAcK Family Databases: Integrated Metabolite–Plant Species Databases for Multifaceted Plant Research. Plant and Cell Physiology, 53(2), e1.

Alexey Gurevich

Senior Research Scientist, Center for Algorithmic Biotechnology, St. Petersburg State University, St. Petersburg, Russia
I lead the Natural Product Discovery research direction at CAB (http://cab.spbu.ru/research/antibiotics-discovery/). Together with the Center for Computational Mass Spectrometry at UCSD and the Mohimani Lab at Carnegie Mellon University, we are creating software for identification of...

Tuesday July 28, 2020 18:15 - 18:20 MSK
Zoom Conference https://zoom.us/j/94321101353?pwd=QlJBb09uM0NVVnVyK0FkbTJ3Nkcrdz09