Protecting key crops and their yields is a significant challenge in light of climate change. Plants' health and their ability to defend against pathogens and herbivores are augmented by symbiosis with microorganisms. These microorganisms exert their effects by producing small-molecule metabolites that participate in nutrient exchange and affect signaling pathways in the plant. While liquid chromatography-tandem mass spectrometry (LC-MS/MS) has made great strides in identifying small molecules in biological samples, a large fraction of experimental readouts remains unresolved. Accurate identification of bioactive molecules and their microbial producers would greatly benefit the optimization of local crop production.
Machine learning models have enhanced our ability to predict molecular identities from mass spectra, but they depend on high-resolution information that is available for only a small subset of masses. Here, we propose a machine learning system for untargeted metabolomics that combines advanced feature extraction methods with probabilistic models to predict molecular fingerprints. To this end, the physical laws governing LC-MS/MS can be used to infer dependencies between detected masses, resolve redundancies, and provide informative descriptions to underpin machine learning. Predictions will be accompanied by uncertainty estimates, which improve reliability and enable active learning: the targeted improvement of model accuracy, as well as the search for compound classes of interest, such as natural products. The proposed models can thus guide the sourcing of additional commercial standards or targeted data acquisition.
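As an illustration of how physical constraints can expose dependencies between detected masses, the following minimal sketch links co-eluting features whose m/z difference matches a well-known mass shift. It is not the proposed system: the function name `link_features`, the feature representation as (m/z, retention time) pairs, and the tolerance defaults are hypothetical and chosen only for this example. Only two shifts for singly charged positive ions are considered.

```python
from itertools import combinations

# Well-known mass differences (Da) for singly charged positive ions.
C13_C12_DELTA = 1.003355   # spacing between 13C and 12C isotope peaks
NA_H_DELTA = 21.981942     # [M+Na]+ minus [M+H]+

KNOWN_DELTAS = {
    "13C isotope": C13_C12_DELTA,
    "Na/H adduct pair": NA_H_DELTA,
}


def link_features(features, mz_tol=0.005, rt_tol=0.1):
    """Link co-eluting features whose m/z difference matches a known shift.

    features: list of (mz, rt) tuples (hypothetical representation).
    mz_tol, rt_tol: illustrative tolerances in Da and minutes (assumptions).
    Returns a list of (i, j, label) tuples with i < j.
    """
    links = []
    for (i, (mz_i, rt_i)), (j, (mz_j, rt_j)) in combinations(enumerate(features), 2):
        # Features from the same compound should elute together.
        if abs(rt_i - rt_j) > rt_tol:
            continue
        diff = abs(mz_i - mz_j)
        for label, delta in KNOWN_DELTAS.items():
            if abs(diff - delta) <= mz_tol:
                links.append((i, j, label))
    return links
```

Linked features can then be collapsed into a single group per putative compound, which removes redundant signals before any fingerprint prediction is attempted.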
While metabolites are key intermediates in defense-related signaling events, the respective contributions of host and symbiont often remain unresolved. Metabolomics of microorganism monocultures can provide a reliable estimate of metabolic potential that is complementary to common transcriptomics- and genomics-based approaches. The predicted molecular fingerprints will provide a common representation of metabolomes from monocultures of plant-associated microorganisms, including growth-promoting bacteria and fungi. Curation of existing samples from more than 75 soil and plant microorganisms will result in a metabolome atlas that can reveal efficient producers of known bioactive molecules or novel analogs acting on known signaling pathways. This plant-microorganism metabolome atlas will also provide a starting point for the acquisition and comparison of new metabolomics samples to be collected in our facility, and will facilitate data sharing and exploratory analysis.
Our past work has identified molecules that enable chemical induction of defense priming phenotypes in the model organism Arabidopsis thaliana, molecules from growth-promoting fungi of tomato (Solanum lycopersicum), and microorganisms that promote the growth of potato (Solanum tuberosum). In all of these cases, the metabolomic signatures of defense priming remain to be fully characterized, which the proposed models will enable. Prioritizing the most prominent bioactive molecules in defense signatures can reveal novel biostimulants and, in combination with the plant-microorganism metabolome atlas, their potential efficient producers. Together, we aim to provide a streamlined pipeline for handling molecular data on plant growth-promoting microorganisms.