Discovering organic reactions with a machine-learning-powered deciphering of tera-scale mass spectrometry data
Abstract The accumulation of large datasets by the scientific community has surpassed the capacity of traditional processing methods, underscoring the critical need for innovative and efficient algorithms capable of navigating through extensive existing experimental data. Addressing this challenge,...
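| Main Authors: | Konstantin S. Kozlov, Daniil A. Boiko, Julia V. Burykina, Valentina V. Ilyushenkova, Alexander Y. Kostyukovich, Ekaterina D. Patil, Valentine P. Ananikov |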
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2025-03-01 |
| Series: | Nature Communications |
| Online Access: | https://doi.org/10.1038/s41467-025-56905-8 |
| _version_ | 1850039932729425920 |
|---|---|
| author | Konstantin S. Kozlov; Daniil A. Boiko; Julia V. Burykina; Valentina V. Ilyushenkova; Alexander Y. Kostyukovich; Ekaterina D. Patil; Valentine P. Ananikov |
| collection | DOAJ |
| description | Abstract The accumulation of large datasets by the scientific community has surpassed the capacity of traditional processing methods, underscoring the critical need for innovative and efficient algorithms capable of navigating through extensive existing experimental data. Addressing this challenge, our study introduces a machine learning (ML)-powered search engine specifically tailored for analyzing tera-scale high-resolution mass spectrometry (HRMS) data. This engine harnesses a novel isotope-distribution-centric search algorithm augmented by two synergistic ML models, assisting with the discovery of hitherto unknown chemical reactions. This methodology enables the rigorous investigation of existing data, thus providing efficient support for chemical hypotheses while reducing the need for conducting additional experiments. Moreover, we extend this approach with baseline methods for automated reaction hypothesis generation. In its practical validation, our approach successfully identified several reactions, unveiling previously undescribed transformations. Among these, the heterocycle-vinyl coupling process within the Mizoroki-Heck reaction stands out, highlighting the capability of the engine to elucidate complex chemical phenomena. |
| format | Article |
| id | doaj-art-ac78c308794d4c5a8d91b7a02a11bc79 |
| institution | DOAJ |
| issn | 2041-1723 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | Nature Portfolio |
| record_format | Article |
| series | Nature Communications |
| spelling | doaj-art-ac78c308794d4c5a8d91b7a02a11bc79; 2025-08-20T02:56:12Z; eng; Nature Portfolio; Nature Communications; ISSN 2041-1723; 2025-03-01; vol. 16, no. 1, pp. 1-12; doi:10.1038/s41467-025-56905-8; Discovering organic reactions with a machine-learning-powered deciphering of tera-scale mass spectrometry data; Konstantin S. Kozlov, Daniil A. Boiko, Julia V. Burykina, Valentina V. Ilyushenkova, Alexander Y. Kostyukovich, Ekaterina D. Patil, Valentine P. Ananikov (all: Zelinsky Institute of Organic Chemistry, Russian Academy of Sciences); https://doi.org/10.1038/s41467-025-56905-8 |
| title | Discovering organic reactions with a machine-learning-powered deciphering of tera-scale mass spectrometry data |
| url | https://doi.org/10.1038/s41467-025-56905-8 |