Infrared spectrum analysis of organic molecules with neural networks using standard reference data sets in combination with real-world data
Abstract In this study, we propose a neural network-based approach to analyze IR spectra and detect the presence of functional groups. Our neural network architecture is based on the concept of learning split representations. We demonstrate that our method achieves favorable validation performance...
| Main Authors: | Dev Punjabi, Yu-Chieh Huang, Laura Holzhauer, Pierre Tremouilhac, Pascal Friederich, Nicole Jung, Stefan Bräse |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-02-01 |
| Series: | Journal of Cheminformatics |
| Subjects: | Infrared spectra, Machine learning, Data analysis, Open databases |
| Online Access: | https://doi.org/10.1186/s13321-025-00960-2 |
| _version_ | 1850185315235397632 |
|---|---|
| author | Dev Punjabi Yu-Chieh Huang Laura Holzhauer Pierre Tremouilhac Pascal Friederich Nicole Jung Stefan Bräse |
| author_facet | Dev Punjabi Yu-Chieh Huang Laura Holzhauer Pierre Tremouilhac Pascal Friederich Nicole Jung Stefan Bräse |
| author_sort | Dev Punjabi |
| collection | DOAJ |
| description | Abstract In this study, we propose a neural network-based approach to analyze IR spectra and detect the presence of functional groups. Our neural network architecture is based on the concept of learning split representations. We demonstrate that our method achieves favorable validation performance using the NIST dataset. Furthermore, by incorporating additional data from the open-access research data repository Chemotion, we show that our model improves the classification performance for nitriles and amides. Scientific contribution: Our method exclusively uses IR data as input for a neural network, making its performance, unlike other well-performing models, independent of additional data types obtained from analytical measurements. Furthermore, our proposed method leverages a deep learning model that outperforms previous approaches, achieving F1 scores above 0.7 to identify 17 functional groups. By incorporating real-world data from various laboratories, we demonstrate how open-access, specialized research data repositories can serve as yet unexplored, valuable benchmark datasets for future machine learning research. |
| format | Article |
| id | doaj-art-285a00e3c4b140fa878fbd5773ff7fbd |
| institution | OA Journals |
| issn | 1758-2946 |
| language | English |
| publishDate | 2025-02-01 |
| publisher | BMC |
| record_format | Article |
| series | Journal of Cheminformatics |
| spelling | doaj-art-285a00e3c4b140fa878fbd5773ff7fbd; 2025-08-20T02:16:45Z; eng; BMC; Journal of Cheminformatics; 1758-2946; 2025-02-01; 171113; 10.1186/s13321-025-00960-2; Infrared spectrum analysis of organic molecules with neural networks using standard reference data sets in combination with real-world data; Dev Punjabi (0), Yu-Chieh Huang (1), Laura Holzhauer (2), Pierre Tremouilhac (3), Pascal Friederich (4), Nicole Jung (5), Stefan Bräse (6); Institute of Biological and Chemical Systems, Karlsruhe Institute of Technology (KIT); Institute of Biological and Chemical Systems, Karlsruhe Institute of Technology (KIT); Institute of Biological and Chemical Systems, Karlsruhe Institute of Technology (KIT); Institute of Biological and Chemical Systems, Karlsruhe Institute of Technology (KIT); Institute of Theoretical Informatics, Karlsruhe Institute of Technology (KIT); Institute of Biological and Chemical Systems, Karlsruhe Institute of Technology (KIT); Institute of Biological and Chemical Systems, Karlsruhe Institute of Technology (KIT); Abstract In this study, we propose a neural network-based approach to analyze IR spectra and detect the presence of functional groups. Our neural network architecture is based on the concept of learning split representations. We demonstrate that our method achieves favorable validation performance using the NIST dataset. Furthermore, by incorporating additional data from the open-access research data repository Chemotion, we show that our model improves the classification performance for nitriles and amides. Scientific contribution: Our method exclusively uses IR data as input for a neural network, making its performance, unlike other well-performing models, independent of additional data types obtained from analytical measurements. Furthermore, our proposed method leverages a deep learning model that outperforms previous approaches, achieving F1 scores above 0.7 to identify 17 functional groups. By incorporating real-world data from various laboratories, we demonstrate how open-access, specialized research data repositories can serve as yet unexplored, valuable benchmark datasets for future machine learning research.; https://doi.org/10.1186/s13321-025-00960-2; Infrared spectra; Machine learning; Data analysis; Open databases |
| spellingShingle | Dev Punjabi Yu-Chieh Huang Laura Holzhauer Pierre Tremouilhac Pascal Friederich Nicole Jung Stefan Bräse Infrared spectrum analysis of organic molecules with neural networks using standard reference data sets in combination with real-world data Journal of Cheminformatics Infrared spectra Machine learning Data analysis Open databases |
| title | Infrared spectrum analysis of organic molecules with neural networks using standard reference data sets in combination with real-world data |
| title_full | Infrared spectrum analysis of organic molecules with neural networks using standard reference data sets in combination with real-world data |
| title_fullStr | Infrared spectrum analysis of organic molecules with neural networks using standard reference data sets in combination with real-world data |
| title_full_unstemmed | Infrared spectrum analysis of organic molecules with neural networks using standard reference data sets in combination with real-world data |
| title_short | Infrared spectrum analysis of organic molecules with neural networks using standard reference data sets in combination with real-world data |
| title_sort | infrared spectrum analysis of organic molecules with neural networks using standard reference data sets in combination with real world data |
| topic | Infrared spectra Machine learning Data analysis Open databases |
| url | https://doi.org/10.1186/s13321-025-00960-2 |
| work_keys_str_mv | AT devpunjabi infraredspectrumanalysisoforganicmoleculeswithneuralnetworksusingstandardreferencedatasetsincombinationwithrealworlddata AT yuchiehhuang infraredspectrumanalysisoforganicmoleculeswithneuralnetworksusingstandardreferencedatasetsincombinationwithrealworlddata AT lauraholzhauer infraredspectrumanalysisoforganicmoleculeswithneuralnetworksusingstandardreferencedatasetsincombinationwithrealworlddata AT pierretremouilhac infraredspectrumanalysisoforganicmoleculeswithneuralnetworksusingstandardreferencedatasetsincombinationwithrealworlddata AT pascalfriederich infraredspectrumanalysisoforganicmoleculeswithneuralnetworksusingstandardreferencedatasetsincombinationwithrealworlddata AT nicolejung infraredspectrumanalysisoforganicmoleculeswithneuralnetworksusingstandardreferencedatasetsincombinationwithrealworlddata AT stefanbrase infraredspectrumanalysisoforganicmoleculeswithneuralnetworksusingstandardreferencedatasetsincombinationwithrealworlddata |