Characterization and feature selection of volatile metabolites in Yangxian pigmented rice varieties through GC-MS and machine learning algorithms

Introduction: Pigmented rice appeals to consumers for its abundant phytochemicals and unique aroma. Methods: In this study, GC–MS-based metabolomics of Yangxian colored rice varieties was performed to characterize their volatile metabolites through multivariate statistics and machine learning algo...

Bibliographic Details
Main Authors: Kaiqi Cheng, Ruonan Dong, Fei Pan, Wen Su, Lingjie Xi, Meng Zhang, Jingzhang Geng, Ruichang Gao, Wengang Jin, A. M. Abd El-Aty
Format: Article
Language: English
Published: Frontiers Media S.A. 2025-05-01
Series: Frontiers in Nutrition
Subjects: pigmented rice; metabolites; multivariate statistics; machine learning; volatiles
Online Access: https://www.frontiersin.org/articles/10.3389/fnut.2025.1598875/full
_version_ 1849736494505263104
author Kaiqi Cheng
Ruonan Dong
Fei Pan
Wen Su
Lingjie Xi
Meng Zhang
Jingzhang Geng
Ruichang Gao
Wengang Jin
A. M. Abd El-Aty
author_facet Kaiqi Cheng
Ruonan Dong
Fei Pan
Wen Su
Lingjie Xi
Meng Zhang
Jingzhang Geng
Ruichang Gao
Wengang Jin
A. M. Abd El-Aty
author_sort Kaiqi Cheng
collection DOAJ
description Introduction: Pigmented rice appeals to consumers for its abundant phytochemicals and unique aroma. Methods: In this study, GC–MS-based metabolomics of Yangxian colored rice varieties was performed to characterize their volatile metabolites through multivariate statistics and machine learning algorithms. Results: A total of 357 volatile metabolites were detected and classified into 9 groups, including 96 organooxygen compounds (26.89%), 52 carboxylic acids and derivatives (14.57%), 42 fatty acyls (11.76%), 16 benzene and substituted derivatives (4.48%), and 11 hydroxy acids and derivatives (3.08%). Multivariate statistics screened 127 differentially abundant metabolites via PLS-DA. In the principal component analysis, PC1 and PC2 accounted for 52.48% and 27.09% of the variance, respectively. Based on differential metabolites with strong multicollinearity (above 0.8) and the chi-square test (20% of the features), only 7 metabolites were found to represent the overall metabolite profile across the colored rice varieties. Four machine learning models were then used to classify the colored rice varieties, and the random forest model performed best, with an accuracy of 0.97. Moreover, Shapley additive explanations analysis indicated that the 7 metabolites can be used as potential markers representing the metabolomic profiles. Conclusions: These results suggest that GC–MS-based metabolomics combined with random forest might be effective for extracting key features among different pigmented rice varieties.
format Article
id doaj-art-b3da5fce41d54705a850357bdeefd15e
institution DOAJ
issn 2296-861X
language English
publishDate 2025-05-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Nutrition
spelling doaj-art-b3da5fce41d54705a850357bdeefd15e 2025-08-20T03:07:16Z eng Frontiers Media S.A. Frontiers in Nutrition 2296-861X 2025-05-01 12 10.3389/fnut.2025.1598875 1598875
Characterization and feature selection of volatile metabolites in Yangxian pigmented rice varieties through GC-MS and machine learning algorithms
Kaiqi Cheng (0), Ruonan Dong (1), Fei Pan (2), Wen Su (3), Lingjie Xi (4), Meng Zhang (5), Jingzhang Geng (6), Ruichang Gao (7, 8), Wengang Jin (9), A. M. Abd El-Aty (10, 11)
0, 1, 3-7, 9: Qinba State Key Laboratory of Biological Resources and Ecological Environment, QinLing-Bashan Mountains Bioresources Comprehensive Development 2011 C. I. C, Shaanxi Province Key Laboratory of Bio-Resources, College of Bioscience and Bioengineering, Shaanxi University of Technology, Hanzhong, China
2: Institute of Apicultural Research, Chinese Academy of Agricultural Sciences, Beijing, China
8: College of Food and Biotechnology, Jiangsu University, Zhenjiang, China
10: Department of Pharmacology, Faculty of Veterinary Medicine, Cairo University, Giza, Egypt
11: Department of Medical Pharmacology, Medical Faculty, Ataturk University, Erzurum, Türkiye
https://www.frontiersin.org/articles/10.3389/fnut.2025.1598875/full
pigmented rice; metabolites; multivariate statistics; machine learning; volatiles
spellingShingle Kaiqi Cheng
Ruonan Dong
Fei Pan
Wen Su
Lingjie Xi
Meng Zhang
Jingzhang Geng
Ruichang Gao
Wengang Jin
A. M. Abd El-Aty
Characterization and feature selection of volatile metabolites in Yangxian pigmented rice varieties through GC-MS and machine learning algorithms
Frontiers in Nutrition
pigmented rice
metabolites
multivariate statistics
machine learning
volatiles
title Characterization and feature selection of volatile metabolites in Yangxian pigmented rice varieties through GC-MS and machine learning algorithms
title_full Characterization and feature selection of volatile metabolites in Yangxian pigmented rice varieties through GC-MS and machine learning algorithms
title_fullStr Characterization and feature selection of volatile metabolites in Yangxian pigmented rice varieties through GC-MS and machine learning algorithms
title_full_unstemmed Characterization and feature selection of volatile metabolites in Yangxian pigmented rice varieties through GC-MS and machine learning algorithms
title_short Characterization and feature selection of volatile metabolites in Yangxian pigmented rice varieties through GC-MS and machine learning algorithms
title_sort characterization and feature selection of volatile metabolites in yangxian pigmented rice varieties through gc ms and machine learning algorithms
topic pigmented rice
metabolites
multivariate statistics
machine learning
volatiles
url https://www.frontiersin.org/articles/10.3389/fnut.2025.1598875/full
work_keys_str_mv AT kaiqicheng characterizationandfeatureselectionofvolatilemetabolitesinyangxianpigmentedricevarietiesthroughgcmsandmachinelearningalgorithms
AT ruonandong characterizationandfeatureselectionofvolatilemetabolitesinyangxianpigmentedricevarietiesthroughgcmsandmachinelearningalgorithms
AT feipan characterizationandfeatureselectionofvolatilemetabolitesinyangxianpigmentedricevarietiesthroughgcmsandmachinelearningalgorithms
AT wensu characterizationandfeatureselectionofvolatilemetabolitesinyangxianpigmentedricevarietiesthroughgcmsandmachinelearningalgorithms
AT lingjiexi characterizationandfeatureselectionofvolatilemetabolitesinyangxianpigmentedricevarietiesthroughgcmsandmachinelearningalgorithms
AT mengzhang characterizationandfeatureselectionofvolatilemetabolitesinyangxianpigmentedricevarietiesthroughgcmsandmachinelearningalgorithms
AT jingzhanggeng characterizationandfeatureselectionofvolatilemetabolitesinyangxianpigmentedricevarietiesthroughgcmsandmachinelearningalgorithms
AT ruichanggao characterizationandfeatureselectionofvolatilemetabolitesinyangxianpigmentedricevarietiesthroughgcmsandmachinelearningalgorithms
AT wengangjin characterizationandfeatureselectionofvolatilemetabolitesinyangxianpigmentedricevarietiesthroughgcmsandmachinelearningalgorithms
AT amabdelaty characterizationandfeatureselectionofvolatilemetabolitesinyangxianpigmentedricevarietiesthroughgcmsandmachinelearningalgorithms
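
The description field of this record mentions narrowing the differential metabolites using a multicollinearity threshold of 0.8 and a chi-square test that keeps 20% of the features. The record does not state how this was implemented, so the following is only a minimal pure-Python sketch of one plausible reading: a greedy correlation filter followed by chi-square ranking. The metabolite names (m1 to m5), the toy abundances, the variety labels, the greedy drop rule, and the exact scoring formula are all illustrative assumptions, not the authors' code or data.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def drop_collinear(columns, threshold=0.8):
    """Greedy filter (assumed rule): keep a feature only if its absolute
    correlation with every already-kept feature is at most `threshold`."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def chi_square_score(values, labels):
    """Chi-square statistic comparing per-class sums of a non-negative feature
    with the sums expected if the feature were independent of the class."""
    total = sum(values)
    n = len(labels)
    score = 0.0
    for cls in sorted(set(labels)):
        observed = sum(v for v, lab in zip(values, labels) if lab == cls)
        expected = total * labels.count(cls) / n
        if expected > 0:
            score += (observed - expected) ** 2 / expected
    return score


def select_top_fraction(columns, labels, names, fraction=0.2):
    """Rank `names` by chi-square score and keep the top `fraction` (at least one)."""
    k = max(1, round(len(names) * fraction))
    ranked = sorted(names, key=lambda nm: chi_square_score(columns[nm], labels),
                    reverse=True)
    return ranked[:k]


# Hypothetical abundances: six samples, two per variety.
labels = ["black", "black", "red", "red", "purple", "purple"]
columns = {
    "m1": [9, 8, 1, 2, 1, 1],
    "m2": [18, 16, 2, 4, 2, 2],   # exactly 2 x m1, so perfectly collinear with it
    "m3": [1, 1, 7, 6, 1, 2],
    "m4": [3, 3, 3, 3, 3, 3],     # constant, carries no class information
    "m5": [2, 1, 1, 2, 8, 9],
}

kept = drop_collinear(columns, threshold=0.8)
selected = select_top_fraction(columns, labels, kept, fraction=0.2)
print("after collinearity filter:", kept)
print("after chi-square ranking:", selected)
```

In a fuller workflow, the selected columns would then be passed to a classifier such as a random forest; that step is omitted here because the record gives no model settings.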