AWGE-ESPCA: An edge sparse PCA model based on adaptive noise elimination regularization and weighted gene network for Hermetia illucens genomic data analysis.

Hermetia illucens is an important insect resource. Studies have shown that exploring the effects of Cu2+ stress on the growth and development of the Hermetia illucens genome holds significant scientific importance. There are three major challenges in the current studies of Hermetia illucens genomi...

Bibliographic Details
Main Authors: Rui Miao, Hao-Yang Yu, Bing-Jie Zhong, Hong-Xia Sun, Qiang Xia
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2025-02-01
Series: PLoS Computational Biology
Online Access: https://doi.org/10.1371/journal.pcbi.1012773
_version_ 1850144453850824704
author Rui Miao
Hao-Yang Yu
Bing-Jie Zhong
Hong-Xia Sun
Qiang Xia
collection DOAJ
description Hermetia illucens is an important insect resource. Studies have shown that exploring the effects of Cu2+ stress on the growth and development of the Hermetia illucens genome holds significant scientific importance. Current Hermetia illucens genomic data analysis faces three major challenges. First, available genomic data are scarce, which limits research. Second, to the best of our knowledge, no Artificial Intelligence (AI) feature selection model has been designed specifically for the Hermetia illucens genome, and noise is a more serious problem in Hermetia illucens data than in human genomic data. Third, it is unclear how to select genes located in pathway enrichment regions: existing models assume that every gene probe has the same prior weight, whereas researchers usually pay more attention to gene probes in pathway enrichment regions. To address these challenges, we first conduct experiments and establish a new Cu2+-stressed Hermetia illucens growth genome dataset. We then propose AWGE-ESPCA, an edge Sparse PCA model based on adaptive noise elimination regularization and a weighted gene network. The AWGE-ESPCA model introduces an adaptive noise elimination regularization method that addresses the noise challenge in Hermetia illucens genomic data. We also integrate known gene-pathway quantitative information into the Sparse PCA (SPCA) framework as a priori knowledge, which allows the model to select gene probes in pathway-rich regions as far as possible. Finally, we conduct five independent experiments and compare the model against four recent Sparse PCA models as well as representative supervised and unsupervised baseline models to validate its performance. The experimental results demonstrate the superior pathway and gene selection capabilities of the AWGE-ESPCA model. Ablation experiments validate the role of the adaptive regularizer and the network weighting module. In summary, this paper presents an innovative unsupervised model for Hermetia illucens genome analysis that can help researchers identify potential biomarkers. Working AWGE-ESPCA model code is available at https://github.com/yhyresearcher/AWGE_ESPCA.
format Article
id doaj-art-be06d27ebf8d443e9dbf229532107d4d
institution OA Journals
issn 1553-734X
1553-7358
language English
publishDate 2025-02-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS Computational Biology
spelling doaj-art-be06d27ebf8d443e9dbf229532107d4d; 2025-08-20T02:28:22Z; eng; Public Library of Science (PLoS); PLoS Computational Biology; 1553-734X; 1553-7358; 2025-02-01; vol. 21, no. 2, e1012773; 10.1371/journal.pcbi.1012773; https://doi.org/10.1371/journal.pcbi.1012773
title AWGE-ESPCA: An edge sparse PCA model based on adaptive noise elimination regularization and weighted gene network for Hermetia illucens genomic data analysis.
url https://doi.org/10.1371/journal.pcbi.1012773