BioDeepfuse: a hybrid deep learning approach with integrated feature extraction techniques for enhanced non-coding RNA classification

Bibliographic Details
Main Authors: Anderson P. Avila Santos, Breno L. S. de Almeida, Robson P. Bonidia, Peter F. Stadler, Polonca Stefanic, Ines Mandic-Mulec, Ulisses Rocha, Danilo S. Sanches, André C.P.L.F. de Carvalho
Format: Article
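Language: English
Published: Taylor & Francis Group 2024-12-01
Series: RNA Biology
Subjects: Non-coding RNA; deep learning; neural networks; RNA identification; feature extraction; model performance
Online Access: https://www.tandfonline.com/doi/10.1080/15476286.2024.2329451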
_version_ 1850061940934574080
author Anderson P. Avila Santos
Breno L. S. de Almeida
Robson P. Bonidia
Peter F. Stadler
Polonca Stefanic
Ines Mandic-Mulec
Ulisses Rocha
Danilo S. Sanches
André C.P.L.F. de Carvalho
author_facet Anderson P. Avila Santos
Breno L. S. de Almeida
Robson P. Bonidia
Peter F. Stadler
Polonca Stefanic
Ines Mandic-Mulec
Ulisses Rocha
Danilo S. Sanches
André C.P.L.F. de Carvalho
author_sort Anderson P. Avila Santos
collection DOAJ
description The accurate classification of non-coding RNA (ncRNA) sequences is pivotal for advanced non-coding genome annotation and analysis, a fundamental aspect of genomics that facilitates understanding of ncRNA functions and regulatory mechanisms in various biological processes. While traditional machine learning approaches have been employed for distinguishing ncRNA, these often necessitate extensive feature engineering. Recently, deep learning algorithms have provided advancements in ncRNA classification. This study presents BioDeepFuse, a hybrid deep learning framework integrating convolutional neural networks (CNN) or bidirectional long short-term memory (BiLSTM) networks with handcrafted features for enhanced accuracy. This framework employs a combination of k-mer one-hot, k-mer dictionary, and feature extraction techniques for input representation. Extracted features, when embedded into the deep network, enable optimal utilization of spatial and sequential nuances of ncRNA sequences. Using benchmark datasets and real-world RNA samples from bacterial organisms, we evaluated the performance of BioDeepFuse. Results exhibited high accuracy in ncRNA classification, underscoring the robustness of our tool in addressing complex ncRNA sequence data challenges. The effective melding of CNN or BiLSTM with external features heralds promising directions for future research, particularly in refining ncRNA classifiers and deepening insights into ncRNAs in cellular processes and disease manifestations. In addition to its original application in the context of bacterial organisms, the methodologies and techniques integrated into our framework can potentially render BioDeepFuse effective in various and broader domains.
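The description names two sequence representations, k-mer dictionary and k-mer one-hot, without spelling them out. The following is a minimal sketch of what such encodings commonly look like, not the authors' implementation: the function names, the overlapping stride-1 k-mers, the ACGU alphabet, and the reserved index 0 for unknown k-mers are all assumptions made for illustration.

```python
from itertools import product

ALPHABET = "ACGU"  # assumed RNA alphabet; the record does not specify one


def kmers(seq, k):
    """Return the overlapping k-mers of seq (stride 1, an assumption)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_vocab(k):
    """Map every possible k-mer to an integer ID starting at 1.

    ID 0 is reserved here for k-mers containing unknown symbols.
    """
    return {"".join(p): i + 1 for i, p in enumerate(product(ALPHABET, repeat=k))}


def dictionary_encode(seq, k, vocab):
    """k-mer dictionary encoding: one integer ID per k-mer."""
    return [vocab.get(km, 0) for km in kmers(seq, k)]


def one_hot_encode(seq, k, vocab):
    """k-mer one-hot encoding: one indicator vector per k-mer."""
    size = len(vocab) + 1  # +1 for the reserved unknown slot
    rows = []
    for idx in dictionary_encode(seq, k, vocab):
        row = [0] * size
        row[idx] = 1
        rows.append(row)
    return rows


vocab = build_vocab(2)
ids = dictionary_encode("ACGU", 2, vocab)
matrix = one_hot_encode("ACGU", 2, vocab)
```

In a hybrid model of the kind the description outlines, an integer sequence like `ids` would typically feed an embedding layer and a matrix like `matrix` a convolutional layer, with handcrafted feature vectors concatenated later in the network; how BioDeepFuse wires these together is not stated in this record.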
format Article
id doaj-art-2725291de7df4245ba84e26d93cbc716
institution DOAJ
issn 1547-6286
1555-8584
language English
publishDate 2024-12-01
publisher Taylor & Francis Group
record_format Article
series RNA Biology
spelling doaj-art-2725291de7df4245ba84e26d93cbc716
2025-08-20T02:50:03Z
eng
Taylor & Francis Group
RNA Biology
1547-6286
1555-8584
2024-12-01
Volume 21, Issue 1, Pages 410-421
10.1080/15476286.2024.2329451
BioDeepfuse: a hybrid deep learning approach with integrated feature extraction techniques for enhanced non-coding RNA classification
Anderson P. Avila Santos (Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos, Brazil)
Breno L. S. de Almeida (Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos, Brazil)
Robson P. Bonidia (Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos, Brazil)
Peter F. Stadler (Department of Computer Science and Interdisciplinary Center of Bioinformatics, University of Leipzig, Leipzig, Saxony, Germany)
Polonca Stefanic (Department of Food Science and Technology, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia)
Ines Mandic-Mulec (Department of Food Science and Technology, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia)
Ulisses Rocha (Department of Applied Microbial Ecology, Helmholtz Centre for Environmental Research – UFZ GmbH, Leipzig, Saxony, Germany)
Danilo S. Sanches (Department of Computer Science, Federal University of Technology - Paraná, UTFPR, Cornélio Procópio, Brazil)
André C.P.L.F. de Carvalho (Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos, Brazil)
The accurate classification of non-coding RNA (ncRNA) sequences is pivotal for advanced non-coding genome annotation and analysis, a fundamental aspect of genomics that facilitates understanding of ncRNA functions and regulatory mechanisms in various biological processes. While traditional machine learning approaches have been employed for distinguishing ncRNA, these often necessitate extensive feature engineering. Recently, deep learning algorithms have provided advancements in ncRNA classification. This study presents BioDeepFuse, a hybrid deep learning framework integrating convolutional neural networks (CNN) or bidirectional long short-term memory (BiLSTM) networks with handcrafted features for enhanced accuracy. This framework employs a combination of k-mer one-hot, k-mer dictionary, and feature extraction techniques for input representation. Extracted features, when embedded into the deep network, enable optimal utilization of spatial and sequential nuances of ncRNA sequences. Using benchmark datasets and real-world RNA samples from bacterial organisms, we evaluated the performance of BioDeepFuse. Results exhibited high accuracy in ncRNA classification, underscoring the robustness of our tool in addressing complex ncRNA sequence data challenges. The effective melding of CNN or BiLSTM with external features heralds promising directions for future research, particularly in refining ncRNA classifiers and deepening insights into ncRNAs in cellular processes and disease manifestations. In addition to its original application in the context of bacterial organisms, the methodologies and techniques integrated into our framework can potentially render BioDeepFuse effective in various and broader domains.
https://www.tandfonline.com/doi/10.1080/15476286.2024.2329451
Non-coding RNA
deep learning
neural networks
RNA identification
feature extraction
model performance
spellingShingle Anderson P. Avila Santos
Breno L. S. de Almeida
Robson P. Bonidia
Peter F. Stadler
Polonca Stefanic
Ines Mandic-Mulec
Ulisses Rocha
Danilo S. Sanches
André C.P.L.F. de Carvalho
BioDeepfuse: a hybrid deep learning approach with integrated feature extraction techniques for enhanced non-coding RNA classification
RNA Biology
Non-coding RNA
deep learning
neural networks
RNA identification
feature extraction
model performance
title BioDeepfuse: a hybrid deep learning approach with integrated feature extraction techniques for enhanced non-coding RNA classification
title_full BioDeepfuse: a hybrid deep learning approach with integrated feature extraction techniques for enhanced non-coding RNA classification
title_fullStr BioDeepfuse: a hybrid deep learning approach with integrated feature extraction techniques for enhanced non-coding RNA classification
title_full_unstemmed BioDeepfuse: a hybrid deep learning approach with integrated feature extraction techniques for enhanced non-coding RNA classification
title_short BioDeepfuse: a hybrid deep learning approach with integrated feature extraction techniques for enhanced non-coding RNA classification
title_sort biodeepfuse a hybrid deep learning approach with integrated feature extraction techniques for enhanced non coding rna classification
topic Non-coding RNA
deep learning
neural networks
RNA identification
feature extraction
model performance
url https://www.tandfonline.com/doi/10.1080/15476286.2024.2329451
work_keys_str_mv AT andersonpavilasantos biodeepfuseahybriddeeplearningapproachwithintegratedfeatureextractiontechniquesforenhancednoncodingrnaclassification
AT brenolsdealmeida biodeepfuseahybriddeeplearningapproachwithintegratedfeatureextractiontechniquesforenhancednoncodingrnaclassification
AT robsonpbonidia biodeepfuseahybriddeeplearningapproachwithintegratedfeatureextractiontechniquesforenhancednoncodingrnaclassification
AT peterfstadler biodeepfuseahybriddeeplearningapproachwithintegratedfeatureextractiontechniquesforenhancednoncodingrnaclassification
AT poloncastefanic biodeepfuseahybriddeeplearningapproachwithintegratedfeatureextractiontechniquesforenhancednoncodingrnaclassification
AT inesmandicmulec biodeepfuseahybriddeeplearningapproachwithintegratedfeatureextractiontechniquesforenhancednoncodingrnaclassification
AT ulissesrocha biodeepfuseahybriddeeplearningapproachwithintegratedfeatureextractiontechniquesforenhancednoncodingrnaclassification
AT danilossanches biodeepfuseahybriddeeplearningapproachwithintegratedfeatureextractiontechniquesforenhancednoncodingrnaclassification
AT andrecplfdecarvalho biodeepfuseahybriddeeplearningapproachwithintegratedfeatureextractiontechniquesforenhancednoncodingrnaclassification