BioDeepfuse: a hybrid deep learning approach with integrated feature extraction techniques for enhanced non-coding RNA classification
The accurate classification of non-coding RNA (ncRNA) sequences is pivotal for advanced non-coding genome annotation and analysis, a fundamental aspect of genomics that facilitates understanding of ncRNA functions and regulatory mechanisms in various biological processes. While traditional machine l...
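| Main Authors: | Anderson P. Avila Santos, Breno L. S. de Almeida, Robson P. Bonidia, Peter F. Stadler, Polonca Stefanic, Ines Mandic-Mulec, Ulisses Rocha, Danilo S. Sanches, André C.P.L.F. de Carvalho |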
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Taylor & Francis Group, 2024-12-01 |
| Series: | RNA Biology |
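| Subjects: | Non-coding RNA; deep learning; neural networks; RNA identification; feature extraction; model performance |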
| Online Access: | https://www.tandfonline.com/doi/10.1080/15476286.2024.2329451 |
| _version_ | 1850061940934574080 |
|---|---|
| author | Anderson P. Avila Santos; Breno L. S. de Almeida; Robson P. Bonidia; Peter F. Stadler; Polonca Stefanic; Ines Mandic-Mulec; Ulisses Rocha; Danilo S. Sanches; André C.P.L.F. de Carvalho |
| author_sort | Anderson P. Avila Santos |
| collection | DOAJ |
| description | The accurate classification of non-coding RNA (ncRNA) sequences is pivotal for advanced non-coding genome annotation and analysis, a fundamental aspect of genomics that facilitates understanding of ncRNA functions and regulatory mechanisms in various biological processes. While traditional machine learning approaches have been employed for distinguishing ncRNA, these often necessitate extensive feature engineering. Recently, deep learning algorithms have provided advancements in ncRNA classification. This study presents BioDeepFuse, a hybrid deep learning framework integrating convolutional neural networks (CNN) or bidirectional long short-term memory (BiLSTM) networks with handcrafted features for enhanced accuracy. This framework employs a combination of k-mer one-hot, k-mer dictionary, and feature extraction techniques for input representation. Extracted features, when embedded into the deep network, enable optimal utilization of spatial and sequential nuances of ncRNA sequences. Using benchmark datasets and real-world RNA samples from bacterial organisms, we evaluated the performance of BioDeepFuse. Results exhibited high accuracy in ncRNA classification, underscoring the robustness of our tool in addressing complex ncRNA sequence data challenges. The effective melding of CNN or BiLSTM with external features heralds promising directions for future research, particularly in refining ncRNA classifiers and deepening insights into ncRNAs in cellular processes and disease manifestations. In addition to its original application in the context of bacterial organisms, the methodologies and techniques integrated into our framework can potentially render BioDeepFuse effective in various and broader domains. |
| format | Article |
| id | doaj-art-2725291de7df4245ba84e26d93cbc716 |
| institution | DOAJ |
| issn | 1547-6286 1555-8584 |
| language | English |
| publishDate | 2024-12-01 |
| publisher | Taylor & Francis Group |
| record_format | Article |
| series | RNA Biology |
| spelling | doaj-art-2725291de7df4245ba84e26d93cbc716; 2025-08-20T02:50:03Z; eng; Taylor & Francis Group; RNA Biology; 1547-6286; 1555-8584; 2024-12-01; 21; 1; 410; 421; 10.1080/15476286.2024.2329451; BioDeepfuse: a hybrid deep learning approach with integrated feature extraction techniques for enhanced non-coding RNA classification; Anderson P. Avila Santos (0); Breno L. S. de Almeida (1); Robson P. Bonidia (2); Peter F. Stadler (3); Polonca Stefanic (4); Ines Mandic-Mulec (5); Ulisses Rocha (6); Danilo S. Sanches (7); André C.P.L.F. de Carvalho (8); (0) Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos, Brazil; (1) Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos, Brazil; (2) Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos, Brazil; (3) Department of Computer Science and Interdisciplinary Center of Bioinformatics, University of Leipzig, Leipzig, Saxony, Germany; (4) Department of Food Science and Technology, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia; (5) Department of Food Science and Technology, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia; (6) Department of Applied Microbial Ecology, Helmholtz Centre for Environmental Research – UFZ GmbH, Leipzig, Saxony, Germany; (7) Department of Computer Science, Federal University of Technology - Paraná, UTFPR, Cornélio Procópio, Brazil; (8) Institute of Mathematics and Computer Sciences, University of São Paulo, São Carlos, Brazil; The accurate classification of non-coding RNA (ncRNA) sequences is pivotal for advanced non-coding genome annotation and analysis, a fundamental aspect of genomics that facilitates understanding of ncRNA functions and regulatory mechanisms in various biological processes. While traditional machine learning approaches have been employed for distinguishing ncRNA, these often necessitate extensive feature engineering. Recently, deep learning algorithms have provided advancements in ncRNA classification. This study presents BioDeepFuse, a hybrid deep learning framework integrating convolutional neural networks (CNN) or bidirectional long short-term memory (BiLSTM) networks with handcrafted features for enhanced accuracy. This framework employs a combination of k-mer one-hot, k-mer dictionary, and feature extraction techniques for input representation. Extracted features, when embedded into the deep network, enable optimal utilization of spatial and sequential nuances of ncRNA sequences. Using benchmark datasets and real-world RNA samples from bacterial organisms, we evaluated the performance of BioDeepFuse. Results exhibited high accuracy in ncRNA classification, underscoring the robustness of our tool in addressing complex ncRNA sequence data challenges. The effective melding of CNN or BiLSTM with external features heralds promising directions for future research, particularly in refining ncRNA classifiers and deepening insights into ncRNAs in cellular processes and disease manifestations. In addition to its original application in the context of bacterial organisms, the methodologies and techniques integrated into our framework can potentially render BioDeepFuse effective in various and broader domains.; https://www.tandfonline.com/doi/10.1080/15476286.2024.2329451; Non-coding RNA; deep learning; neural networks; RNA identification; feature extraction; model performance |
| title | BioDeepfuse: a hybrid deep learning approach with integrated feature extraction techniques for enhanced non-coding RNA classification |
| topic | Non-coding RNA; deep learning; neural networks; RNA identification; feature extraction; model performance |
| url | https://www.tandfonline.com/doi/10.1080/15476286.2024.2329451 |
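
The description above names k-mer dictionary and k-mer one-hot encodings as input representations. As a rough illustration of what such encodings generally look like, here is a minimal Python sketch. It is not taken from the BioDeepFuse code: the function names, the `ACGU` alphabet, and the choice to reserve index 0 for unknown k-mers are all assumptions made for this example.

```python
from itertools import product

ALPHABET = "ACGU"  # assumption: plain RNA alphabet, ambiguous bases not handled


def build_kmer_vocab(k):
    """Map every possible k-mer to an integer index starting at 1.

    Index 0 is reserved (hypothetically) for k-mers outside the alphabet.
    """
    return {"".join(p): i + 1 for i, p in enumerate(product(ALPHABET, repeat=k))}


def kmer_dictionary_encode(seq, k, vocab):
    """Slide a window of width k over seq and return one integer per k-mer."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    return [vocab.get(seq[i:i + k], 0) for i in range(len(seq) - k + 1)]


def kmer_one_hot_encode(seq, k, vocab):
    """Return a list of rows, one per k-mer, each with a single 1 at the k-mer's slot."""
    indices = kmer_dictionary_encode(seq, k, vocab)
    matrix = [[0] * len(vocab) for _ in indices]
    for row, idx in enumerate(indices):
        if idx > 0:  # unknown k-mers stay as all-zero rows
            matrix[row][idx - 1] = 1
    return matrix


vocab = build_kmer_vocab(2)
tokens = kmer_dictionary_encode("ACGU", 2, vocab)
one_hot = kmer_one_hot_encode("ACGU", 2, vocab)
```

In a setup like this, the integer sequence would typically feed an embedding layer, while the one-hot matrix could go directly into a convolutional layer. How the published framework actually wires these inputs together is not stated in this record.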