N6-methyladenine identification using deep learning and discriminative feature integration
Abstract: N6-methyladenine (6mA) is a pivotal DNA modification that plays a crucial role in epigenetic regulation, gene expression, and various biological processes. With advancements in sequencing technologies and computational biology, there is an increasing focus on developing accurate methods fo...
| Main Authors: | Salman Khan, Islam Uddin, Sumaiya Noor, Salman A. AlQahtani, Nijad Ahmad |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | BMC, 2025-03-01 |
| Series: | BMC Medical Genomics |
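| Subjects: | Deep Learning; DNA Modifications; N6-methyladenine (6mA); Epigenetics; Sequence Analysis; DNA Methylation Detection |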
| Online Access: | https://doi.org/10.1186/s12920-025-02131-6 |
| _version_ | 1850065076974780416 |
|---|---|
| author | Salman Khan; Islam Uddin; Sumaiya Noor; Salman A. AlQahtani; Nijad Ahmad |
| collection | DOAJ |
| description | Abstract: N6-methyladenine (6mA) is a pivotal DNA modification that plays a crucial role in epigenetic regulation, gene expression, and various biological processes. With advancements in sequencing technologies and computational biology, there is an increasing focus on developing accurate methods for 6mA site identification to enhance early detection and understand its biological significance. Despite the rapid progress of machine learning in bioinformatics, accurately detecting 6mA sites remains a challenge due to the limited generalizability and efficiency of existing approaches. In this study, we present Deep-N6mA, a novel Deep Neural Network (DNN) model incorporating optimal hybrid features for precise 6mA site identification. The proposed framework captures complex patterns from DNA sequences through a comprehensive feature extraction process, leveraging k-mer, Dinucleotide-based Cross Covariance (DCC), Trinucleotide-based Auto Covariance (TAC), Pseudo Single Nucleotide Composition (PseSNC), Pseudo Dinucleotide Composition (PseDNC), and Pseudo Trinucleotide Composition (PseTNC). To optimize computational efficiency and eliminate irrelevant or noisy features, an unsupervised Principal Component Analysis (PCA) algorithm is employed, ensuring the selection of the most informative features. A multilayer DNN serves as the classification algorithm to identify N6-methyladenine sites accurately. The robustness and generalizability of Deep-N6mA were rigorously validated using fivefold cross-validation on two benchmark datasets. Experimental results reveal that Deep-N6mA achieves an average accuracy of 97.70% on the F. vesca dataset and 95.75% on the R. chinensis dataset, outperforming existing methods by 4.12% and 4.55%, respectively. These findings underscore the effectiveness of Deep-N6mA as a reliable tool for early 6mA site detection, contributing to epigenetic research and advancing the field of computational biology. |
| format | Article |
| id | doaj-art-cda97ccacc2440658631e952a34a736c |
| institution | DOAJ |
| issn | 1755-8794 |
| language | English |
| publishDate | 2025-03-01 |
| publisher | BMC |
| record_format | Article |
| series | BMC Medical Genomics |
| spelling | doaj-art-cda97ccacc2440658631e952a34a736c; 2025-08-20T02:49:06Z; eng; BMC; BMC Medical Genomics; 1755-8794; 2025-03-01; 18; 1; 1; 13; 10.1186/s12920-025-02131-6; N6-methyladenine identification using deep learning and discriminative feature integration; Salman Khan (Department of Computer Science, Abdul Wali Khan University); Islam Uddin (Department of Computer Science, Abdul Wali Khan University); Sumaiya Noor (Business and Management Sciences Department, Purdue University); Salman A. AlQahtani (Department of Computer Engineering, New Emerging Technologies and 5g Network and Beyond Research Chair, College of Computer and Information Sciences, King Saud University); Nijad Ahmad (Department of Computer Science, Khurasan University); https://doi.org/10.1186/s12920-025-02131-6; Deep Learning; DNA Modifications; N6-methyladenine (6mA); Epigenetics; Sequence Analysis; DNA Methylation Detection |
| title | N6-methyladenine identification using deep learning and discriminative feature integration |
| topic | Deep Learning; DNA Modifications; N6-methyladenine (6mA); Epigenetics; Sequence Analysis; DNA Methylation Detection |
| url | https://doi.org/10.1186/s12920-025-02131-6 |
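
The abstract names k-mer composition as one of the sequence descriptors fed to the classifier. As a rough illustration only, a normalized k-mer frequency vector can be computed as in the sketch below. The function name `kmer_frequencies`, the choice of k = 2, the sliding-window normalization, and the example sequence are all assumptions made here for illustration; the record does not state the settings used by Deep-N6mA.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=2, alphabet="ACGT"):
    """Return normalized k-mer frequencies for a DNA sequence.

    Hypothetical helper for illustration; not taken from the Deep-N6mA paper.
    The output has len(alphabet) ** k entries, ordered lexicographically.
    """
    seq = seq.upper()
    # Enumerate every possible k-mer so the vector has a fixed length.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    # Slide a window of width k over the sequence, one position at a time.
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    # Ignore windows containing symbols outside the alphabet (e.g. 'N').
    allowed = set(alphabet)
    counts = Counter(w for w in windows if set(w) <= allowed)
    total = sum(counts.values())
    return [counts[m] / total if total else 0.0 for m in kmers]


# Example sequence is made up; real inputs would be the benchmark sequences.
features = kmer_frequencies("ACGTACGA", k=2)
```

For k = 2 the vector has 16 entries, one per dinucleotide, and the entries sum to 1 whenever at least one valid window exists. In a pipeline like the one the abstract describes, vectors of this kind would be concatenated with the other descriptors before dimensionality reduction and classification.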