Identification of Protein Coding Regions in the Eukaryotic DNA Sequences Based on Marple Algorithm and Wavelet Packets Transform

Bibliographic Details
Main Authors: Guangchen Liu, Yihui Luan
Format: Article
Language: English
Published: Wiley 2014-01-01
Series: Abstract and Applied Analysis
Online Access: http://dx.doi.org/10.1155/2014/402567
Description
Summary: The identification of protein coding regions (exons) plays a critical role in eukaryotic gene structure prediction. Many techniques have been introduced for discriminating between exons and introns in eukaryotic DNA sequences, such as those based on the discrete Fourier transform (DFT), but these DFT-based methods rapidly lose their effectiveness on short DNA sequences. In this paper, a novel integrated algorithm based on autoregressive spectrum analysis and the wavelet packet transform is presented to improve the efficiency and accuracy of coding region identification. Experimental results on the GENSCAN65, HMR195, and BG570 benchmark datasets show that the new algorithm predicts protein coding regions distinctly more accurately than the conventional DFT-based approaches.
ISSN: 1085-3375, 1687-0409
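
Note: for readers unfamiliar with the DFT-based baseline the summary refers to, the following is a minimal sketch of the general idea, not the authors' algorithm. It assumes the commonly used formulation in which each base is mapped to a binary indicator sequence and the DFT power at frequency 1/3 is measured in a sliding window, since coding regions tend to show a period-3 peak. The function name `period3_power`, the default window length, and the step size are illustrative assumptions and do not come from the record above.

```python
import cmath


def period3_power(seq, window=351, step=1):
    """Sliding-window DFT power at frequency 1/3 (illustrative sketch).

    For each window, the four binary indicator sequences (one per base)
    are projected onto exp(-2*pi*i*n/3) and the squared magnitudes are
    summed. The window length is assumed to be a multiple of 3 so that
    frequency 1/3 falls exactly on a DFT bin.
    """
    seq = seq.upper()
    # exp(-2*pi*i*n/3) only takes three distinct values, indexed by n mod 3.
    twiddle = [cmath.exp(-2j * cmath.pi * r / 3) for r in range(3)]
    scores = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        total = 0.0
        for base in "ACGT":
            # DFT coefficient of the indicator sequence for this base.
            coeff = sum(twiddle[n % 3] for n, ch in enumerate(chunk) if ch == base)
            total += abs(coeff) ** 2
        scores.append(total)
    return scores


if __name__ == "__main__":
    # A strongly 3-periodic toy sequence versus a sequence with no periodicity.
    periodic = period3_power("ATG" * 40, window=30)
    flat = period3_power("A" * 120, window=30)
    print(max(periodic), max(flat))
```

Because each window holds only as many samples as its length, the spectral estimate becomes coarse when the window (or the sequence itself) is short, which is the limitation of DFT-based methods that the summary points to.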