A new numerical approach for DNA representation using modified Gabor wavelet transform for the identification of protein coding regions
M. Raman Kumar,
Published in Elsevier Sp. z o.o.
2020
Volume: 40
   
Issue: 2
Pages: 836 - 848
Abstract
The fundamental step in genomic signal processing applications is to assign a mathematical descriptor to each of the nucleotides {A, T, G, C} of a DNA molecule to obtain a discrete representation. The discrete representation should preserve the biological information of the gene when analyzed with digital signal processing tools. To this end, a novel binary representation of the DNA sequence that combines structural and chemical information of the original DNA sequence is proposed for the identification of protein coding regions of eukaryotes. The identification model comprises two stages: numerical encoding in the first stage, and analysis of biological behavior through digital signal processing algorithms in the second stage. In the first stage, a new numerical encoding method based on Walsh codes of order 4 is proposed to obtain a 1-D binary discrete sequence. In the second stage, the modified Gabor wavelet transform (MGWT) is applied to the discretized DNA sequence for spectrum analysis. The optimal gene numerical encoding and the multiresolution approach of MGWT readily identify the structures of coding regions of unknown gene sequences. The proposed model is validated by analyzing prediction efficiency in terms of statistical metrics such as sensitivity, specificity, and accuracy at both the sequence level and the database level. Furthermore, the results are compared with state-of-the-art encoding methods by plotting receiver operating characteristic (ROC) curves over all classification thresholds. An area under the curve (AUC) value of 0.86 at the sequence level and 0.84 at the database level is achieved. Performance metrics indicate that the proposed encoding method performs relatively better than the other numerical encoding methods. © 2020 Nalecz Institute of Biocybernetics and Biomedical Engineering of the Polish Academy of Sciences
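The two-stage pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The assignment of Walsh codes to nucleotides in `WALSH_4` is hypothetical (the abstract does not give the mapping), and `gabor_period_spectrum` is a generic Gaussian-windowed, fixed-frequency analysis summed over several scales, standing in for the MGWT whose exact form is not stated here. The function names, the scale values, and the choice of `period=12` (three nucleotides times four binary chips per nucleotide) are likewise assumptions.

```python
import numpy as np

# Rows of the 4x4 Walsh-Hadamard matrix in sequency order, with -1 mapped to 0.
# ASSUMPTION: which nucleotide receives which code is hypothetical.
WALSH_4 = {
    "A": (1, 1, 1, 1),
    "C": (1, 1, 0, 0),
    "G": (1, 0, 0, 1),
    "T": (1, 0, 1, 0),
}


def encode_walsh(sequence):
    """Stage 1: map a DNA string to a 1-D binary array, 4 chips per nucleotide."""
    chips = [WALSH_4[base] for base in sequence.upper()]
    return np.array(chips, dtype=int).reshape(-1)


def gabor_period_spectrum(x, period, scales):
    """Stage 2 (stand-in for MGWT): energy near one period, summed over scales.

    Each scale `a` uses a Gaussian window of width `a` samples multiplied by a
    complex exponential at the fixed frequency 1/period.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                      # remove the DC offset of the binary signal
    omega = 2.0 * np.pi / period
    total = np.zeros(len(x))
    for a in scales:
        half = int(4 * a)                 # truncate the Gaussian at 4 widths
        t = np.arange(-half, half + 1)
        kernel = np.exp(-t**2 / (2.0 * a**2)) * np.exp(1j * omega * t) / a
        full = np.convolve(x, kernel)     # length len(x) + 2 * half
        total += np.abs(full[half:half + len(x)]) ** 2
    return total


# Usage: a synthetic sequence with a period-3 half followed by a random half.
rng = np.random.default_rng(0)
seq = "ATG" * 60 + "".join(rng.choice(list("ACGT"), size=180))
spectrum = gabor_period_spectrum(encode_walsh(seq), period=12, scales=(12, 24, 48))
```

In this toy input the spectrum is larger over the repeated `ATG` half than over the random half, which is the kind of contrast a threshold would then turn into coding and non-coding labels for computing sensitivity, specificity, and accuracy.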
About the journal
Journal: Biocybernetics and Biomedical Engineering
Publisher: Elsevier Sp. z o.o.
ISSN: 0208-5216