A Distributed Tree-based Ensemble Learning Approach for Efficient Structure Prediction of Protein
Xavier L.D,
Published in The Intelligent Networks and Systems Society
2017
Volume: 10
Issue: 3
Pages: 226 - 234
Abstract
Knowledge of a protein's secondary structure contributes to our understanding of the protein's functions, which are vital to many aspects of living organisms, such as those of enzymes, hormones, and structural material. It also helps in designing new drugs for critical diseases. In this paper, we advocate a distributed approach to identifying protein secondary structures using an ensemble method on protein primary sequences. The ensemble-based Random Forest algorithm is adopted to build a three-way predictive model. Based on the amino acid features of each protein and the decision tree parameters, the classification model assigns protein structures as 'α helix', 'β sheet', or coil. The proposed model is implemented in a distributed computing environment, Spark. Experiments were carried out using cross-validation tests on the RS126 and CB513 benchmark datasets. Our results confirm that the ensemble approach to classifying protein secondary structures achieves better accuracy, with improved performance when implemented in the distributed environment.
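As a rough illustration of the two ideas the abstract names, per-residue features taken from the primary sequence and an ensemble of trees voting among three classes, the following is a minimal Python sketch. It is not the paper's implementation: the sliding-window one-hot encoding, the window half-width of 2, the single-letter class labels H/E/C, the tie-breaking rule, and the function names `window_features` and `majority_vote` are all assumptions made for illustration. The abstract does not specify the feature set or the tree parameters.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
CLASSES = ("H", "E", "C")  # assumed labels: helix, sheet, coil


def window_features(sequence, center, half_width=2):
    """One-hot encode the residues in a window centred on `center`.

    Hypothetical encoding: positions that fall outside the sequence,
    and residues outside the 20 standard ones, become all-zero blocks.
    """
    features = []
    for pos in range(center - half_width, center + half_width + 1):
        one_hot = [0] * len(AMINO_ACIDS)
        if 0 <= pos < len(sequence):
            idx = AMINO_ACIDS.find(sequence[pos])
            if idx >= 0:
                one_hot[idx] = 1
        features.extend(one_hot)
    return features


def majority_vote(votes):
    """Combine per-tree class labels into a single prediction.

    Ties go to the class listed first in CLASSES (an arbitrary choice).
    """
    counts = Counter(votes)
    return max(CLASSES, key=lambda c: counts[c])
```

In a random forest, each decision tree would receive a feature vector such as the one from `window_features` and emit one label; `majority_vote` stands in for the step that merges those labels. In a distributed setting the per-residue feature extraction and per-tree training are the parts that parallelise naturally.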
About the journal
Journal: International Journal of Intelligent Engineering and Systems
Publisher: The Intelligent Networks and Systems Society
ISSN: 2185-310X
Open Access: 0