Protein Structural Class Prediction via k-Separated Bigrams Using Position Specific Scoring Matrix
Harsh Saini*, Gaurav Raicar*, Alok Sharma*,**,
Sunil Lal*, Abdollah Dehzangi**, Rajeshkannan Ananthanarayanan*,
James Lyons**, Neela Biswas***, and Kuldip K. Paliwal**
*The University of the South Pacific, Fiji, Laucala Bay, Suva, Fiji
**Griffith University, Brisbane, Australia
***Royal Brisbane and Women’s Hospital, Brisbane, Australia
Protein structural class prediction (SCP) is an important task in identifying protein tertiary structure and protein functions. In this study, we propose a feature extraction technique to predict secondary structures. The technique utilizes bigram (of adjacent and k-separated amino acids) information derived from the Position Specific Scoring Matrix (PSSM). The technique has shown promising results when evaluated on the benchmark Ding and Dubchak dataset.
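As an illustration of the kind of feature described above, the following minimal sketch computes a k-separated bigram matrix from a PSSM. The abstract does not state the formula, so the definition used here, B_k(m, n) = sum over i of P[i][m] * P[i+k][n], is an assumption, as are the function names `k_separated_bigram` and `flatten` and the choice to leave any PSSM normalization to the caller. With k = 1 the sum runs over adjacent positions.

```python
from typing import List

Matrix = List[List[float]]


def k_separated_bigram(pssm: Matrix, k: int = 1) -> Matrix:
    """Assumed definition: B[m][n] = sum_i pssm[i][m] * pssm[i + k][n].

    pssm has one row per sequence position and one column per amino acid
    (20 columns for a real PSSM; the toy example below uses 2).
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    length = len(pssm)
    width = len(pssm[0]) if pssm else 0
    bigram = [[0.0] * width for _ in range(width)]
    # Pair each position i with the position k residues further along.
    for i in range(length - k):
        row_a, row_b = pssm[i], pssm[i + k]
        for m in range(width):
            for n in range(width):
                bigram[m][n] += row_a[m] * row_b[n]
    return bigram


def flatten(matrix: Matrix) -> List[float]:
    """Turn the square bigram matrix into a flat feature vector."""
    return [value for row in matrix for value in row]


# Toy PSSM: 3 positions, 2 hypothetical amino-acid columns.
toy_pssm = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
adjacent = k_separated_bigram(toy_pssm, k=1)   # [[0.0, 1.0], [1.0, 0.0]]
separated = k_separated_bigram(toy_pssm, k=2)  # [[1.0, 0.0], [0.0, 0.0]]
features = flatten(adjacent) + flatten(separated)
```

For a real 20-column PSSM each value of k contributes 400 features under this definition; concatenating several values of k, as in the last line, is one possible way to build the final vector and is not confirmed by the abstract.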
-  W. Chmielnicki, “A hybrid discriminative/generative approach to protein fold recognition,” Neurocomputing, Vol.75, No.1, pp. 194-198, 2012.
-  C.-W. Hsu and C.-J. Lin, “A comparison of methods for multiclass support vector machines,” IEEE Trans. on Neural Networks, Vol.13, No.2, pp. 415-425, 2002.
-  H.-B. Shen and K.-C. Chou, “Ensemble classifier for protein fold pattern recognition,” Bioinformatics, Vol.22, No.14, pp. 1717-1722, 2006.
-  M. Levitt and C. Chothia, “Structural patterns in globular proteins,” Nature, Vol.261, No.5561, pp. 552-558, 1976.
-  A. G. Murzin, S. E. Brenner, T. Hubbard, and C. Chothia, “SCOP: A structural classification of proteins database for the investigation of sequences and structures,” J. of Molecular Biology, Vol.247, No.4, pp. 536-540, 1995.
-  G.-P. Zhou, “An intriguing controversy over protein structural class prediction,” J. of Protein Chemistry, Vol.17, No.8, pp. 729-738, 1998.
-  L. Kurgan and L. Homaeian, “Prediction of secondary protein structure content from primary sequence alone - a feature selection based approach,” Machine Learning and Data Mining in Pattern Recognition, pp. 334-345, Springer, 2005.
-  M. Mizianty and L. Kurgan, “Modular prediction of protein structural classes from sequences of twilight-zone identity with predicting sequences,” BMC Bioinformatics, Vol.10, No.1, p. 414, 2009.
-  A. Dehzangi and S. Karamizadeh, “Solving protein fold prediction problem using fusion of heterogeneous classifiers,” Information, Vol.14, No.11, pp. 3611-3621, 2011.
-  A. Dehzangi, K. Paliwal, A. Sharma, O. Dehzangi, and A. Sattar, “A Combination of Feature Extraction Methods with an Ensemble of Different Classifiers for Protein Structural Class Prediction Problem,” IEEE/ACM Trans. on Computational Biology and Bioinformatics, 2013.
-  C. H. Q. Ding and I. Dubchak, “Multi-class protein fold recognition using support vector machines and neural networks,” Bioinformatics, Vol.17, No.4, pp. 349-358, 2001.
-  A. Sharma, K. K. Paliwal, A. Dehzangi, J. Lyons, S. Imoto, and S. Miyano, “A strategy to select suitable physicochemical attributes of amino acids for protein fold recognition,” BMC Bioinformatics, Vol.14, No.1, p. 233, 2013.
-  H. Zhang, T. Zhang, J. Gao, J. Ruan, S. Shen, and L. Kurgan, “Determination of protein folding kinetic types using sequence and predicted secondary structure and solvent accessibility,” Amino Acids, Vol.42, No.1, pp. 271-283, 2012.
-  T. Liu and C. Jia, “A high-accuracy protein structural class prediction algorithm using predicted secondary structural information,” J. of Theoretical Biology, Vol.267, No.3, pp. 272-275, 2010.
-  T. Liu, X. Geng, X. Zheng, R. Li, and J. Wang, “Accurate prediction of protein structural class using auto covariance transformation of PSI-BLAST profiles,” Amino Acids, Vol.42, No.6, pp. 2243-2249, 2012.
-  A. Sharma, J. Lyons, A. Dehzangi, and K. K. Paliwal, “A feature extraction technique using bi-gram probabilities of position specific scoring matrix for protein fold recognition,” J. of Theoretical Biology, 2012.
-  P. Klein, “Prediction of protein structural class by discriminant analysis,” Biochimica et Biophysica Acta (BBA) – Protein Structure and Molecular Enzymology, Vol.874, No.2, pp. 205-215, 1986.
-  Y.-S. Ding and T.-L. Zhang, “Using Chou’s pseudo amino acid composition to predict subcellular localization of apoptosis proteins: an approach with immune genetic algorithm-based ensemble classifier,” Pattern Recognition Letters, Vol.29, No.13, pp. 1887-1892, 2008.
-  A. Chinnasamy, W.-K. Sung, and A. Mittal, “Protein structure and fold prediction using tree-augmented naive bayesian classifier,” J. of Bioinformatics and Computational Biology, Vol.3, No.4, pp. 803-819, 2005.
-  A. Anand, G. Pugalenthi, and P. N. Suganthan, “Predicting protein structural class by SVM with class-wise optimized features and decision probabilities,” J. of Theoretical Biology, Vol.253, No.2, pp. 375-380, 2008.
-  Y.-D. Cai, X.-J. Liu, X.-B. Xu, and K.-C. Chou, “Prediction of protein structural classes by support vector machines,” Computers & Chemistry, Vol.26, No.3, pp. 293-296, 2002.
-  Y.-D. Cai and G.-P. Zhou, “Prediction of protein structural classes by neural network,” Biochimie, Vol.82, No.8, pp. 783-785, 2000.
-  S. Jahandideh, P. Abdolmaleki, M. Jahandideh, and E. B. Asadabadi, “Novel two-stage hybrid neural discriminant model for predicting proteins structural classes,” Biophysical Chemistry, Vol.128, No.1, pp. 87-93, 2007.
-  K. Chen, L. A. Kurgan, and J. Ruan, “Prediction of protein structural class using novel evolutionary collocation based sequence representation,” J. of Computational Chemistry, Vol.29, No.10, pp. 1596-1604, 2008.
-  L. A. Kurgan, T. Zhang, H. Zhang, S. Shen, and J. Ruan, “Secondary structure-based assignment of the protein structural classes,” Amino Acids, Vol.35, No.3, pp. 551-564, 2008.
-  L. Kurgan and K. Chen, “Prediction of protein structural class for the twilight zone sequences,” Biochemical and Biophysical Research Communications, Vol.357, No.2, pp. 453-460, 2007.
-  Q. Dong, S. Zhou, and J. Guan, “A new taxonomy-based protein fold recognition approach based on autocross-covariance transformation,” Bioinformatics, Vol.25, No.20, pp. 2655-2662, 2009.
-  P. Ghanty and N. R. Pal, “Prediction of protein folds: Extraction of new features, dimensionality reduction, and fusion of heterogeneous classifiers,” IEEE Trans. on NanoBioscience, Vol.8, No.1, pp. 100-110, 2009.