Protein Structural Class Prediction via k-Separated Bigrams Using Position Specific Scoring Matrix
Harsh Saini*, Gaurav Raicar*, Alok Sharma*,**,
Sunil Lal*, Abdollah Dehzangi**, Rajeshkannan Ananthanarayanan*,
James Lyons**, Neela Biswas***, and Kuldip K. Paliwal**
*The University of the South Pacific, Fiji, Laucala Bay, Suva, Fiji
**Griffith University, Brisbane, Australia
***Royal Brisbane and Women’s Hospital, Brisbane, Australia
Protein structural class prediction (SCP) is an important task in identifying protein tertiary structure and protein function. In this study, we propose a feature extraction technique to predict secondary structures. The technique utilizes bigram information, for both adjacent and k-separated amino acids, derived from the Position Specific Scoring Matrix (PSSM). The technique has shown promising results when evaluated on the benchmark Ding and Dubchak dataset.
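To make the idea of k-separated bigrams concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the PSSM is given as an L x 20 list of rows (one row per residue position, one column per amino acid) whose entries have already been normalized, and that the feature for the amino acid pair (m, n) at separation k is the sum over positions i of P[i][m] * P[i+k][n]. The function names `k_separated_bigram` and `bigram_features` are hypothetical; k = 1 corresponds to adjacent amino acids.

```python
def k_separated_bigram(pssm, k=1):
    """Return a square matrix B with B[m][n] = sum_i pssm[i][m] * pssm[i + k][n].

    Assumption: `pssm` is a list of equal-length rows (normally 20 columns),
    already normalized; this formula is an illustrative reading of
    "k-separated bigrams", not a definition taken from the paper.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    n_cols = len(pssm[0]) if pssm else 0
    bigram = [[0.0] * n_cols for _ in range(n_cols)]
    # Pair every position i with the position k residues downstream.
    for i in range(len(pssm) - k):
        row_a, row_b = pssm[i], pssm[i + k]
        for m in range(n_cols):
            for n in range(n_cols):
                bigram[m][n] += row_a[m] * row_b[n]
    return bigram


def bigram_features(pssm, k=1):
    """Flatten the bigram matrix row by row into one feature vector."""
    return [value for row in k_separated_bigram(pssm, k) for value in row]


# Toy 3-position, 2-column "PSSM" used only to illustrate the indexing.
toy_pssm = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
adjacent = bigram_features(toy_pssm, k=1)
separated = bigram_features(toy_pssm, k=2)
```

For a real 20-column PSSM each value of k yields a 400-dimensional vector; vectors for several values of k could be concatenated before being passed to a classifier, although the choice of k values and classifier is left open here.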
Harsh Saini, Gaurav Raicar, Alok Sharma, Sunil Lal, Abdollah Dehzangi, Rajeshkannan Ananthanarayanan, James Lyons, Neela Biswas, and Kuldip K. Paliwal, “Protein Structural Class Prediction via k-Separated Bigrams Using Position Specific Scoring Matrix,” J. Adv. Comput. Intell. Intell. Inform., Vol.18, No.4, pp. 474-479, 2014.
-  W. Chmielnicki, “A hybrid discriminative/generative approach to protein fold recognition,” Neurocomputing, Vol.75, No.1, pp. 194-198, 2012.
-  C.-W. Hsu and C.-J. Lin, “A comparison of methods for multiclass support vector machines,” IEEE Trans. on Neural Networks, Vol.13, No.2, pp. 415-425, 2002.
-  H.-B. Shen and K.-C. Chou, “Ensemble classifier for protein fold pattern recognition,” Bioinformatics, Vol.22, No.14, pp. 1717-1722, 2006.
-  M. Levitt and C. Chothia, “Structural patterns in globular proteins,” Nature, Vol.261, No.5561, pp. 552-558, 1976.
-  A. G. Murzin, S. E. Brenner, T. Hubbard, and C. Chothia, “SCOP: A structural classification of proteins database for the investigation of sequences and structures,” J. of Molecular Biology, Vol.247, No.4, pp. 536-540, 1995.
-  G.-P. Zhou, “An intriguing controversy over protein structural class prediction,” J. of Protein Chemistry, Vol.17, No.8, pp. 729-738, 1998.
-  L. Kurgan and L. Homaeian, “Prediction of secondary protein structure content from primary sequence alone - a feature selection based approach,” Machine Learning and Data Mining in Pattern Recognition, pp. 334-345, Springer, 2005.
-  M. Mizianty and L. Kurgan, “Modular prediction of protein structural classes from sequences of twilight-zone identity with predicting sequences,” BMC Bioinformatics, Vol.10, No.1, p. 414, 2009.
-  A. Dehzangi and S. Karamizadeh, “Solving protein fold prediction problem using fusion of heterogeneous classifiers,” Information, Vol.14, No.11, pp. 3611-3621, 2011.
-  A. Dehzangi, K. Paliwal, A. Sharma, O. Dehzangi, and A. Sattar, “A Combination of Feature Extraction Methods with an Ensemble of Different Classifiers for Protein Structural Class Prediction Problem,” IEEE/ACM Trans. on Computational Biology and Bioinformatics, 2013.
-  C. H. Q. Ding and I. Dubchak, “Multi-class protein fold recognition using support vector machines and neural networks,” Bioinformatics, Vol.17, No.4, pp. 349-358, 2001.
-  A. Sharma, K. K. Paliwal, A. Dehzangi, J. Lyons, S. Imoto, and S. Miyano, “A strategy to select suitable physicochemical attributes of amino acids for protein fold recognition,” BMC Bioinformatics, Vol.14, No.1, p. 233, 2013.
-  H. Zhang, T. Zhang, J. Gao, J. Ruan, S. Shen, and L. Kurgan, “Determination of protein folding kinetic types using sequence and predicted secondary structure and solvent accessibility,” Amino Acids, Vol.42, No.1, pp. 271-283, 2012.
-  T. Liu and C. Jia, “A high-accuracy protein structural class prediction algorithm using predicted secondary structural information,” J. of Theoretical Biology, Vol.267, No.3, pp. 272-275, 2010.
-  T. Liu, X. Geng, X. Zheng, R. Li, and J. Wang, “Accurate prediction of protein structural class using auto covariance transformation of PSI-BLAST profiles,” Amino Acids, Vol.42, No.6, pp. 2243-2249, 2012.
-  A. Sharma, J. Lyons, A. Dehzangi, and K. K. Paliwal, “A feature extraction technique using bi-gram probabilities of position specific scoring matrix for protein fold recognition,” J. of Theoretical Biology, 2012.
-  P. Klein, “Prediction of protein structural class by discriminant analysis,” Biochimica et Biophysica Acta (BBA) – Protein Structure and Molecular Enzymology, Vol.874, No.2, pp. 205-215, 1986.
-  Y.-S. Ding and T.-L. Zhang, “Using Chou’s pseudo amino acid composition to predict subcellular localization of apoptosis proteins: an approach with immune genetic algorithm-based ensemble classifier,” Pattern Recognition Letters, Vol.29, No.13, pp. 1887-1892, 2008.
-  A. Chinnasamy, W.-K. Sung, and A. Mittal, “Protein structure and fold prediction using tree-augmented naive bayesian classifier,” J. of Bioinformatics and Computational Biology, Vol.3, No.4, pp. 803-819, 2005.
-  A. Anand, G. Pugalenthi, and P. N. Suganthan, “Predicting protein structural class by SVM with class-wise optimized features and decision probabilities,” J. of Theoretical Biology, Vol.253, No.2, pp. 375-380, 2008.
-  Y.-D. Cai, X.-J. Liu, X.-B. Xu, and K.-C. Chou, “Prediction of protein structural classes by support vector machines,” Computers & Chemistry, Vol.26, No.3, pp. 293-296, 2002.
-  Y.-D. Cai and G.-P. Zhou, “Prediction of protein structural classes by neural network,” Biochimie, Vol.82, No.8, pp. 783-785, 2000.
-  S. Jahandideh, P. Abdolmaleki, M. Jahandideh, and E. B. Asadabadi, “Novel two-stage hybrid neural discriminant model for predicting proteins structural classes,” Biophysical Chemistry, Vol.128, No.1, pp. 87-93, 2007.
-  K. Chen, L. A. Kurgan, and J. Ruan, “Prediction of protein structural class using novel evolutionary collocation based sequence representation,” J. of Computational Chemistry, Vol.29, No.10, pp. 1596-1604, 2008.
-  L. A. Kurgan, T. Zhang, H. Zhang, S. Shen, and J. Ruan, “Secondary structure-based assignment of the protein structural classes,” Amino Acids, Vol.35, No.3, pp. 551-564, 2008.
-  L. Kurgan and K. Chen, “Prediction of protein structural class for the twilight zone sequences,” Biochemical and Biophysical Research Communications, Vol.357, No.2, pp. 453-460, 2007.
-  Q. Dong, S. Zhou, and J. Guan, “A new taxonomy-based protein fold recognition approach based on autocross-covariance transformation,” Bioinformatics, Vol.25, No.20, pp. 2655-2662, 2009.
-  P. Ghanty and N. R. Pal, “Prediction of protein folds: Extraction of new features, dimensionality reduction, and fusion of heterogeneous classifiers,” IEEE Trans. on Nano Bioscience, Vol.8, No.1, pp. 100-110, 2009.