Combination of Genetic Programming and Support Vector Machine-Based Prediction of Protein-Peptide Binding Sites With Sequence and Structure-Based Features

  • Year of publication: 1400 (Solar Hijri)
  • Published in: Journal of Computing and Security, Volume 8, Issue 1
  • COI code (CIVILICA Object Identifier): JR_JCSE-8-1_005
  • Article language: English

Authors

Shima Shafiee

Department of Computer Engineering and Information Technology, Razi University, Kermanshah, Iran.

Abdolhossein Fathi

Department of Computer Engineering and Information Technology, Razi University, Kermanshah, Iran.

Abstract

Predicting the peptide-binding sites of proteins is an essential task in areas such as understanding biological processes, protein functional analysis, comparison of functional sites, understanding interaction mechanisms, drug design, cellular signaling, and cancer treatment. It is also one of the most challenging problems in bioinformatics, and experimental methods are time-consuming, costly, and laborious. We therefore propose a machine learning-based method for predicting protein-peptide binding sites that uses an enhanced feature vector obtained from three-dimensional protein structure and one-dimensional sequence data. To this end, genetic programming is applied to the basic features to extract a more discriminative feature vector. A support vector machine is then employed to classify each amino acid residue as binding or non-binding. Finally, the binding sites are predicted by applying a structure clustering algorithm to the resulting binding residues. The proposed method was evaluated on the BioLiP dataset. Prediction rates of 92.76% and 93.09% were achieved with 10-fold cross-validation and an independent test set, respectively. The results were compared with the performance of other state-of-the-art methods. The proposed method achieves robust and consistent performance using sequence-based and structure-based features in both 10-fold cross-validation and independent tests.
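
The final step of the pipeline, grouping predicted binding residues into binding sites, can be illustrated with a short sketch. The abstract does not specify which structure clustering algorithm is used, so the example below substitutes a simple single-linkage clustering with a distance cutoff. The function name `cluster_binding_residues`, the 6.0 angstrom cutoff, and the toy coordinates are all assumptions made for illustration and are not taken from the paper.

```python
from math import dist


def cluster_binding_residues(coords, cutoff=6.0):
    """Group predicted binding residues into candidate sites.

    coords: dict mapping residue index -> (x, y, z) of a representative atom.
    cutoff: linkage distance in angstroms (hypothetical value, not from the paper).
    Returns a list of clusters (sorted residue indices), largest cluster first.
    """
    unvisited = set(coords)
    clusters = []
    while unvisited:
        seed = min(unvisited)  # deterministic starting residue
        unvisited.remove(seed)
        cluster, frontier = [seed], [seed]
        # Grow the cluster by repeatedly adding residues within the cutoff
        # of any residue already in the cluster (single linkage).
        while frontier:
            current = frontier.pop()
            neighbours = [r for r in unvisited
                          if dist(coords[current], coords[r]) <= cutoff]
            for r in neighbours:
                unvisited.remove(r)
                cluster.append(r)
                frontier.append(r)
        clusters.append(sorted(cluster))
    return sorted(clusters, key=len, reverse=True)


# Toy example: residues the classifier labeled as binding (made-up coordinates).
predicted = {
    10: (0.0, 0.0, 0.0),
    11: (3.8, 0.0, 0.0),
    12: (7.6, 0.0, 0.0),
    45: (30.0, 0.0, 0.0),
    46: (33.0, 0.0, 0.0),
    90: (60.0, 60.0, 0.0),
}
sites = cluster_binding_residues(predicted)
print(sites)
```

In this toy input, residues 10, 11, and 12 chain together even though residues 10 and 12 are farther apart than the cutoff, which is the defining behavior of single linkage. A real pipeline would likely also discard very small clusters, but any such threshold would be a further assumption.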

Keywords

Binding Site, Genetic Programming, Structure Clustering Algorithm, Protein-Peptide Binding Prediction
