Predicting the membrane proteins’ classification using multi-dimensional wavelet and random forest classifier

Publication year: 1400 (Solar Hijri)
Document type: Conference paper
Language: English

The full text of this paper has not been provided and is not available.

National scientific document ID:

IBIS10_036

Indexing date: 5 Tir 1401

Abstract:

Given the difficulty and complexity of experimental methods for determining protein structure and function, computational techniques have recently found their place in protein function prediction. Although various techniques based on machine learning have been introduced, none combines multidimensional discrete wavelet transform (DWT) analysis with machine learning. In this study, we devised a handy, accurate, and time-efficient predictive model that classifies membrane proteins into five classes (single-pass type 1, single-pass type 2, multi-pass, lipid-chain, and GPI membrane proteins) based on DWT analysis and a machine learning approach.

We applied the proposed method to Chou's membrane protein datasets, which contain 2059 and 2625 membrane protein sequences from five different classes. Most earlier studies used these same datasets in their entirety. In this technique, protein sequences were first transformed into six-dimensional signals comprising hydropathy scale, polarity, secondary structure, molecular volume, codon diversity, and electrostatic charge indexes. These six-dimensional signals were then used as input to the multidimensional discrete wavelet transform so that the entire signal could be analyzed. Feature vectors were then generated for each protein from the approximation and detail coefficients according to suitable criteria. Finally, the feature vectors were fed to a random forest classifier, chosen to avoid overfitting and to take advantage of its variable-importance measures.

As a result, we obtained accuracies of 91.7% and 89.6% on the independent dataset and in the jackknife test, respectively. These results indicate that the proposed method yields improved classification performance.
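
As a rough illustration of the feature-extraction idea in the abstract, the sketch below encodes a sequence as a multi-channel signal, decomposes each channel with a wavelet transform, and summarizes the sub-bands as a feature vector. It is not the authors' implementation and rests on several assumptions: only two property channels are used (Kyte-Doolittle hydropathy and a simplified formal charge) instead of the six indexes named in the abstract; a plain Haar wavelet is applied to each channel separately in place of the multidimensional DWT; and the mean and standard deviation of each sub-band stand in for the unspecified feature criteria. The function names `encode`, `haar_step`, `haar_dwt`, and `features` are hypothetical.

```python
import math
from statistics import mean, pstdev

# Kyte-Doolittle hydropathy index (one of many possible property scales).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Simplified formal side-chain charge (assumption: all other residues are neutral).
CHARGE = {"D": -1.0, "E": -1.0, "K": 1.0, "R": 1.0}


def encode(seq):
    """Turn a protein sequence into a list of numeric channels (one per property)."""
    hydropathy = [KD[aa] for aa in seq]
    charge = [CHARGE.get(aa, 0.0) for aa in seq]
    return [hydropathy, charge]


def haar_step(x):
    """One Haar DWT level: return (approximation, detail) coefficients."""
    if len(x) % 2:  # pad odd-length signals by repeating the last sample
        x = x + [x[-1]]
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_dwt(x, levels):
    """Multi-level Haar DWT; returns sub-bands [d1, d2, ..., dL, aL]."""
    bands = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)
    bands.append(approx)
    return bands


def features(seq, levels=3):
    """Concatenate mean and standard deviation of every sub-band of every channel."""
    vec = []
    for channel in encode(seq):
        for band in haar_dwt(channel, levels):
            vec.extend([mean(band), pstdev(band)])
    return vec


# Arbitrary example sequence, not taken from the datasets in the abstract.
vec = features("MKTLLLTLVVVTIVCLDLGYT", levels=3)
print(len(vec))  # 2 channels x 4 sub-bands x 2 statistics = 16 values
```

Fixed-length vectors of this kind could then be passed to a random forest implementation, for example scikit-learn's `RandomForestClassifier`, whose `feature_importances_` attribute exposes the kind of variable-importance measure the abstract mentions.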

Authors

Parham Hajishafizahramini

Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Jalal AleAhmad, Nasr, Tehran, Iran

Parviz Abdolmaleki

Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Jalal AleAhmad, Nasr, Tehran, Iran