De novo design of Antibacterial peptides by ensemble machine learning methods

Publication year: 1399 (Solar Hijri)
Document type: Conference paper
Language: English

The full text of this paper has not been provided and is not available.

National scientific document ID:

BIOCONF21_0696

Indexing date: 7 Shahrivar 1400 (Solar Hijri)

Abstract:

Antibiotic resistance is a major challenge. Because antimicrobial peptides act directly on the microbial membrane and normally do not have specific protein targets, bacteria are less likely to develop resistance against these molecules. Recently, statistical analysis and machine learning algorithms have been considered for this task. Ensemble learning techniques combine several models to provide an optimal model for predicting or classifying data. The most widely used algorithms are Bagging, AdaBoost, and RandomForest with several estimators. In this study, to predict peptides with specific antibacterial effects, data were gathered from DRAMP 2.0, and exploratory data analysis (EDA) was performed with the Seaborn, NumPy, and Pandas packages in Python. The dataset comprised 554 peptides with antibacterial function and 626 without it. Descriptors were defined based on biophysical features: length, molecular weight, charge, charge density, pI, instability index, aromaticity, aliphatic index, Boman index, and hydrophobic ratio. Modeling was performed using an SVM algorithm with linear, polynomial (degree=5), and RBF (gamma=3) kernel functions, the RandomForest algorithm, a Bagging classifier, and AdaBoost, with 100 and 1000 estimators. The RandomForest model with 1000 estimators achieved an accuracy of 87% and a precision of 90%, making it the best of the methods compared. The average of accuracy and precision for the SVM method with the mentioned kernels, Bagging, and AdaBoost was 78%, 87%, and 86%, respectively. For the data and features of this study, the ensemble techniques gave better results than the SVM method, owing to the way the training data are used: the data are randomly segmented and used several times to train the model. Despite advances in computational methods for drug design and therapeutic peptides, laboratory methods are still needed for more accurate evaluation, which is one of the next steps in this research.
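
The abstract lists sequence-derived descriptors but does not say how they were computed. As a rough illustration only, the sketch below derives a few of them (length, net charge, aromaticity, hydrophobic ratio) from a peptide sequence in plain Python. The function name `simple_descriptors`, the residue sets, the integer charge rule (K and R as +1, D and E as -1), and the `charge_per_residue` stand-in for charge density are all assumptions made for this example and may differ from the definitions used in the study.

```python
# Illustrative only: residue sets and the charge rule are assumptions,
# not the descriptor definitions used by the authors.
AROMATIC = set("FWY")          # aromaticity as the fraction of F, W, Y
HYDROPHOBIC = set("AILMFWV")   # assumed hydrophobic residue set
POSITIVE = set("KR")           # counted as +1 each (assumption)
NEGATIVE = set("DE")           # counted as -1 each (assumption)


def simple_descriptors(sequence):
    """Return a dict of simple descriptors for a one-letter peptide sequence."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence must not be empty")

    length = len(seq)
    net_charge = sum(aa in POSITIVE for aa in seq) - sum(aa in NEGATIVE for aa in seq)

    return {
        "length": length,
        "net_charge": net_charge,
        # Hypothetical stand-in for "charge density": net charge per residue.
        "charge_per_residue": net_charge / length,
        "aromaticity": sum(aa in AROMATIC for aa in seq) / length,
        "hydrophobic_ratio": sum(aa in HYDROPHOBIC for aa in seq) / length,
    }


if __name__ == "__main__":
    # "KKLLFW" is a made-up sequence used only to exercise the function.
    for name, value in simple_descriptors("KKLLFW").items():
        print(name, value)
```

Each peptide then becomes one row of numeric features, which is the form that classifiers such as SVM or RandomForest expect. Descriptors such as pI, instability index, aliphatic index, and Boman index need residue-level lookup tables and are left out of this sketch.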

Authors

Fatemeh Ebrahimi Tarki

Computational biology laboratory, Biotechnology department, Faculty of biological science, Alzahra University

Somayyeh Dabbagh Sadeghpour

Computational biology laboratory, Biotechnology department, Faculty of biological science, Alzahra University

Mahboobeh Zarrabi

Computational biology laboratory, Biotechnology department, Faculty of biological science, Alzahra University