Ensemble-based variational autoencoders for detecting protein complexes in protein-protein interaction networks

Year of publication: 1402
Document type: Conference paper
Language: English
The full text of this paper has not been provided and is not available.

National scientific document ID: IBIS12_079

Indexing date: 12 Aban 1403

Abstract:

Protein-protein interaction (PPI) networks are composed of multiple protein complexes, which play essential roles in many biological functions and in identifying different forms of a disease. Each protein complex is a group of proteins that interact with one another. Because of the limitations of experimental methods, computational approaches are now used to identify these complexes. However, measurement errors produce noisy and uncertain interactions, which makes it difficult to obtain reliable clusters. To address this challenge, this study proposes a new method based on ensemble variational autoencoders, named EVA, which combines deep embedding with consensus clustering to deal with the uncertainty. A variational autoencoder makes it possible to filter the noise by creating meaningful representations of the proteins and extracting important features of co-complex proteins. In addition, the ensemble learning approach integrates multiple deep models to seek better embeddings of the proteins, leading to higher-quality clustering of PPI networks.

First, a similarity matrix is generated using the second-order proximity of protein pairs. Then, several variational autoencoders are trained to embed the data points into a low-dimensional feature space. Next, the resulting representations from each network are extracted and clustered independently. Finally, the base clusterings are combined to obtain robust and reliable protein complexes. The proposed method was evaluated on four real PPI network datasets of differing density and dimensions: Krogan-core, Krogan-extended, Collins, and Gavin. According to the F-score and MCC (Matthews correlation coefficient) evaluation metrics, the proposed method achieved significantly better results than recent clustering methods for protein interaction networks.
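
The first and last steps of the pipeline described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give EVA's exact formulas, so everything here is an assumption: second-order proximity is taken to be the cosine similarity between the neighbour vectors of two proteins, and the consensus step is a simple co-association matrix thresholded at 0.5, followed by connected components. The function names (`second_order_proximity`, `co_association`, `consensus_clusters`) and the toy network are hypothetical. The variational autoencoder training and the per-model clustering are omitted; the base clusterings are supplied by hand.

```python
from math import sqrt


def second_order_proximity(adj):
    """Cosine similarity between the neighbour vectors of every protein pair.

    Assumption: this is one common definition of second-order proximity;
    the abstract does not state which definition EVA uses.
    """
    n = len(adj)
    norms = [sqrt(sum(v * v for v in row)) for row in adj]
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if norms[i] and norms[j]:
                dot = sum(a * b for a, b in zip(adj[i], adj[j]))
                sim[i][j] = dot / (norms[i] * norms[j])
    return sim


def co_association(base_clusterings, n):
    """Fraction of base clusterings that place proteins i and j together."""
    counts = [[0] * n for _ in range(n)]
    for labels in base_clusterings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(base_clusterings)
    return [[c / m for c in row] for row in counts]


def consensus_clusters(co, threshold=0.5):
    """Connected components of the graph linking pairs with co > threshold."""
    n = len(co)
    seen, clusters = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            u = stack.pop()
            component.append(u)
            for v in range(n):
                if v not in seen and co[u][v] > threshold:
                    seen.add(v)
                    stack.append(v)
        clusters.append(sorted(component))
    return clusters


# Toy PPI network (hypothetical): two triangles joined by the edge 2-3.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
sim = second_order_proximity(adj)  # would be the input to each autoencoder

# Hand-made stand-ins for the clusterings of three trained models.
# The second one misplaces protein 3; the third uses permuted labels.
base = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [1, 1, 1, 0, 0, 0],
]
clusters = consensus_clusters(co_association(base, 6))
print(clusters)  # [[0, 1, 2], [3, 4, 5]]
```

In this toy case the single misassignment of protein 3 is outvoted by the other two clusterings, which is the kind of robustness to noisy base results that consensus clustering is meant to provide.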

Authors

Malihe Danesh

Department of Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran

Parisa Rostami

Department of Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran