Cancer Prediction through Advanced Gene Expression Analysis Using Machine Learning Method

  • Publication year: 1402 (Solar Hijri)
  • Venue: 12th National and 3rd International Conference on Bioinformatics
  • COI code: IBIS12_161
  • Article language: English
  • Views: 34

Authors

Zahra Gfarokhi

Electrical and Computer Engineering Department, University of Science and Technology of Mazandaran, Behshahr, Iran

Seydeh Zahra Ahmadi

Electrical and Computer Engineering Department, University of Science and Technology of Mazandaran, Behshahr, Iran

Jamshid Pirgazi

Electrical and Computer Engineering Department, University of Science and Technology of Mazandaran, Behshahr, Iran

Abstract

Cancer is a term used to describe a group of diseases in which cells grow abnormally and can spread to different parts of the body. According to the World Health Organization (WHO), cancer is the second most common cause of death globally. Detecting cancer is challenging because early symptoms can be subtle or variable, so better tools are needed for accurate diagnosis. One approach to cancer diagnosis is based on RNA-sequencing gene expression; however, understanding gene expression in cancer is itself challenging, since it involves decoding the complex molecular details of how cancer develops and requires advanced techniques [1]. This study investigates gene expression as a crucial path for identifying cancer biomarkers, drawing on sophisticated molecular techniques such as DNA microarrays and RNA sequencing. We worked on the gene expression RNA-Seq dataset from the UCI repository [2], which includes five distinct cancer types: LUAD, BRCA, KIRC, LUSC, and COAD. To reduce the high dimensionality of the input dataset, we applied a feature selection technique, mutual information, together with Linear Discriminant Analysis (LDA) to extract distinctive features from the RNA sequencing data. These features then serve as inputs to several machine learning classifiers, including decision trees, k-nearest neighbors (KNN), support vector machines (SVM), and naive Bayes, to discriminate and classify specific cancer types. Notably, the proposed classifier achieves an accuracy of 99.89%, outperforming existing approaches [3]. This underscores its potential as a robust tool for cancer classification and its contribution to cancer studies, emphasizing the crucial role of gene expression analysis and the integration of new machine learning techniques.
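
The pipeline outlined in the abstract (mutual-information feature selection, LDA, then several classifiers) could be sketched roughly as follows with scikit-learn. This is an illustrative sketch only, not the authors' implementation. The synthetic data stands in for the UCI RNA-Seq matrix, and the number of selected features (`k=50`), the default classifier settings, and the 5-fold cross-validation are assumptions that the abstract does not specify.

```python
from functools import partial

from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in (assumption) for a samples-by-genes expression matrix
# with five classes, mirroring the five cancer types named in the abstract.
X, y = make_classification(
    n_samples=300, n_features=100, n_informative=20, n_redundant=0,
    n_classes=5, n_clusters_per_class=1, class_sep=2.0, random_state=0,
)


def build_pipeline(classifier, k=50):
    """Mutual-information selection -> LDA projection -> classifier.

    k is a hypothetical choice; the abstract does not state how many
    features were kept.
    """
    score_func = partial(mutual_info_classif, random_state=0)
    return make_pipeline(
        SelectKBest(score_func=score_func, k=k),  # keep the k highest-MI features
        LinearDiscriminantAnalysis(),             # project to at most n_classes - 1 axes
        classifier,
    )


# The four classifier families listed in the abstract, with default settings.
classifiers = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "knn": KNeighborsClassifier(),
    "svm": SVC(),
    "naive_bayes": GaussianNB(),
}

# Mean 5-fold cross-validated accuracy per classifier. Selection and LDA are
# refit inside each fold, so the held-out fold never influences them.
results = {
    name: cross_val_score(build_pipeline(clf), X, y, cv=5).mean()
    for name, clf in classifiers.items()
}

for name, accuracy in results.items():
    print(f"{name}: {accuracy:.3f}")
```

Wrapping the selection and LDA steps in a single pipeline means they are fitted only on the training folds, which avoids optimistic accuracy estimates from information leakage. The accuracies printed for the synthetic data say nothing about the 99.89% reported in the abstract.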

Keywords

Cancer diagnosis, Gene expression, RNA-Sequence, Feature selection, Classification
