Application of machine learning in building a diagnostic model for gastric cancer based on a survival-related competitive endogenous RNA (ceRNA) network
Published in: First International Congress on Artificial Intelligence in Medical Sciences
Publication year: 1402 (Solar Hijri calendar)
Document type: Conference paper
Language: English
The full text of this paper has not been provided and is not available.
National scientific document ID:
AIMS01_029
Indexing date: 1 Mordad 1402 (Solar Hijri calendar)
Abstract:
Background and aims: Gastric cancer (GC) is a highly aggressive malignancy whose development is influenced by both environmental and genetic factors. Among the genetic factors, competitive endogenous RNAs (ceRNAs) have been identified as affecting cancer development. The aim of this study was to find diagnostic biomarkers for GC based on a ceRNA network by utilizing machine learning approaches.

Methods: RNA-seq and clinical data of 335 GC tumor and 30 non-tumor samples were downloaded using the TCGAbiolinks R package. Differentially expressed long non-coding RNAs (lncRNAs; DELs), miRNAs (DEmiRs), and mRNAs (DEMs) were extracted with the R package DESeq2 based on |log2 fold change| > 1 and adjusted p < 0.05. Using univariate Cox regression, survival-related DELs, DEmiRs, and DEMs were detected with a threshold of p < 0.05. The multiMiR R package and DIANA-LncBase v3.0 were used to predict miRNA-mRNA and miRNA-lncRNA interactions. A lncRNA-miRNA-mRNA ceRNA network was then constructed. Machine learning analyses were conducted using the lncRNAs of the network. First, the data were split into training and test sets at a ratio of 0.7 to 0.3, and the samples in the training group were then resampled using the SMOTETomek method. Recursive Feature Elimination (RFE) was used as the feature selection technique, and the selected features were used to build a diagnostic model with the support vector machine (SVM) algorithm.

Results: In total, 3947 DELs, 266 DEmiRs, and 4388 DEMs were detected in the differential expression analysis between tumor and non-tumor GC samples, among which 187 DELs, 24 DEmiRs, and 524 DEMs were associated with the overall survival of GC patients. By integrating the interactions through their common miRNAs, we constructed a ceRNA network consisting of 12 DELs, 11 DEmiRs, and 70 mRNAs. After balancing the training cohort and applying RFE, four lncRNAs (ENSG00000213279, ENSG00000248103, ENSG00000249001 and ENSG00000262061) were selected as the final diagnostic signature. An SVM diagnostic model was then constructed, achieving an area under the curve (AUC) of 0.98 in the test group.

Conclusion: In this study, using ceRNA network construction and machine learning analysis, we identified four diagnostic lncRNAs for GC patients that were also survival-related. Since machine learning approaches are powerful methods for identifying candidate biomarkers, our future efforts will focus on the experimental and clinical validation of these biomarkers.
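As an illustration of the network-construction step described in the Methods, the following is a minimal Python sketch, not the authors' code (the abstract reports R packages for these steps). It applies the stated differential expression thresholds and then joins lncRNA-miRNA and miRNA-mRNA interactions on their shared miRNA. All names (`is_differentially_expressed`, `build_cerna_triples`) and the toy data are hypothetical, and the survival filter by univariate Cox regression is omitted for brevity.

```python
from collections import defaultdict


def is_differentially_expressed(log2_fc, adj_p, fc_cutoff=1.0, p_cutoff=0.05):
    """Thresholds quoted in the abstract: |log2 fold change| > 1 and adjusted p < 0.05."""
    return abs(log2_fc) > fc_cutoff and adj_p < p_cutoff


def build_cerna_triples(lnc_mirna_pairs, mirna_mrna_pairs):
    """Join lncRNA-miRNA and miRNA-mRNA interactions on the shared miRNA.

    Returns a sorted list of (lncRNA, miRNA, mRNA) triples.
    """
    lncs_by_mirna = defaultdict(set)
    for lnc, mirna in lnc_mirna_pairs:
        lncs_by_mirna[mirna].add(lnc)

    triples = set()
    for mirna, mrna in mirna_mrna_pairs:
        for lnc in lncs_by_mirna.get(mirna, ()):
            triples.add((lnc, mirna, mrna))
    return sorted(triples)


# Hypothetical toy statistics: name -> (log2 fold change, adjusted p-value).
de_stats = {
    "lncA": (2.3, 0.001),
    "lncB": (0.4, 0.020),    # fails the fold-change cutoff
    "miR-1": (-1.8, 0.010),
    "miR-2": (1.2, 0.200),   # fails the adjusted p-value cutoff
    "geneX": (1.5, 0.030),
    "geneY": (-2.0, 0.004),
}
kept = {name for name, (fc, p) in de_stats.items()
        if is_differentially_expressed(fc, p)}

# Hypothetical predicted interactions (stand-ins for database predictions).
lnc_mirna = [("lncA", "miR-1"), ("lncB", "miR-1"), ("lncA", "miR-2")]
mirna_mrna = [("miR-1", "geneX"), ("miR-1", "geneY"), ("miR-2", "geneX")]

# Keep only interactions whose two endpoints both passed the filter.
lnc_mirna_kept = [(l, m) for l, m in lnc_mirna if l in kept and m in kept]
mirna_mrna_kept = [(m, g) for m, g in mirna_mrna if m in kept and g in kept]

network = build_cerna_triples(lnc_mirna_kept, mirna_mrna_kept)
print(network)
```

In this toy example only `lncA`, `miR-1`, `geneX` and `geneY` pass the filter, so the network reduces to the two triples that share `miR-1`.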
Authors
Maryam Hosseini
Department of Genetics and Molecular Biology, Faculty of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
Basireh Bahrami
Department of Genetics and Molecular Biology, Faculty of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
Parvaneh Nikpour
Department of Genetics and Molecular Biology, Faculty of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran