Comparison of machine learning algorithms in the diagnosis of pancreatic ductal adenocarcinoma through urinary biomarkers
Published in: First International Congress on Artificial Intelligence in Medical Sciences
Year of publication: 1402 (Solar Hijri)
Document type: Conference paper
Language: English
The full text of this paper has not been provided and is not available.
National scientific document ID:
AIMS01_082
Indexing date: 1 Mordad 1402
Abstract:
Background and aims: Rapid diagnosis and prevention of cancer is one of the goals of modern medicine, and biomarkers together with artificial intelligence now bring it within reach. Pancreatic ductal adenocarcinoma is one of the cancers for which early detection can reduce the mortality rate. Studies in this field have found that the urinary levels of the biomarkers TFF1, REG1B and LYVE1 change in this disease. With artificial intelligence and machine learning, biomarker information can now be used to diagnose the cancer faster.
Method: The data used in this study are urinary biomarker measurements from three groups: healthy people, people with pancreatic disease, and people with pancreatic ductal adenocarcinoma. The data were collected by Debernardi et al. and are available on the Kaggle website. The Python programming language and machine learning algorithms such as SVM, MLP, logistic regression, random forest, etc. were used to model these data for disease diagnosis. Accuracy and the kappa score were used to evaluate the algorithms, and cross-validation was also performed; all of these are available in the sklearn package, one of the Python libraries.
Results: According to the initial data analysis, the dataset included 590 samples with information on plasma_CA19_9, creatinine, LYVE1, REG1B, TFF1, REG1A, etc. Some columns of the dataset had missing values, and two strategies were used to deal with this problem: the first removes the missing data and then generates data to balance the classes with the oversampling method; the second fills the missing data by linear interpolation. Modeling was carried out under both strategies to ensure accuracy. Features were then selected with correlation and (SFS) wrapper methods, and it was found that REG1B and LYVE1 have the greatest impact on the correct diagnosis of cancer, but all the biomarkers in the dataset were given to the machine learning algorithms. After modeling with the various algorithms, the highest accuracy under the two strategies was, respectively, random forest with a cross-validation accuracy of 96% and MLP with a cross-validation accuracy of 73%.
Conclusion: Rapid diagnosis of cancer can help prevent much of its damage and reduce the death rate, and it is made possible by biomarkers and advances in artificial intelligence and machine learning. This study showed that cancer can be diagnosed with high accuracy with the help of biomarkers and machine learning. The highest accuracy recorded in this study was 96% in cross-validation, obtained with the random forest algorithm. However, because the number of samples is low, the results of this study cannot be trusted, and more studies are needed for greater confidence. It is hoped that this disease will be controlled in the best way in the future.
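The two missing-data strategies and the cross-validated accuracy and kappa evaluation described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the table is synthetic (the function `make_synthetic_biomarkers` and all values in it are hypothetical stand-ins for the Kaggle dataset), the helper names are invented for this sketch, the random forest settings are arbitrary, and the oversampling and feature-selection steps are omitted.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import cohen_kappa_score, make_scorer
from sklearn.model_selection import StratifiedKFold, cross_validate


def make_synthetic_biomarkers(n=300, seed=0):
    """Hypothetical stand-in for the biomarker table (synthetic, not real data)."""
    rng = np.random.default_rng(seed)
    # Three classes, mirroring the abstract: 0 healthy, 1 pancreatic disease, 2 PDAC.
    y = rng.integers(0, 3, size=n)
    cols = ["creatinine", "LYVE1", "REG1B", "TFF1"]
    X = pd.DataFrame(rng.lognormal(mean=0.0, sigma=1.0, size=(n, 4)), columns=cols)
    # Inject an artificial class signal so the classifier has something to learn.
    X["LYVE1"] += y
    X["REG1B"] += 2 * y
    # Blank out roughly 10% of one column to imitate missing measurements.
    X.loc[rng.random(n) < 0.1, "TFF1"] = np.nan
    return X, pd.Series(y, name="diagnosis")


def drop_missing(X, y):
    """Strategy 1 (without the oversampling step): drop rows with any missing value."""
    keep = X.notna().all(axis=1)
    return X[keep], y[keep]


def interpolate_missing(X, y):
    """Strategy 2: fill missing values by linear interpolation along each column."""
    return X.interpolate(method="linear", limit_direction="both"), y


def evaluate(X, y, seed=0):
    """Mean 5-fold cross-validated accuracy and Cohen's kappa for a random forest."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    scoring = {"accuracy": "accuracy", "kappa": make_scorer(cohen_kappa_score)}
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    return {
        "accuracy": float(scores["test_accuracy"].mean()),
        "kappa": float(scores["test_kappa"].mean()),
    }


if __name__ == "__main__":
    X, y = make_synthetic_biomarkers()
    for name, strategy in [("drop", drop_missing), ("interpolate", interpolate_missing)]:
        X_clean, y_clean = strategy(X, y)
        print(name, evaluate(X_clean, y_clean))
```

If oversampling is added to the first strategy, applying it inside each training fold rather than before the split would avoid duplicated samples leaking into the validation folds; the abstract does not say which order was used.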
Authors
E Saeedi tazekad
Sarab University of Medical Sciences, Iran
S Ebrahimi
Mazandaran University of Medical Sciences, Iran