Comparing classification algorithms of data mining in diagnosis of diabetes and assessing the effectiveness of k-fold cross validation in the accuracy of the constructed model
Published in: International Conference on Engineering and Computer Science
Year of publication: 1395 (Solar Hijri)
Document type: Conference paper
Language: English
The full text of this paper is available as a 6-page PDF file.
National scientific document ID:
ICCSE01_030
Indexing date: 14 Shahrivar 1396 (Solar Hijri)
Abstract:
One application of data mining is in medicine, where models are constructed for disease diagnosis. The more a model learns from previous data, the more accurately it performs. The essential issue is that the training and testing data for classification must be selected so that the model learns as efficiently as possible from previous data and achieves the highest accuracy in diagnosing the disease. In this study, the Pima diabetes dataset is used; models for predicting and diagnosing diabetes are developed based on the KNN, SVM, Naive Bayes, and Decision Tree classification methods, and the accuracy of each model is evaluated. The effect of k-fold cross-validation on the accuracy of each model is also assessed. According to the findings, k-fold cross-validation increases model accuracy, and no single classification technique always has the best performance and accuracy; rather, this depends on the nature and complexity of the dataset. The simulation is carried out with the RapidMiner tool.
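The k-fold procedure mentioned in the abstract can be illustrated with a short sketch. The paper's experiments were run in RapidMiner, so the code below is not the authors' implementation; it is a minimal, standard-library Python illustration under stated assumptions. The function names (`k_fold_indices`, `knn_predict`, `cross_val_accuracy`, `make_toy_data`) are hypothetical, the classifier is a plain k-nearest-neighbours vote, and the data are synthetic two-class points, not the Pima dataset.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def knn_predict(train_X, train_y, x, n_neighbors=3):
    """Predict the label of x by majority vote among its nearest training points."""
    by_distance = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    votes = Counter(train_y[i] for i in by_distance[:n_neighbors])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, k=5, n_neighbors=3, seed=0):
    """Return the per-fold accuracies of a k-NN classifier under k-fold CV."""
    folds = k_fold_indices(len(X), k, seed)
    accuracies = []
    for held_out in folds:
        # Each fold serves once as the test set; the rest is training data.
        held_out_set = set(held_out)
        train_idx = [i for i in range(len(X)) if i not in held_out_set]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            knn_predict(train_X, train_y, X[i], n_neighbors) == y[i]
            for i in held_out
        )
        accuracies.append(correct / len(held_out))
    return accuracies


def make_toy_data(n_per_class=30, seed=1):
    """Synthetic two-class data (NOT the Pima dataset): two Gaussian blobs."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            X.append((rng.gauss(center, 1.0), rng.gauss(center, 1.0)))
            y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    scores = cross_val_accuracy(X, y, k=5)
    print("per-fold accuracy:", scores)
    print("mean accuracy:", sum(scores) / len(scores))
```

Averaging the per-fold accuracies gives a single estimate in which every sample has been used for testing exactly once, which is the property that distinguishes k-fold cross-validation from a single train/test split.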
Authors
Nasim Nikbakhsh
Department of Computer, Isfahan (Khorasgan) Branch, Islamic Azad University, Isfahan, Iran
GholamReza Dehghani
Department of Computer, Isfahan (Khorasgan) Branch, Islamic Azad University, Isfahan, Iran
Dr. Farsad Zamani
Department of Computer, Isfahan (Khorasgan) Branch, Islamic Azad University, Isfahan, Iran