CIVILICA We Respect the Science
(Specialized publisher of national conferences / Publication license number from the Ministry of Culture and Islamic Guidance: 8971)

An Enhanced SMOTE Algorithm Using Entropy and Clustering for Imbalanced Accident Data

Paper title: An Enhanced SMOTE Algorithm Using Entropy and Clustering for Imbalanced Accident Data
National paper ID: CITCONF02_513
Published in the Second National Conference on Applied Research in Computer Science and Information Technology, 1393 (Solar Hijri)
Authors:

Sima Sharifirad - Master's student in computer science, AmirKabir University
Azra Nazari - Master's graduate student in computer science, AmirKabir University
Mahdi Ghatee - Assistant professor of computer science, AmirKabir University

Abstract:
Over the course of the century, many real-world applications involving imbalanced data have emerged. One such application, considered here for the first time in this context, is imbalanced accident data. In this paper, transportation and accident data from the Tehran-Bazargan highway between 2010 and 2015 are considered. In the pre-processing step, SMOTE is considered one of the most important over-sampling techniques for effectively balancing imbalanced data. However, it introduces noise and other problems, and there is a strong need to improve the method. To address these problems, this study proposes several techniques, such as a combination of dynamic selection, attribute weighting, and distance weighting, along with a mixture of classification and clustering techniques. The performance of the proposed algorithm is measured by F-measure and ROC curve, and the results are compared with Weka's SMOTE using different algorithms.
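
For readers unfamiliar with the baseline, the following is a minimal sketch of plain SMOTE interpolation together with the F-measure used for evaluation. It is not the enhanced algorithm of the paper: the abstract does not specify how the entropy, clustering, dynamic selection, or weighting steps work, so none of them are reproduced here. The function names `smote` and `f_measure`, the parameters `n_new`, `k`, and `seed`, and the toy points in the usage note are hypothetical and chosen only for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points by plain SMOTE interpolation.

    Illustrative baseline only (assumption): no entropy, clustering, or
    weighting step from the paper is implemented here.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point, excluding itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position on the segment from base to neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (F1) from confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

As a usage example under the same assumptions, `smote([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)], n_new=10, k=2)` returns ten points, each lying on a segment between two of the given minority points. Because every synthetic point is an interpolation between existing samples, a noisy minority sample spreads its noise to its neighbourhood, which is one way to read the abstract's remark that SMOTE "brings noise".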

Keywords:
imbalanced data, SMOTE, f-measure, ROC curve

Paper page and full-text download: https://civilica.com/doc/455383/