Providing an efficient method based on machine learning for classifying imbalanced datasets

Publication year: 1397 (Solar Hijri)
Document type: Conference paper
Language: English

The full text of this paper is available as an 8-page PDF.

National scientific document ID:

KTCONG01_002

Indexing date: 21 Khordad 1398

Abstract:

One of the most important issues in data mining is the classification of imbalanced datasets. In many supervised learning applications, the prior probabilities of the classes differ significantly; that is, an example is far more likely to belong to some classes of the classification problem than to others. This situation is known as the class imbalance problem (Chawla et al., 2004). Although existing knowledge discovery and data engineering techniques have shown great success in many real-world applications, learning from imbalanced data (the imbalanced learning problem) is a relatively new challenge that has attracted growing attention from both academia and industry. The imbalanced learning problem concerns the performance of learning algorithms in the presence of underrepresented data and severe class distribution skews (Garcia et al., 2009). The term imbalanced dataset generally refers to a dataset in which the number of instances differs greatly between classes (Wang and Yao, 2009). Traditional classification methods aim to minimize overall error and generally assume a balanced class distribution, so they perform poorly on imbalanced data. This makes imbalanced classification an important and challenging problem. In this work, the data are classified with the Bagging algorithm, using the C4.5 Cost-Sensitive Random Tree as the base classifier. The imperialist competitive algorithm is used to determine the misclassification cost of each class in order to construct the cost-sensitive tree.
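
The idea of bagging cost-sensitive base classifiers can be illustrated with a minimal Python sketch. It is not the paper's implementation: a one-dimensional threshold stump stands in for the C4.5 Cost-Sensitive Random Tree, the misclassification costs are fixed by hand rather than searched for by the imperialist competitive algorithm, and the toy data and all names (fit_stump, fit_bagging, predict, recall, NEG, POS) are hypothetical.

```python
import math
import random

# Toy imbalanced data (assumption, not from the paper): 14 majority (0) vs 3 minority (1).
NEG = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 12, 13, 13.5]
POS = [10.5, 12.5, 14]
XS = NEG + POS
YS = [0] * len(NEG) + [1] * len(POS)


def fit_stump(xs, ys, cost_fn, cost_fp):
    """Pick the threshold t for the rule "predict 1 if x >= t" that
    minimizes cost_fn * false_negatives + cost_fp * false_positives."""
    candidates = sorted(set(xs)) + [math.inf]  # inf means "always predict 0"

    def total_cost(t):
        fn = sum(1 for x, y in zip(xs, ys) if y == 1 and x < t)
        fp = sum(1 for x, y in zip(xs, ys) if y == 0 and x >= t)
        return cost_fn * fn + cost_fp * fp

    # min() returns the first (smallest) threshold among ties.
    return min(candidates, key=total_cost)


def fit_bagging(xs, ys, cost_fn, cost_fp, n_estimators=25, seed=0):
    """Train one cost-sensitive stump per bootstrap sample."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        sample_x = [xs[i] for i in idx]
        sample_y = [ys[i] for i in idx]
        thresholds.append(fit_stump(sample_x, sample_y, cost_fn, cost_fp))
    return thresholds


def predict(thresholds, x):
    """Majority vote over the stumps."""
    votes = sum(1 for t in thresholds if x >= t)
    return 1 if 2 * votes > len(thresholds) else 0


def recall(thresholds, xs, ys):
    """Fraction of minority examples the ensemble labels as 1."""
    positives = [x for x, y in zip(xs, ys) if y == 1]
    return sum(predict(thresholds, x) for x in positives) / len(positives)


if __name__ == "__main__":
    plain = fit_bagging(XS, YS, cost_fn=1, cost_fp=1)      # minimizes plain error
    sensitive = fit_bagging(XS, YS, cost_fn=5, cost_fp=1)  # missing a minority case costs 5x
    print("minority recall, equal costs:   ", recall(plain, XS, YS))
    print("minority recall, cost-sensitive:", recall(sensitive, XS, YS))
```

On this toy data, a single stump trained on the full set moves its threshold from 14 to 10.5 when the false-negative cost is raised from 1 to 5, so it recovers all three minority examples at the price of four false positives. Because both ensembles use the same seed and therefore the same bootstrap samples, the cost-sensitive ensemble never has lower minority recall than the equal-cost one.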

Authors

Mostafa Boroumandzadeh

Department of Computer Engineering and Information Technology, Payame Noor University, IR. Iran