A Difficulty-aware Approach to Fair Classification on Imbalanced Datasets

Publication year: 1405 (Solar Hijri)
Document type: Journal article
Language: English

The full text of this article is available as an 18-page PDF.


National scientific document ID:

JR_CKE-9-1_001

Indexing date: 5 Khordad 1405

Abstract:

Class imbalance in real-world datasets often biases standard classifiers toward the majority class, degrading performance on the minority class. While existing methods such as sample re-weighting can mitigate this, they may increase overall misclassification errors or fail to consider the difficulty of training instances. To address these shortcomings, we introduce a difficulty-aware classification framework based on multi-objective evolutionary optimization. Our approach uses a specialized fitness function to simultaneously optimize minority-class recall and overall accuracy, guiding the selection of the most informative training samples. We quantify sample difficulty using a fuzzy approach, which then modulates class-specific weights to refine the classifier's decision boundary. Furthermore, we incorporate chaotic dynamic maps into the evolutionary operators to accelerate convergence and maintain population diversity. Evaluated on various UCI benchmark datasets with 10-fold cross-validation, our method improves minority-class performance on imbalanced data without compromising accuracy on balanced data. Comparative analysis using AUC, G-mean, and F-measure confirms that our approach achieves a superior trade-off between minority-class detection and overall accuracy compared to state-of-the-art methods.
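
As a rough illustration of the quantities the abstract names, the following Python sketch computes minority-class recall, overall accuracy, G-mean, and F-measure from binary predictions, and generates a sequence from the logistic map as one common example of a chaotic map. The function names, the choice of the logistic map, and the toy labels are assumptions made for illustration only; the abstract does not specify the paper's fitness function, fuzzy difficulty measure, or which chaotic map it uses.

```python
import math


def imbalance_metrics(y_true, y_pred, minority=1):
    """Return recall, accuracy, G-mean and F-measure for a binary problem.

    `minority` is the label treated as the positive (minority) class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == minority and p == minority)
    fn = sum(1 for t, p in pairs if t == minority and p != minority)
    fp = sum(1 for t, p in pairs if t != minority and p == minority)
    tn = sum(1 for t, p in pairs if t != minority and p != minority)

    recall = tp / (tp + fn) if (tp + fn) else 0.0        # minority-class recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0   # majority-class recall
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / len(pairs)
    g_mean = math.sqrt(recall * specificity)
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "recall": recall,
        "accuracy": accuracy,
        "g_mean": g_mean,
        "f_measure": f_measure,
    }


def logistic_map(x0, n, r=4.0):
    """Generate n values of the logistic map x_{k+1} = r * x_k * (1 - x_k).

    Assumption: shown only as a familiar chaotic map; the paper's actual
    map is not named in the abstract.
    """
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


# Toy example: 2 minority samples among 10 (hypothetical labels).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
metrics = imbalance_metrics(y_true, y_pred)
chaotic_values = logistic_map(0.3, 5)
```

In a multi-objective setting, a candidate solution would typically be scored by the pair (recall, accuracy) rather than by a single number, so that neither objective is silently traded away. On the toy labels above, accuracy is 0.8 while minority-class recall is only 0.5, which is the kind of gap that accuracy alone hides.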

Keywords:

Instance reduction, Fuzzy weighted average distance-based decision surface, Chaotic imperialist competitive algorithm, Reduction rate

Authors

Niloufar Kashefi

Computer Engineering, Faculty of Computer Engineering and Information Technology, Sadjad University, Mashhad, Iran

Javad Hamidzadeh

Computer Engineering, Faculty of Computer Engineering and Information Technology, Sadjad University, Mashhad, Iran

Mona Moradi

Computer Engineering, Faculty of Computer Engineering and Information Technology, Sadjad University, Mashhad, Iran
