NSO: Natural Selection Optimization for Adaptive k-Nearest Neighbor Imputation

Publication year: 1403 (Solar Hijri)
Document type: Conference paper
Language: English

The full text of this paper is available as a 7-page PDF.

National scientific document ID:

ICAIFT02_028

Indexing date: 6 Khordad 1404

Abstract:

This study introduces Natural Selection Optimization (NSO), a novel approach for optimizing parameters in k-Nearest Neighbor (kNN) imputation methods, addressing the critical challenge of missing data in machine learning tasks. While kNN imputation is effective, its performance depends heavily on parameter selection, particularly the number of neighbors (k) and the distance metric. Traditional optimization techniques often struggle with scalability and adaptability in high-dimensional datasets. NSO, inspired by biological evolution, treats the parameter space as an evolving ecosystem in which parameter combinations compete for survival based on imputation accuracy and the resulting classification performance. Over successive generations, NSO dynamically tunes parameters in real time, adapting to dataset characteristics. Our experimental results demonstrate NSO's superiority across diverse datasets. On the MIMIC-III medical dataset, NSO improved classification accuracy from 81% to 87% compared to traditional methods. On the LendingClub financial dataset, it raised the F1-score from 0.74 to 0.82. Notably, in high-dimensional spaces (10,000 dimensions), NSO maintained a robust F1-score of 0.81, significantly outperforming grid search, which achieved only 0.65. NSO showed particular strength in handling mixed data types and varying missingness rates (5% to 30%). Statistical tests confirmed the significance of these improvements (p < 0.01), with medium to large effect sizes (Cohen's d > 0.6, Cliff's delta > 0.4). This research not only enhances kNN imputation but also opens new avenues for nature-inspired algorithms in data preprocessing tasks. NSO's adaptive approach promises to improve the reliability of AI systems in data-sparse environments, with potential impact across healthcare, finance, and other critical domains.
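The abstract does not specify NSO's selection or mutation operators, so the following is only a minimal sketch of the general idea it describes: candidate (k, distance metric) pairs compete over generations, and the fittest survive. Everything here is an assumption made for illustration, including the function names `knn_impute`, `fitness`, and `evolve`, the truncation selection, the mutation rates, and the choice of Minkowski exponents 1 and 2 as the two metrics. Fitness is reduced to imputation error on deliberately hidden cells; the paper also scores downstream classification performance, which is omitted here.

```python
import math
import random

def knn_impute(rows, k, p):
    """Fill None cells with the mean of the k nearest rows (Minkowski exponent p)."""
    def dist(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # no feature in common, rows are not comparable
        # Average over shared features so rows with fewer observed values stay comparable.
        return (sum(abs(x - y) ** p for x, y in shared) / len(shared)) ** (1.0 / p)

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = sorted(
                (dist(row, other), other[j])
                for m, other in enumerate(rows)
                if m != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d < math.inf]
            if not nearest:  # fall back to the column mean of observed values
                nearest = [r[j] for r in rows if r[j] is not None]
            filled[i][j] = sum(nearest) / len(nearest) if nearest else 0.0
    return filled

def fitness(rows, candidate, holdout):
    """Negative RMSE on cells hidden on purpose (higher is better)."""
    k, p = candidate
    masked = [list(r) for r in rows]
    for i, j in holdout:
        masked[i][j] = None
    filled = knn_impute(masked, k, p)
    mse = sum((filled[i][j] - rows[i][j]) ** 2 for i, j in holdout) / len(holdout)
    return -math.sqrt(mse)

def evolve(rows, generations=10, pop_size=8, max_k=10, seed=0):
    """Hypothetical evolutionary loop: keep the fitter half, mutate it, repeat."""
    rng = random.Random(seed)
    observed = [(i, j) for i, r in enumerate(rows) for j, v in enumerate(r) if v is not None]
    holdout = rng.sample(observed, max(1, len(observed) // 10))
    metrics = (1, 2)  # assumed metric choices: Manhattan (p=1), Euclidean (p=2)
    population = [(rng.randint(1, max_k), rng.choice(metrics)) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda c: fitness(rows, c, holdout), reverse=True)
        survivors = ranked[: pop_size // 2]  # truncation selection
        children = []
        for k, p in survivors:
            k = min(max_k, max(1, k + rng.choice((-1, 0, 1))))  # small step in k
            if rng.random() < 0.2:  # occasional metric mutation
                p = rng.choice(metrics)
            children.append((k, p))
        population = survivors + children
    return max(population, key=lambda c: fitness(rows, c, holdout))

# Usage on synthetic data (not from the paper): three correlated columns, two gaps.
rng = random.Random(1)
data = [[x, 2 * x + rng.gauss(0, 0.1), -x] for x in (rng.uniform(0, 10) for _ in range(40))]
data[3][1] = None
data[7][2] = None
best_k, best_p = evolve(data)
completed = knn_impute(data, best_k, best_p)
```

Because survivors are carried over unchanged, the best candidate found so far is never lost between generations. A closer reproduction of the reported experiments would replace the fitness function with one that also trains and scores a classifier on the imputed data.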

Authors

AmirKeivan Shafiei

Department of Computer Engineering, University of Birjand, Birjand, Iran

Seyed Mojtaba Mosavi Nezhad

Civil Engineering Department, University of Birjand, Birjand, Iran