Feature selection via mixed-integer program and supervised infinite feature selection method
Published in: Journal of Mathematical Modeling, Volume 13, Issue 2
Publication year: 1404 (Solar Hijri)
Document type: Journal article
Language: English
The full text of this article is available as a 16-page PDF file.
National scientific document ID:
JR_JMMO-13-2_007
Indexing date: 13 Khordad 1404
Abstract:
Feature selection is an important step in data preprocessing that helps reduce the dimensionality of data and simplify models. It not only reduces the computational complexity of models but also improves their accuracy by eliminating irrelevant features and noise. The three most widely used approaches to feature selection are filter, wrapper, and embedded methods. In this paper, we first review some support vector machine based Mixed-Integer Linear Programming (MILP) models and the Supervised Infinite Feature Selection (Inf-FS_s) method. We then propose three hybrid approaches based on them. The first approach solves the linear relaxation of the underlying MILP model and then solves the MILP model restricted to the features with nonzero weights, namely a smaller MILP. The second approach first applies the Inf-FS_s method to rank the features. Then, depending on the feature costs, it either chooses the top-ranked features until the budget parameter is reached or solves a knapsack problem to select cost-effective features. The third approach applies the first approach to the top 20% of features ranked by the Inf-FS_s method. To evaluate the performance of the proposed approaches, experiments are conducted on four high-dimensional benchmark datasets with fixed and random feature costs. The results demonstrate that any of the proposed approaches can significantly reduce the running time of the MILP models while achieving accuracy comparable to that of the original MILP models.
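The selection step of the second approach can be illustrated with a minimal Python sketch. It is not the authors' implementation: the scores below stand in for a ranking such as the one Inf-FS_s would produce, the function names `select_top_ranked` and `select_by_knapsack` are hypothetical, and the knapsack objective (maximize the total score of the selected features, with non-negative integer costs) is an assumption, since the abstract does not state the exact formulation.

```python
from typing import List, Sequence


def select_top_ranked(scores: Sequence[float], budget: int) -> List[int]:
    """Equal-cost case: keep the `budget` highest-scoring features."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:budget])


def select_by_knapsack(
    scores: Sequence[float], costs: Sequence[int], budget: int
) -> List[int]:
    """Varying-cost case: 0/1 knapsack solved by dynamic programming.

    Assumption: costs are non-negative integers and the objective is the
    sum of the ranking scores of the selected features.
    """
    n = len(scores)
    # best[j][b] = best total score using the first j features with budget b
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for j in range(1, n + 1):
        cost, score = costs[j - 1], scores[j - 1]
        for b in range(budget + 1):
            best[j][b] = best[j - 1][b]  # skip feature j-1
            if cost <= b and best[j - 1][b - cost] + score > best[j][b]:
                best[j][b] = best[j - 1][b - cost] + score  # take feature j-1

    # Walk back through the table to recover the selected feature indices.
    chosen: List[int] = []
    b = budget
    for j in range(n, 0, -1):
        if best[j][b] != best[j - 1][b]:
            chosen.append(j - 1)
            b -= costs[j - 1]
    return sorted(chosen)


# Hypothetical ranking scores and costs for four features.
scores = [0.9, 0.1, 0.5, 0.7]
print(select_top_ranked(scores, budget=2))
print(select_by_knapsack(scores, costs=[4, 1, 2, 2], budget=4))
```

With equal costs the budget simply caps the number of features, so a sort suffices. With unequal costs the highest-scoring feature may consume the whole budget, which is why a knapsack formulation can select a different, more cost-effective subset.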
Authors
Mohammad Noroozi
Department of Applied Mathematics, Faculty of Mathematical Sciences, University of Guilan, Rasht, Iran
Maziar Salahi
Department of Applied Mathematics, Faculty of Mathematical Sciences, University of Guilan, Rasht, Iran
Sadegh Eskandari
Department of Computer Science, Faculty of Mathematical Sciences, University of Guilan, Rasht, Iran