Feature selection method based on clustering technique and optimization algorithm

  • Year of publication: 1403 (Solar Hijri calendar)
  • Published in: Journal of Nonlinear Analysis and Applications, Volume 15, Issue 9
  • COI (CIVILICA Object Identifier) code: JR_IJNAA-15-9_021
  • Article language: English

Authors

Sara Dehghani

Department of Computer Engineering, Yasuj Branch, Islamic Azad University, Yasuj, Iran

Razieh Mlekhosseini

Department of Computer Engineering, Yasuj Branch, Islamic Azad University, Yasuj, Iran

Karamollah Bagherifard

Department of Computer Engineering, Yasuj Branch, Islamic Azad University, Yasuj, Iran

S. Hadi Yaghoubian

Department of Computer Engineering, Yasuj Branch, Islamic Azad University, Yasuj, Iran

Abstract

High-dimensional datasets create opportunities but also pose substantial computational challenges. One problem is that many of their features are neither relevant nor necessary for discovering the knowledge hidden in the data, and such features can degrade the performance of a classification system. Feature selection is an important technique for overcoming this problem: it selects a subset of the original features by removing irrelevant and redundant ones. This article presents a hierarchical algorithm based on the coverage (wrapper) approach that selects effective features by exploiting relationships between features together with clustering techniques. The new method, named GCPSO, is built on an optimization algorithm and selects appropriate features through feature clustering. Its feature clustering differs from that of previous algorithms: instead of using traditional clustering models, it forms the final clusters from the graph structure of the features and the relationships between them. The proposed method is evaluated on datasets from the UCI repository, chosen for its breadth. Its efficiency is also compared with that of coverage-based feature selection methods that use evolutionary algorithms in the selection process. The results indicate that the proposed method performs well, both in choosing the optimal subset and in classification accuracy, on all datasets and in comparison with the other methods.
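The graph-based feature clustering described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' GCPSO implementation, and every specific in it is an assumption: features are linked when the absolute Pearson correlation reaches a hypothetical `threshold`, clusters are taken as the connected components of that graph, and the optimization-driven search is replaced by a much simpler greedy rule that keeps, from each cluster, the feature most correlated with the class label. The function names `pearson`, `cluster_features`, and `select_representatives` are illustrative only.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def cluster_features(columns, threshold=0.8):
    """Group feature indices into connected components of a similarity graph.

    Assumed rule: two features share an edge when |Pearson r| >= threshold.
    """
    n = len(columns)
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if abs(pearson(columns[i], columns[j])) >= threshold:
                adjacency[i].add(j)
                adjacency[j].add(i)

    clusters, seen = [], set()
    for start in range(n):
        if start in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(component))
    return clusters


def select_representatives(columns, labels, threshold=0.8):
    """Keep one feature per cluster: the one most correlated with the labels.

    This greedy pick is a stand-in for the optimization step, not a reproduction of it.
    """
    clusters = cluster_features(columns, threshold)
    return sorted(
        max(cluster, key=lambda i: abs(pearson(columns[i], labels)))
        for cluster in clusters
    )


if __name__ == "__main__":
    # Toy data: each inner list is one feature column over six samples.
    columns = [
        [1, 2, 3, 4, 5, 6],
        [1, 2, 3, 4, 5, 7],
        [1, -1, 1, -1, 1, -1],
        [-1, 1, -1, 1, -1, 2],
    ]
    labels = [0, 0, 0, 1, 1, 1]
    print(cluster_features(columns))
    print(select_representatives(columns, labels))
```

In a wrapper-style setting, the greedy pick would instead be driven by the accuracy of a classifier trained on each candidate subset, with the search over subsets handled by the optimization algorithm.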

Keywords

Feature selection, optimization algorithms, hierarchical algorithm, graph clustering
