Outlier Detection for Support Vector Machine using Minimum Covariance Determinant Estimator
- Publication year: 1398 (Solar Hijri)
- Published in: Journal of Artificial Intelligence and Data Mining, Volume 7, Issue 2
- Dedicated COI code: JR_JADM-7-2_008
- Article language: English
- Views: 431
Authors
Faculty of Mathematical Sciences, Ferdowsi University of Mashhad, Mashhad, Iran.
Faculty of Mathematical Sciences, Ferdowsi University of Mashhad, Mashhad, Iran.
Abstract
This paper aims to identify the points that most influence the performance of the support vector machine (SVM), one of the important algorithms in data mining. The final classification decision of an SVM rests on a small portion of the data, the support vectors, so atypical observations among these points cause deviations from the correct decision. The idea of Debruyne's outlier map is therefore employed to identify outlying points in the SVM classification problem. For computational convenience and speed, however, a robust Mahalanobis distance based on the minimum covariance determinant (MCD) estimator is used. This method is well suited to data with a low-dimensional structure. In addition to classification accuracy, the margin width is used as a criterion for performance assessment; a larger margin is preferred because it indicates higher generalization ability. Omitting the outliers detected by the suggested outlier map increases both the generalization ability and the accuracy of the SVM, which indicates that the proposed method is efficient at identifying outliers. Tests on simulated and real-world data show that the new version of the outlier map retains the older version's ability to recognize outlying and misclassified observations.
Keywords
Support Vector Machine, Outlying/Misclassified points, Robust statistics, Mahalanobis Distance, Minimum Covariance Determinant estimator
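The robust Mahalanobis distance mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it is a simplified two-dimensional MCD search using random starts and concentration steps, written with the Python standard library only. The function names (`mean_cov`, `sq_mahalanobis`, `mcd_2d`), the synthetic data, the subset size `h=75`, and the 0.975 chi-square cutoff are all assumptions made for illustration. The sketch also omits the consistency and small-sample corrections that full MCD implementations apply, so it tends to flag somewhat more clean points than a corrected estimator would. In the SVM setting one would presumably compute such distances within each class, which is likewise an assumption here.

```python
import math
import random


def mean_cov(points):
    """Sample mean and 2x2 covariance (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def det2(cov):
    """Determinant of a symmetric 2x2 matrix stored as (sxx, sxy, syy)."""
    return cov[0] * cov[2] - cov[1] ** 2


def sq_mahalanobis(p, center, cov):
    """Squared Mahalanobis distance, using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    dx, dy = p[0] - center[0], p[1] - center[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det2(cov)


def mcd_2d(points, h, n_starts=50, max_iter=100, seed=0):
    """Approximate MCD: the h-subset with the smallest covariance determinant.

    Hypothetical helper: random 3-point starts followed by concentration
    steps (re-fit on the h points closest to the current fit).
    """
    rng = random.Random(seed)
    best = None
    for _ in range(n_starts):
        center, cov = mean_cov(rng.sample(points, 3))
        det = det2(cov)
        if det <= 1e-12:  # skip (near-)collinear starts
            continue
        prev_det = float("inf")
        for _ in range(max_iter):
            ranked = sorted(points, key=lambda p: sq_mahalanobis(p, center, cov))
            center, cov = mean_cov(ranked[:h])
            det = det2(cov)
            if det <= 1e-12 or det >= prev_det:  # degenerate or converged
                break
            prev_det = det
        if det > 1e-12 and (best is None or det < best[0]):
            best = (det, center, cov)
    if best is None:
        raise ValueError("no non-degenerate subset found")
    return best[1], best[2]


# Chi-square quantile with 2 degrees of freedom has the closed form -2*ln(1-p).
CUTOFF = -2.0 * math.log(1.0 - 0.975)

# Synthetic data (assumed): 90 clean points plus 10 planted outliers.
data_rng = random.Random(1)
clean = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(90)]
planted = [(data_rng.gauss(8, 0.5), data_rng.gauss(8, 0.5)) for _ in range(10)]
points = clean + planted

center, cov = mcd_2d(points, h=75)
flagged = [i for i, p in enumerate(points)
           if sq_mahalanobis(p, center, cov) > CUTOFF]
print("flagged indices:", flagged)
```

Because the center and covariance are fitted on the most concentrated subset rather than on all points, the planted outliers cannot mask themselves by inflating the covariance, which is the property that makes the MCD-based distance attractive for building an outlier map.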