Extracting Prior Knowledge from Data Distribution to Migrate from Blind to Semi-Supervised Clustering

Year of publication: 1397 (Solar Hijri)
Document type: Journal article
Language: English
Views: 293

The full text of this article is available as a 9-page PDF file.

National scientific document ID:

JR_JADM-6-2_005

Indexing date: 19 Tir 1398 (Solar Hijri)

Abstract:

Although many studies have sought to improve clustering efficiency, most state-of-the-art schemes lack robustness and stability. This paper proposes an efficient approach that elicits prior knowledge, in the form of must-link and cannot-link constraints, from the estimated distribution of the raw data in order to convert a blind clustering problem into a semi-supervised one. The density distribution of the data is estimated with a Weibull mixture model (WMM), chosen for its high flexibility. A further contribution of this study is a new hill and valley seeking algorithm that finds the constraints for the semi-supervised algorithm. Each density peak is assumed to lie on a cluster center; therefore, the neighboring samples of each center are treated as must-link samples, while near-centroid samples belonging to different clusters are treated as cannot-link ones. The proposed approach is applied to a standard image dataset (designed for clustering evaluation) along with several UCI datasets. The results on both sets of data demonstrate the superiority of the proposed method over conventional clustering methods.
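
The constraint-extraction idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one-dimensional data, swaps the Weibull mixture model for a plain Gaussian kernel density estimate, and replaces the paper's hill and valley seeking algorithm with a simple local-maximum search. The function names (gaussian_kde_1d, find_hills, derive_constraints) and the parameters bandwidth, k, and grid_size are hypothetical and chosen only for illustration.

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth):
    """Gaussian kernel density estimate on a grid (a stand-in for the WMM)."""
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.sum(axis=1) / (len(samples) * bandwidth)


def find_hills(density):
    """Indices of interior local maxima of a sampled density curve."""
    rises = density[1:-1] > density[:-2]    # higher than the left neighbor
    falls = density[1:-1] >= density[2:]    # not lower than the right neighbor
    return np.where(rises & falls)[0] + 1


def derive_constraints(samples, bandwidth=0.5, k=3, grid_size=200):
    """Return (peaks, must_link, cannot_link) as sample-index pairs.

    Assumption: every density peak marks a cluster center. The k samples
    closest to a peak are linked to each other (must-link), and samples
    closest to different peaks are separated (cannot-link). If peaks lie
    close together and k is large, the groups may overlap and conflict.
    """
    samples = np.asarray(samples, dtype=float)
    grid = np.linspace(samples.min(), samples.max(), grid_size)
    density = gaussian_kde_1d(samples, grid, bandwidth)
    peaks = grid[find_hills(density)]

    # Indices of the k samples nearest to each peak.
    near = [np.argsort(np.abs(samples - p))[:k] for p in peaks]

    must_link = [(int(a), int(b))
                 for group in near
                 for i, a in enumerate(group)
                 for b in group[i + 1:]]
    cannot_link = [(int(a), int(b))
                   for i, group_a in enumerate(near)
                   for group_b in near[i + 1:]
                   for a in group_a
                   for b in group_b]
    return peaks, must_link, cannot_link
```

The resulting pairs could then be passed to any constrained clustering routine that accepts must-link and cannot-link lists. The choice of bandwidth controls how many peaks appear, which is the role the mixture model plays in the paper.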

Keywords:

Semi-supervised, Clustering, Valley seeking scheme, Weibull mixture model (WMM)

Authors

Z. Sedighi

Electrical & Computer Department, Shiraz University, Shiraz, Iran.

R. Boostani

Electrical & Computer Department, Shiraz University, Shiraz, Iran.