Repeated Record Ordering for Constrained Size Clustering

R. Mortazavi

Published in: International Journal of Engineering, Volume 33, Issue 7

Year of publication: 1399 (Solar Hijri)

Document type: Journal article

Language: English

The full text of this article is available as an 8-page PDF file.

Permanent link to this article:

https://civilica.com/doc/1040948

National scientific document ID:

JR_IJE-33-7_013

Indexing date: 4 Shahrivar 1399

Abstract:

Data clustering is one of the main techniques used in data mining, with many applications in computer science, biology, and the social sciences. Constrained clustering is a type of clustering in which side information provided by the user is incorporated into existing clustering algorithms. One well-researched family of constrained clustering algorithms is microaggregation. A microaggregation algorithm divides the dataset into groups containing at least k members, where k is a user-defined parameter. The main application of microaggregation is in Statistical Disclosure Control (SDC) for privacy-preserving data publishing. A microaggregation algorithm is evaluated by the sum of within-group squared errors (SSE); a lower SSE means the groups are more homogeneous. The optimal microaggregation problem has been proven to be NP-hard in general, but the special univariate case can be solved optimally in polynomial time. Many heuristics for the general case are founded on the univariate case: they order the multivariate records in a sequence. This paper proposes a novel method for record ordering. Starting from a conventional clustering algorithm, the proposed method repeatedly puts the multivariate records into a sequence and then clusters them again. The process is repeated until no further improvement is achieved. Extensive experiments confirm the effectiveness of the proposed method for different parameters and datasets.
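To make the SSE measure and the role of a record sequence concrete, the following is a minimal sketch and not the method proposed in the paper. It assumes that, once records are placed in some order, groups are formed from consecutive records and that each group holds between k and 2k-1 records, which is a common convention in microaggregation. The function names `group_sse`, `total_sse`, and `partition_sequence` are hypothetical. How the paper builds or repeats the ordering is not described in the abstract, so the sketch takes the ordering as given.

```python
from typing import List, Sequence

Record = Sequence[float]


def group_sse(group: List[Record]) -> float:
    """Sum of squared Euclidean distances from each record to the group centroid."""
    dim = len(group[0])
    centroid = [sum(r[d] for r in group) / len(group) for d in range(dim)]
    return sum((r[d] - centroid[d]) ** 2 for r in group for d in range(dim))


def total_sse(records: List[Record], groups: List[List[int]]) -> float:
    """SSE of a whole partition, given groups as lists of record indices."""
    return sum(group_sse([records[i] for i in g]) for g in groups)


def partition_sequence(records: List[Record], k: int) -> List[List[int]]:
    """Split an already ordered sequence into consecutive groups of size
    k..2k-1 so that the total SSE is minimal for that ordering.

    Assumption: a simple dynamic program over prefix lengths; it runs in
    polynomial time because each prefix tries at most k group sizes.
    """
    n = len(records)
    if k < 1 or n < k:
        raise ValueError("need k >= 1 and at least k records")
    inf = float("inf")
    best = [inf] * (n + 1)  # best[i]: minimal SSE for the first i records
    cut = [0] * (n + 1)     # cut[i]: start index of the last group ending at i
    best[0] = 0.0
    for i in range(k, n + 1):
        for size in range(k, min(2 * k - 1, i) + 1):
            j = i - size
            if best[j] == inf:
                continue  # the first j records cannot be grouped validly
            cost = best[j] + group_sse(records[j:i])
            if cost < best[i]:
                best[i], cut[i] = cost, j
    # Walk the stored cut points backwards to recover the groups.
    groups: List[List[int]] = []
    i = n
    while i > 0:
        groups.append(list(range(cut[i], i)))
        i = cut[i]
    return groups[::-1]
```

For a univariate dataset, sorting the values before calling `partition_sequence` gives the ordering directly; for multivariate data the quality of the result depends on the ordering heuristic, which is the part the paper addresses.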

Keywords:

Constrained Clustering, Microaggregation, Data Privacy

Authors

R. Mortazavi

School of Engineering, Damghan University, Damghan, Iran