VelvetFlow: An engineering pipeline for robust multi-density clustering

Publication year: 1404
Document type: Journal article
Language: English

The full text of this article is available for download as a 26-page PDF.

National scientific document ID:

JR_JDMA-10-4_003

Indexing date: 8 Azar 1404

Abstract:

Problem. Real-world datasets seldom respect a single density scale: tight blobs, elongated ribbons, and isolated points often coexist. Classical algorithms such as DBSCAN or k-means require domain-specific parameter tuning and provide only ad-hoc support for anomaly detection.

Solution. We introduce VelvetFlow, an engineering pipeline that turns a set of well-understood building blocks into a cohesive, end-to-end workflow for multi-density clustering and principled outlier detection. The pipeline is composed of three reusable stages:

(i) Contextual-density splitting assigns every point to a high- or low-density partition using a single neighbourhood size k.
(ii) Density-aware clustering applies a Jaccard-guided FusedNeighbor+DBSCAN routine to the sparse partition and HDBSCAN to the dense partition, without introducing new hyper-parameters.
(iii) Scaled-MST verification re-examines the complete k-NN graph, flags weakly connected components, and validates them with a k-NN gate; this step recovers small remote clusters while filtering genuine anomalies.
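
To make stage (i) concrete, the following is a minimal sketch of a k-NN based density split, not the authors' implementation. The abstract does not state the splitting criterion, so two assumptions are made here: the density proxy is the mean distance to the k nearest neighbours, and the threshold is the median of that score. The function names knn_mean_distance and split_by_density are hypothetical, and the brute-force neighbour search is only suitable for small inputs.

```python
import math
import statistics


def knn_mean_distance(points, k):
    """Mean distance from each point to its k nearest neighbours (itself excluded)."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def split_by_density(points, k):
    """Split point indices into (dense, sparse) partitions.

    Assumption: a point counts as 'dense' when its k-NN mean distance is at or
    below the median score. The paper's actual criterion is not given in the
    abstract, so this rule is illustrative only.
    """
    if not 1 <= k < len(points):
        raise ValueError("k must satisfy 1 <= k < number of points")
    scores = knn_mean_distance(points, k)
    threshold = statistics.median(scores)
    dense = [i for i, s in enumerate(scores) if s <= threshold]
    sparse = [i for i, s in enumerate(scores) if s > threshold]
    return dense, sparse


# Toy data: a tight blob of four points followed by four widely spaced points.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (5.0, 5.0), (8.0, 5.0), (5.0, 8.0), (9.0, 9.0)]
dense, sparse = split_by_density(points, k=2)
print(dense)   # [0, 1, 2, 3]
print(sparse)  # [4, 5, 6, 7]
```

In a pipeline shaped like the one described above, the dense indices would then be handed to HDBSCAN and the sparse indices to the FusedNeighbor+DBSCAN routine; those stages are omitted here because the abstract gives no further detail about them.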

Authors

Hossein Eyvazi

Department of Computer Science, University of Tarbiat Modares, Tehran, I. R. Iran

Mohammad Badzohreh

Department of Computer Science, University of Tarbiat Modares, Tehran, I. R. Iran

Seyed Ali Shahrokhi

Department of Computer Science, University of Tarbiat Modares, Tehran, I. R. Iran