Clustering-Based Knowledge Discovery in Breast Cancer: Insights from a Local Clinical Dataset

Publication year: 1405 (Solar Hijri)
Document type: Journal article
Language: English

The full text of this article is available as a 28-page PDF.

National scientific document ID: JR_JECEI-14-1_011

Indexing date: 15 Bahman 1404

Abstract:

Background and Objectives: Understanding the heterogeneity of breast cancer is crucial for improving treatment strategies. This study applies K-Means and Hierarchical Clustering to a local dataset of breast cancer patients from Iranmehr Hospital, Birjand, Iran. The primary goal is knowledge discovery: identifying potential patient subgroups based on their clinical and treatment characteristics. The potential of these subgroups to inform future research on personalized treatment approaches is also explored.

Methods: A retrospective dataset comprising pathological and clinical information was analyzed using K-Means and Agglomerative Hierarchical Clustering to identify patient subgroups. For both methods, the optimal number of clusters was consistently determined to be two (k=2), based on internal validation criteria (Elbow Method, Silhouette Analysis, Calinski-Harabasz Index, and, for Hierarchical Clustering, Largest Jump Analysis). Statistical tests (ANOVA and Chi-squared) were used to assess whether features differed significantly across the clusters identified by each method, providing insight into the key factors that differentiate the groups. Internal cluster validity was assessed using the Silhouette Score and the Calinski-Harabasz Index.

Results: The K-Means analysis identified two clusters that differed significantly in characteristics such as age, chemotherapy session intensity, menopausal status, nodal involvement, and biomarker expression (ER, PR, HER2, Ki67). Hierarchical Clustering also yielded two clusters with differing characteristics, and a comparison of the two methods highlighted both similarities and differences in the resulting patient stratifications. The overall agreement between K-Means and Hierarchical Clustering was quantified by an Adjusted Rand Index (ARI) of 0.4697.

Conclusion: Both K-Means and Hierarchical Clustering revealed potential patient subgroups within the studied dataset, highlighting the heterogeneity of breast cancer presentation and treatment at a local level. These clusters exhibited statistically significant differences across key clinical and treatment features. Future research is needed to validate these findings in larger, multi-center studies, to explore the clinical significance of these subgroups in terms of treatment outcomes, and to compare the effectiveness of different clustering methodologies for this purpose.
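
The abstract summarizes the agreement between the two clusterings with an Adjusted Rand Index. As a rough illustration of what that number measures, the following is a minimal standard-library sketch of the usual contingency-table form of the ARI. The function name `adjusted_rand_index` and the toy label lists are hypothetical and are not taken from the study; the authors' actual implementation is not described in this record.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two partitions of the same items.

    Illustrative helper (hypothetical name), not the authors' code.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")

    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: trivially identical partitions

    # Pairs of items placed together by both, by the first, and by the second labeling.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = together_a * together_b / total_pairs  # agreement expected by chance
    maximum = (together_a + together_b) / 2           # agreement of a perfect match
    if maximum == expected:
        return 1.0  # degenerate case, e.g. every item in a single cluster
    return (together_both - expected) / (maximum - expected)


# Toy labels for six hypothetical patients (assumed data, for illustration only).
kmeans_labels = [0, 0, 0, 1, 1, 1]
hierarchical_labels = [0, 0, 1, 1, 1, 1]
ari = adjusted_rand_index(kmeans_labels, hierarchical_labels)
print(f"ARI = {ari:.4f}")
```

An ARI of 1 means the two partitions are identical up to a renaming of the clusters, and values near 0 mean agreement no better than chance. On that scale, the reported 0.4697 corresponds to partial overlap between the K-Means and hierarchical groupings.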

Authors

Oveis Dehghantanha

Department of Electrical and Computer Engineering, University of Birjand, Birjand, Iran.

Nasser Mehrshad

Department of Electrical and Computer Engineering, University of Birjand, Birjand, Iran.

Roksana Bakhshali

Omid Cancer Center, Ahvaz, Iran.

Ahmad Reza Sebzari

Department of Internal Medicine, School of Medicine, Cellular and Molecular Research Center, Valiasr Hospital, Birjand University of Medical Sciences, Birjand, Iran.
