CIVILICA We Respect the Science
(Specialized publisher of the country's conferences / Publishing license number from the Ministry of Culture and Islamic Guidance: 8971)

A Weighted Semantic Similarity Technique for Document Clustering

Paper title: A Weighted Semantic Similarity Technique for Document Clustering
National paper ID: ICEEE04_181
Published in the 4th Iranian Conference on Electrical and Electronics Engineering, 1391 SH
Paper authors:

Nasrin Malakooti - School of Electrical & Computer Engineering, Shiraz University, Shiraz, Iran
Ali Hamzeh - School of Electrical & Computer Engineering, Shiraz University, Shiraz, Iran

Abstract:
Document similarity is an important area in information retrieval and document mining: the accuracy of the similarity measure strongly influences performance in these areas. However, many proposed document similarity methods suffer from several shortcomings. This paper therefore focuses on the problem of text document clustering and proposes a weighted graph algorithm to improve on previously proposed methods. The weighted method is based on the semantic relations between a document's words; using these relations helps capture the meaning of documents more accurately. The final results confirm that our method outperforms the other methods it was compared with.

Keywords:
document mining, document clustering, similarity measurement, semantic meaning

Paper page and full-text download: https://civilica.com/doc/164258/