Farsi Conceptual Text Summarizer: A New Model in Continuous Vector Space

Publication year: 1398 (Solar Hijri)
Document type: Journal article
Language: English

The full text of this paper is available as a 12-page PDF file.

National scientific document ID: JR_JIST-7-1_002

Indexing date: 6 Esfand 1398 (Solar Hijri)

Abstract:

Traditional methods of summarization are costly and time-consuming, which led to the emergence of automatic methods for text summarization. Extractive summarization is an automatic method that generates a summary by identifying the most important sentences of a text. In this paper, two novel approaches are presented for summarizing Farsi texts. Both combine deep learning with a statistical method (TF-IDF) to cluster the concepts of the text and then, based on the importance of the concepts in each sentence, select the sentences that carry the greatest conceptual load. The methods aim to address the representational weaknesses of repetition-based statistical methods by exploiting associations between words that are extracted without supervision through deep learning. The first method, which is unsupervised and uses no hand-crafted features, achieved state-of-the-art results on the Pasokh single-document corpus compared with the best supervised Farsi methods. To put the results in context, we also evaluated the human summaries written by the contributing authors of the Pasokh corpus and used them as a measure of the success rate of the proposed methods. In terms of recall, the proposed methods achieved favorable results. In the second method, introducing a title-effect coefficient and increasing it raised the average ROUGE-2 score by 0.4% on the Pasokh single-document corpus compared with the first method, and raised the average ROUGE-1 score by 3% on the Khabir news corpus.
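
The pipeline outlined in the abstract (weight words with TF-IDF, group words into concepts in an embedding space, score each sentence by the concepts it covers, optionally boost title words) can be illustrated with a minimal sketch. This is not the authors' implementation: the greedy cosine-similarity clustering, the smoothed IDF formula, the `threshold` and `title_coeff` parameters, and the two-dimensional toy embeddings in `EMB` are all assumptions made for illustration. A real system would presumably use word vectors learned from a large Farsi corpus.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def word_weights(sentences, title=(), title_coeff=1.0):
    """TF-IDF weight per word, treating each sentence as one document.

    Words that occur in the title are multiplied by title_coeff
    (a hypothetical stand-in for the title-effect coefficient).
    """
    n = len(sentences)
    tf = Counter(w for s in sentences for w in s)
    df = Counter(w for s in sentences for w in set(s))
    weights = {}
    for w, count in tf.items():
        idf = math.log((1 + n) / (1 + df[w])) + 1.0  # smoothed IDF (assumption)
        weights[w] = count * idf * (title_coeff if w in title else 1.0)
    return weights


def cluster_concepts(weights, embeddings, threshold=0.8):
    """Greedy clustering: a word joins the first concept whose seed vector
    is similar enough, otherwise it seeds a new concept."""
    clusters = []  # list of (seed_vector, member_words)
    for w in sorted(weights, key=lambda x: (-weights[x], x)):
        vec = embeddings.get(w)
        if vec is None:
            continue  # out-of-vocabulary words are ignored in this sketch
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(w)
                break
        else:
            clusters.append((vec, [w]))
    return [members for _, members in clusters]


def summarize(sentences, embeddings, k=1, title=(), title_coeff=1.0, threshold=0.8):
    """Return the indices (in document order) of the k highest-scoring sentences."""
    weights = word_weights(sentences, set(title), title_coeff)
    concepts = cluster_concepts(weights, embeddings, threshold)
    concept_of = {w: i for i, members in enumerate(concepts) for w in members}
    importance = [sum(weights[w] for w in members) for members in concepts]
    scores = []
    for s in sentences:
        covered = {concept_of[w] for w in s if w in concept_of}
        scores.append(sum(importance[c] for c in covered))
    top = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))[:k]
    return sorted(top)


# Toy, hand-made 2-D embeddings and pre-tokenized sentences (illustrative only).
EMB = {
    "summary": (1.0, 0.0), "abstract": (0.95, 0.1),
    "corpus": (0.0, 1.0), "dataset": (0.1, 0.95),
    "weather": (-1.0, 0.2),
}
SENTS = [
    ["summary", "corpus"],
    ["weather"],
    ["abstract", "dataset", "summary"],
]

print(summarize(SENTS, EMB, k=2))
print(summarize(SENTS, EMB, k=1, title=["weather"], title_coeff=10.0))
```

In this toy run, the first call selects the two sentences that cover both the "summary" and "corpus" concepts, and the second call shows how a large title coefficient can pull a title-related sentence to the top. Scoring by distinct concepts rather than by raw word counts is the design choice that lets synonyms such as "summary" and "abstract" reinforce each other.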

Authors

Mohammad Ebrahim Khademi

Faculty of Electrical and Computer Engineering, Malek Ashtar University of Technology, Iran

Mohammad Fakhredanesh

Faculty of Electrical and Computer Engineering, Malek Ashtar University of Technology, Iran

Seyed Mojtaba Hoseini

Faculty of Electrical and Computer Engineering, Malek Ashtar University of Technology, Iran