Text summarization using graph theory and machine translation techniques

Year of publication: 1396 (Solar Hijri)
Published in: First National Conference on Innovation in Electrical and Computer Engineering Technology (IECT-2017)
COI code: IECT01_020
Paper language: English
Views: 570

Authors

Kharazmi International Campus, Shahrood University, Shahrood, Iran

Abstract

Text summarization condenses an input text into a shorter one while preserving its main information content and overall concept. As public access to web information grows, information retrieval techniques have gained importance, and summarizing large documents manually is very difficult for humans. Automatic text summarization is therefore one of the most attractive problems in natural language processing, and it plays a fundamental role in reducing the time needed to understand a concept. Text summarization methods can be classified as extractive or abstractive. An extractive method selects important sentences and paragraphs from the original document and merges them into a shorter form, whereas an abstractive method attempts to develop an understanding of the main concept of the original text and retell it in fewer words and sentences. An important problem in extractive summarization is the ordering of output sentences. Recent work on extractive summary generation keeps sentences in the order in which they appear in the original document, but few works indicate how to select and reorder the remaining relevant sentences in the output document. Deleting unimportant sentences clearly disrupts the sentence ordering. In this paper we propose a novel automatic summarization method that combines graph theory and machine translation algorithms to align sentences in the summary text.
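
The extractive, graph-based idea described in the abstract can be illustrated with a minimal sketch. This is a generic TextRank-style baseline, not the method proposed in the paper: the abstract does not specify the graph construction or the machine-translation-based alignment step, so neither is reproduced here. The function names, the Jaccard word-overlap similarity, the damping factor and the naive sentence splitter are all assumptions made for illustration. The selected sentences are emitted in their original document order, which is the baseline ordering strategy the abstract attributes to earlier work.

```python
import re
from itertools import combinations


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercased set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def build_graph(sentences):
    """Weighted undirected graph; edge weight = Jaccard overlap of word sets."""
    n = len(sentences)
    words = [tokenize(s) for s in sentences]
    weights = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        union = words[i] | words[j]
        if union:
            w = len(words[i] & words[j]) / len(union)
            weights[i][j] = weights[j][i] = w
    return weights


def rank(weights, damping=0.85, iterations=50):
    """PageRank-style power iteration over the weighted sentence graph."""
    n = len(weights)
    if n == 0:
        return []
    scores = [1.0 / n] * n
    out_sum = [sum(row) for row in weights]
    for _ in range(iterations):
        new = []
        for i in range(n):
            # Mass flowing into sentence i from every connected sentence j.
            incoming = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            new.append((1 - damping) / n + damping * incoming)
        scores = new
    return scores


def summarize(text, k=2):
    """Pick the k highest-scoring sentences, then restore source order."""
    sentences = split_sentences(text)
    scores = rank(build_graph(sentences))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(document_text, k=3)` returns up to three sentences as a list. Restoring source order is the simplest ordering policy; it does nothing to repair the discontinuities left by deleted sentences, which is the gap the paper sets out to address.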

Keywords

text summarization, graph theory, machine translation, extractive summarization
