Development of a CGAN-Based Method for Aspect Level Text Generation: Encouragement and Punishment Factors in the Aspect Knowledge
- Publication year: 1402 (Solar Hijri)
- Published in: Journal of Computing and Security, Volume: 10, Issue: 1
- Unique COI code: JR_JCSE-10-1_004
- Article language: English
Authors
Department of Computer Engineering, Shahreza Campus, University of Isfahan, Isfahan, Iran.
Department of Computer Engineering, Shahreza Campus, University of Isfahan, Isfahan, Iran.
Department of Artificial Intelligence, Faculty of Computer Engineering, University of Isfahan, Isfahan, Iran.
Abstract
Text mining systems can benefit from automated text generation, especially when datasets and linguistic resources are limited. Most successful text generation approaches are generic rather than aspect-specific, so they produce relatively inaccurate sentences that look similar across different aspects. The present study addresses this problem by extracting aspect knowledge from relevant topics and generating the correct phrase for each aspect with a Conditional Generative Adversarial Network (CGAN). Using an auxiliary dataset, the proposed method produces sentences that the discriminator cannot distinguish from genuine ones. The auxiliary dataset is built by extracting aspect-based information from datasets related to the target concept. To further improve accuracy, the generator is encouraged or punished depending on the similarity of its output to the training corpus. Two datasets, one in English and one in Persian, are used to evaluate the proposed text generation method. The results show that adding similar aspects to the auxiliary dataset improves the quality of the generated sentences. In addition, encouragement leads to more accurate sentences, while punishment leads to more varied sentences.
Keywords
Deep Learning, Text Generation, Conditional Generative Adversarial Network, Aspect, Long Short-Term Memory
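The encouragement and punishment idea described in the abstract can be illustrated with a small reward-shaping sketch. This is not the authors' implementation: the abstract does not specify the similarity measure or how the factors enter training. The sketch assumes a bigram-overlap similarity and a linear bonus or penalty; the names `corpus_similarity`, `shaped_reward`, `encourage`, and `punish` are hypothetical.

```python
def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def corpus_similarity(sentence, corpus, n=2):
    """Assumed similarity measure: the largest fraction of the sentence's
    n-grams that also occur in any single training sentence."""
    candidate = ngrams(sentence.split(), n)
    if not candidate:
        return 0.0
    best = 0.0
    for reference in corpus:
        reference_set = set(ngrams(reference.split(), n))
        overlap = sum(1 for g in candidate if g in reference_set) / len(candidate)
        best = max(best, overlap)
    return best


def shaped_reward(disc_score, similarity, encourage=0.0, punish=0.0):
    """Assumed shaping rule: the encouragement factor adds a bonus that grows
    with similarity to the corpus, and the punishment factor subtracts a
    penalty that grows with similarity (pushing toward varied output)."""
    return disc_score + encourage * similarity - punish * similarity


if __name__ == "__main__":
    # Toy training corpus for one aspect (illustrative data only).
    corpus = ["the battery life is great", "the screen is too dim"]
    generated = "the battery life is short"
    sim = corpus_similarity(generated, corpus)
    print("similarity:", sim)
    print("encouraged:", shaped_reward(0.5, sim, encourage=0.5))
    print("punished:", shaped_reward(0.5, sim, punish=0.5))
```

Under these assumptions, a shaped score of this kind would stand in for the raw discriminator score when updating the generator: a positive `encourage` favors sentences close to the training corpus, while a positive `punish` favors sentences that differ from it, which is consistent with the accuracy and variety trade-off reported in the abstract.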