A Deep Learning Model for Classifying Quality of User Replies

Year of publication: 1400 (Solar Hijri calendar)
Published in: International Journal of Web Research, Volume 4, Issue 1
COI code: JR_IJWR-4-1_003
Article language: English

Authors

Masoumeh Rajabi

Department of Computer Eng., Shahrekord University, Shahrekord, Iran

Shahla Nemati

Department of Computer Engineering, Faculty of Engineering, Shahrekord University, Shahrekord, Iran

Mohammad Ehsan Basiri

Computer Engineering Dept., Shahrekord University, Shahrekord, Iran

Abstract

Q&A forums are designed to help users find useful information and access high-quality content posted by other users in text forums. Automatically identifying high-quality replies to initial posts not only provides users with appropriate content but also saves them time. Existing methods for classifying user replies by quality extract quality features from both the textual content and the metadata of the replies. This feature engineering step is time- and labor-intensive. The current study addresses this problem by proposing a new deep learning model that detects quality user replies using only raw textual content. Specifically, we propose a long short-term memory (LSTM) model that uses embeddings from language models (ELMo) to represent words as contextual numerical vectors. We compared the effectiveness of the proposed model with four traditional machine learning models on two online forum datasets: TripAdvisor for New York City (NYC) and the Ubuntu Linux distribution. Experimental results indicated that the proposed model significantly outperformed the four traditional algorithms on both datasets. Moreover, the proposed model achieved about 16% higher accuracy than the traditional algorithms trained on both textual and quality dimension features.

Keywords

Text Classification, Deep Neural Networks, Social Media Text Processing, Machine Learning
