Stance Detection Dataset for Persian Tweets
Publication year: 1401
Document type: Journal article
Language: English
The full text of this article is available as a 9-page PDF file.
National scientific document ID:
JR_ITRC-14-4_006
Indexing date: 8 Bahman 1401
Abstract:
Stance detection aims to identify an author's stance towards a specific topic, and it has become a critical component in applications such as fake news detection, claim validation, and author profiling. However, while humans detect stance easily, machine learning models still fall short on this task. For English, large and appropriate datasets have enabled relatively good accuracy, but for Persian the lack of data has prevented significant progress in stance detection. In this paper, we therefore present a stance detection dataset that contains 3813 labeled tweets. We provide a detailed description of the newly created dataset and develop deep learning models on it. Our best model achieves a macro-average F1-score of 58%. Moreover, our dataset can facilitate research in Persian in fields such as cross-lingual stance detection and author profiling.
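To make the reported metric concrete, the following is a minimal sketch of how a macro-average F1-score is computed: the F1-score is calculated separately for each class and the per-class scores are then averaged with equal weight. The label names (favor, against, neutral), the toy data, and the helper name macro_f1 are assumptions made for illustration only. The abstract does not list the dataset's label set, and this is not the authors' evaluation code.

```python
from collections import Counter


def macro_f1(gold, pred):
    """Return the unweighted mean of the per-class F1 scores."""
    labels = sorted(set(gold) | set(pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1   # correct prediction for class g
        else:
            fp[p] += 1   # class p was predicted but is wrong
            fn[g] += 1   # class g was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); guard against an empty class
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, not taken from the dataset
gold = ["favor", "against", "neutral", "favor", "against", "neutral"]
pred = ["favor", "against", "favor", "favor", "neutral", "neutral"]
score = macro_f1(gold, pred)
print(f"macro-F1 = {score:.4f}")
```

Because every class contributes equally to the average, a model that performs poorly on a rare stance class is penalized as much as one that performs poorly on a frequent class, which is why this metric is commonly reported for imbalanced stance datasets.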
Authors:
Mohammad Hadi Bokaei
ICT Research Institute (ITRC) Tehran, Iran
Mojgan Farhoodi
ICT Research Institute (ITRC) Tehran, Iran
Mona Davoudi
ICT Research Institute (ITRC) Tehran, Iran