Detecting Hate-speech in the Text using Natural Language Processing and Machine Learning

Automatic hate-speech detection in the large and ever-growing volume of social media content is a challenge. In recent years, combining Natural Language Processing methods with Machine Learning algorithms has been shown to outperform other approaches at distinguishing hate-speech from other instances of offensive language. This paper empirically studies the application of the AdaBoost meta-algorithm to improve hate-speech detection performance, using Support Vector Machine and Decision Tree as weak learners. On a Twitter dataset, AdaBoost with Support Vector Machine as the classifier achieved higher accuracy than AdaBoost with Decision Tree as the classifier. Moreover, the accuracy of the AdaBoost classification method is higher than that of the Logistic Regression algorithm, which otherwise has the highest accuracy among the classification algorithms for the hate-speech problem on the given Twitter dataset.
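To make the boosting idea concrete, the following is a minimal sketch of discrete AdaBoost over bag-of-words features. It is not the authors' implementation: the weak learner here is a word-presence decision stump (a depth-1 decision tree) rather than the Support Vector Machine or full Decision Tree studied in the paper, and the tiny corpus, the labels, and the names `train_adaboost` and `predict` are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical toy corpus, NOT the paper's Twitter dataset.
# Label +1 = flagged as hateful, -1 = benign.
DOCS = [
    ("i hate those people they are vermin", 1),
    ("those people are vermin and should leave", 1),
    ("i hate you all get out", 1),
    ("i hate rainy mondays", -1),
    ("lovely people at the market today", -1),
    ("get out and enjoy the sun", -1),
]


def tokenize(text):
    """Lowercase whitespace tokenizer returning the set of words present."""
    return set(text.lower().split())


def stump(word, polarity, tokens):
    """Weak learner: predict `polarity` if `word` is present, else the opposite."""
    return polarity if word in tokens else -polarity


def train_adaboost(samples, rounds=3):
    """Discrete AdaBoost; returns a list of (alpha, word, polarity) stumps."""
    n = len(samples)
    weights = [1.0 / n] * n
    vocab = sorted(set().union(*(tokens for tokens, _ in samples)))
    ensemble = []
    for _ in range(rounds):
        # Pick the stump with the lowest weighted training error.
        best = None
        for word in vocab:
            for polarity in (1, -1):
                err = sum(w for w, (tokens, y) in zip(weights, samples)
                          if stump(word, polarity, tokens) != y)
                if best is None or err < best[0]:
                    best = (err, word, polarity)
        err, word, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, word, polarity))
        # Raise the weight of misclassified samples, lower the rest, renormalize.
        weights = [w * math.exp(-alpha * y * stump(word, polarity, tokens))
                   for w, (tokens, y) in zip(weights, samples)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, text):
    """Sign of the alpha-weighted vote of all stumps."""
    tokens = tokenize(text)
    score = sum(alpha * stump(word, polarity, tokens)
                for alpha, word, polarity in ensemble)
    return 1 if score >= 0 else -1


samples = [(tokenize(text), label) for text, label in DOCS]
ensemble = train_adaboost(samples, rounds=3)
```

Each round reweights the training samples so that the next stump concentrates on the examples the current ensemble gets wrong, which is the mechanism the abstract relies on when it reports AdaBoost outperforming a single classifier. A real experiment would replace the toy corpus with a labeled tweet dataset, use proper text preprocessing, and evaluate on held-out data.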

Keywords:

Machine Learning, Hate Speech Detection, Natural Language Processing, Ensemble Classification, AdaBoost

Authors

Ebrahim Khalil Abbasi

Farhangian University, Tehran, Iran

Roya Amini

Freelance Researcher, Kurdistan, Iran

Permanent link to this paper:

https://civilica.com/doc/1307705

National scientific document ID:

CONFSKU01_049

Indexing date: 17 Aban 1400

How to cite this paper:

To cite this paper in your own research, you can use the following entry in the references section:

Abbasi, Ebrahim Khalil and Amini, Roya, 1400, Detecting Hate-speech in the Text using Natural Language Processing and Machine Learning, National Conference on the Latest Achievements in Data Engineering and Soft Knowledge and Computing, Shahrekord, https://civilica.com/doc/1307705

Within the text, wherever a statement or finding from this paper is mentioned, the following details are given in parentheses after it.
First citation: (1400, Abbasi, Ebrahim Khalil; Roya Amini)
Subsequent citations: (1400, Abbasi; Amini)
