A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features
محل انتشار: مجله هوش مصنوعی و داده کاوی، دوره: 8، شماره: 2
سال انتشار: 1399
نوع سند: مقاله ژورنالی
زبان: انگلیسی
مشاهده: 278
فایل این مقاله در 10 صفحه با فرمت PDF قابل دریافت می باشد
- صدور گواهی نمایه سازی
- من نویسنده این مقاله هستم
این مقاله در بخشهای موضوعی زیر دسته بندی شده است:
استخراج به نرم افزارهای پژوهشی:
شناسه ملی سند علمی:
JR_JADM-8-2_007
تاریخ نمایه سازی: 1 مرداد 1399
چکیده مقاله:
Named Entity Recognition is an information extraction technique that identifies name entities in a text. Three popular methods have been conventionally used namely: rule-based, machine-learning-based and hybrid of them to extract named entities from a text. Machine-learning-based methods have good performance in the Persian language if they are trained with good features. To get good performance in Conditional Random Field-based Persian Named Entity Recognition, a several syntactic features based on dependency grammar along with some morphological and language-independent features have been designed in order to extract suitable features for the learning phase. In this implementation, designed features have been applied to Conditional Random Field to build our model. To evaluate our system, the Persian syntactic dependency Treebank with about 30,000 sentences, prepared in NOOR Islamic science computer research center, has been implemented. This Treebank has Named-Entity tags, such as Person, Organization and location. The result of this study showed that our approach achieved 86.86% precision, 80.29% recall and 83.44% F-measure which are relatively higher than those values reported for other Persian NER methods.
کلیدواژه ها:
نویسندگان
L. Jafar Tafreshi
Computer Research Center of Islamic Sciences (CRCIS), Tehran, Iran.
F. Soltanzadeh
General Linguistics Department, Allameh Tabatabaei University, Tehran, Iran.