Neural Architecture for Persian Named Entity Recognition

Publication year: 1397 (Solar Hijri)
Document type: Conference paper
Language: English
Views: 511

The full text of this paper has not been provided and is not available.

National scientific document ID: SPIS04_039

Indexing date: 16 Ordibehesht 1398

Abstract:

Named entity recognition (NER) is a challenging problem, especially for languages such as Persian with few annotated corpora. In this paper, we use deep learning approaches to obtain high-performance results without any hand-crafted features or gazetteers. In recent years, LSTMs have proven to be among the most effective solutions for sequence prediction problems such as NER. We therefore use an LSTM-CRF approach for Persian NER and combine it with pre-trained word embeddings. To achieve a language-independent vector representation, we also use a character-based representation. With a bidirectional LSTM, word- and character-level features are learned automatically, eliminating the need for most hand-engineered features. This model obtains an F1 score of 86.55%, the highest score compared to previous results for Persian. We also apply this architecture to the dataset in [2] and achieve an 84.23% F1 score, whereas the published result for that dataset is a 77.45% F1 score.
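As a rough illustration of the architecture the abstract describes (a character-level BiLSTM representation concatenated with pre-trained word embeddings, a word-level bidirectional LSTM, and a CRF output layer), a minimal PyTorch sketch follows. This is not the authors' code: the class name, layer sizes, and the third-party pytorch-crf package (imported as torchcrf) are illustrative assumptions.

    import torch
    import torch.nn as nn
    from torchcrf import CRF  # assumed dependency: pip install pytorch-crf

    class CharWordBiLSTMCRF(nn.Module):
        def __init__(self, word_vocab_size, char_vocab_size, num_tags,
                     word_dim=300, char_dim=30, char_hidden=25, hidden=200):
            super().__init__()
            # Pre-trained word vectors would be copied into this embedding table.
            self.word_emb = nn.Embedding(word_vocab_size, word_dim, padding_idx=0)
            self.char_emb = nn.Embedding(char_vocab_size, char_dim, padding_idx=0)
            # Character-level BiLSTM: builds a sub-word representation of each word.
            self.char_lstm = nn.LSTM(char_dim, char_hidden,
                                     bidirectional=True, batch_first=True)
            # Word-level BiLSTM over [word embedding ; character representation].
            self.word_lstm = nn.LSTM(word_dim + 2 * char_hidden, hidden,
                                     bidirectional=True, batch_first=True)
            self.to_emissions = nn.Linear(2 * hidden, num_tags)
            self.crf = CRF(num_tags, batch_first=True)

        def _char_repr(self, chars):
            # chars: (batch, seq_len, max_word_len) character ids of each word.
            b, s, c = chars.shape
            _, (h_n, _) = self.char_lstm(self.char_emb(chars.reshape(b * s, c)))
            # Concatenate the final forward and backward hidden states per word.
            return torch.cat([h_n[0], h_n[1]], dim=-1).reshape(b, s, -1)

        def forward(self, words, chars, tags=None, mask=None):
            feats = torch.cat([self.word_emb(words), self._char_repr(chars)], dim=-1)
            out, _ = self.word_lstm(feats)
            emissions = self.to_emissions(out)
            if tags is not None:
                # Training: negative log-likelihood of the gold tags under the CRF.
                return -self.crf(emissions, tags, mask=mask)
            # Inference: Viterbi decoding of the best tag sequence per sentence.
            return self.crf.decode(emissions, mask=mask)

In such a setup the pre-trained word vectors (for example, publicly available Persian embeddings) would be loaded into word_emb before training, while the character-level BiLSTM is trained from scratch.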

Authors