Persian Word Sense Disambiguation using LDA topic model

Publication year: 1394 (Solar Hijri)
Document type: Conference paper
Language: English

The full text of this paper is available for download as a 9-page PDF.

National scientific document ID:

ICESCON01_0495

Indexing date: 25 Bahman 1394

Abstract:

Word sense disambiguation is a prominent problem in natural language processing. This paper proposes a model for Persian word sense disambiguation based on newly extracted features. The model uses two groups of features: the words and signs that accompany the ambiguous word, and features derived from topic modeling. A topic model is a probabilistic model for extracting the abstract topics contained in the documents of a corpus. This paper uses the unsupervised Latent Dirichlet Allocation (LDA) method. Experimental results for four common ambiguous Persian words, drawn from the corpus of the Research Center of Intelligent Signal Processing, show a precision of 939. This demonstrates the effect of the method on finding the proper sense of words.
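
As an illustration of the two feature groups described in the abstract, the following Python sketch builds (a) word counts from a window around the ambiguous word and (b) per-context topic proportions from a small collapsed Gibbs sampler for LDA. The abstract does not state the inference method, window size, hyperparameters, or classifier, so all of these, along with the function names and the transliterated toy contexts for the ambiguous word "shir", are assumptions made for illustration and not the authors' implementation.

```python
import random
from collections import Counter


def window_features(tokens, target, size=2):
    """Count tokens within `size` positions of the first occurrence of `target`."""
    i = tokens.index(target)
    context = tokens[max(0, i - size):i] + tokens[i + 1:i + 1 + size]
    return Counter(context)


def lda_gibbs(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA; returns per-document topic proportions."""
    rng = random.Random(seed)
    vocab = sorted({w for d in docs for w in d})
    wid = {w: k for k, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]        # n(d, k)
    topic_word = [[0] * V for _ in range(n_topics)]   # n(k, w)
    topic_total = [0] * n_topics                      # n(k)
    assign = []
    for d, doc in enumerate(docs):                    # random initial topics
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][wid[w]] += 1
            topic_total[z] += 1
        assign.append(zs)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for n, w in enumerate(doc):
                z, v = assign[d][n], wid[w]
                doc_topic[d][z] -= 1                  # remove current assignment
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Conditional probability of each topic for this token
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta) / (topic_total[k] + V * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][n] = z                      # record the resampled topic
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1
    # Smoothed topic proportions (each row sums to 1)
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for doc, counts in zip(docs, doc_topic)
    ]


def feature_vector(tokens, target, theta, vocab, size=2):
    """Concatenate window word counts with the context's topic proportions."""
    ctx = window_features(tokens, target, size)
    return [ctx[w] for w in vocab] + list(theta)


# Invented, transliterated toy contexts for the ambiguous word "shir"
docs = [
    "jangal shir heyvan shekar".split(),
    "heyvan shir jangal vahshi".split(),
    "livan shir sobhane garm".split(),
    "sobhane shir livan sard".split(),
]
thetas = lda_gibbs(docs, n_topics=2)
vocab = sorted({w for d in docs for w in d})
X = [feature_vector(d, "shir", t, vocab) for d, t in zip(docs, thetas)]
```

The resulting vectors in `X` could be passed to any supervised classifier trained on sense-labeled contexts; the abstract does not say which classifier the authors used.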

Authors

Babak Masoudi

Department of Information Technology, Payame Noor University (PNU), P.O. Box 59391-3993, Tehran, I.R. of Iran

Aboozar Zandvakili

Department of Computer Engineering, College of Engineering, Jiroft Branch, Islamic Azad University, Jiroft, Iran