PTokenizer: POS Tagger Tokenizer

With the advent of new information sources and the growth of text data, natural language processing (NLP) has become a key part of all systems that deal with human-written text, and part-of-speech (POS) tagging is an inseparable part of all NLP tasks. As a result, it is of paramount importance to enhance the accuracy of POS tagging. In this paper, using a language model and statistical information, we introduce a new approach to tokenizing sentences and preparing them to be labeled by POS taggers. An evaluation shows that the proposed method yields a precision of 98 percent for tokenizing, and

Keywords:

Tokenizer, Part of Speech Tagging, Probabilistic Model, Compound Tokens

Authors

Saeed Rahmani

Department of Computer and IT Engineering, Shiraz University, Shiraz, Iran

Seyyed Mostafa Fakhrahmad

Department of Computer and IT Engineering, Shiraz University, Shiraz, Iran

Mohammad Hadi Sadredini

Department of Computer and IT Engineering, Shiraz University, Shiraz, Iran


Permanent link to this article:

https://civilica.com/doc/589369

National scientific document ID:

JR_JKBEI-2-7_006

Indexing date: 9 Khordad 1396 (30 May 2017)

How to cite this article:

To cite this article in your own research, you can use the following entry in your reference list:

Rahmani, Saeed and Fakhrahmad, Seyyed Mostafa and Sadredini, Mohammad Hadi, 1395, PTokenizer: POS Tagger Tokenizer, https://civilica.com/doc/589369

Within the text, wherever a statement or result from this article is mentioned, the following details are given in parentheses after it.
First citation: (Rahmani, Saeed; Seyyed Mostafa Fakhrahmad and Mohammad Hadi Sadredini, 1395)
Subsequent citations: (Rahmani; Fakhrahmad and Sadredini, 1395)

Scientometrics and article ranking

The details of the institution that produced this article are as follows:

Scientific ranking of Shiraz University

Institution type: Public university

Number of articles: 33,054
