Language Identification from Text Using HMMs

Language identification from text has received less attention than identification from other forms of input, largely because it is considered an easy problem. Several techniques exist, and some of them can identify the language of a text with perfect accuracy. Nevertheless, very few studies have examined the accuracy and performance of these techniques when the input is limited, that is, when the language must be identified from a short sentence or a single word. In this paper, we present a method based on Hidden Markov Models (HMMs) for language identification from text. We use HMMs to detect the language of character strings and show the benefits of this model over a simple model. We show that even an extremely simple realization of the HMM outperforms simple models in accurately identifying the language of short input strings.
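
The abstract does not specify the HMM topology or how it is trained, so the following is only a minimal sketch of the general idea: keep one character-level HMM per language, score the input string under each with the forward algorithm, and pick the language with the highest likelihood. The names `log_likelihood`, `identify`, and `MODELS`, the two-state toy parameters, the two-letter alphabet, and the `floor` value for unseen characters are all assumptions made for illustration and are not taken from the paper.

```python
import math


def log_likelihood(text, start, trans, emit, floor=1e-6):
    """Scaled forward algorithm: returns log P(text | HMM).

    start[i]    : probability of starting in hidden state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][ch] : probability that state i emits character ch
    floor       : assumed small probability for characters a state never emits
    """
    if not text:
        return 0.0  # the empty string has probability 1 under any model
    n = len(start)
    # Initialisation: alpha_0(i) = start[i] * emit[i][first character]
    alpha = [start[i] * emit[i].get(text[0], floor) for i in range(n)]
    log_prob = 0.0
    for ch in text[1:]:
        # Rescale to avoid underflow and accumulate the log of the scale.
        scale = sum(alpha)
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
        # Induction: alpha_t(j) = sum_i alpha_{t-1}(i) * trans[i][j] * emit[j][ch]
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j].get(ch, floor)
            for j in range(n)
        ]
    return log_prob + math.log(sum(alpha))


# Hypothetical toy models over the alphabet {a, b}; real models would be
# estimated from training text in each language (for example with Baum-Welch).
MODELS = {
    "lang_A": {
        "start": [0.5, 0.5],
        "trans": [[0.7, 0.3], [0.4, 0.6]],
        "emit": [{"a": 0.8, "b": 0.2}, {"a": 0.6, "b": 0.4}],
    },
    "lang_B": {
        "start": [0.5, 0.5],
        "trans": [[0.7, 0.3], [0.4, 0.6]],
        "emit": [{"a": 0.2, "b": 0.8}, {"a": 0.4, "b": 0.6}],
    },
}


def identify(text, models=MODELS):
    """Return the language whose HMM assigns the highest likelihood to text."""
    return max(models, key=lambda lang: log_likelihood(text, **models[lang]))
```

Calling `identify("aab")` scores the three-character string under both toy models and returns the name of the better-scoring one. Because the score is a log-likelihood of the whole string, the same function applies unchanged to a single word or a full sentence.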

Keywords:

Language Identification, Text-to-Speech Systems, Machine Learning, Hidden Markov Models

Authors

Oktie Hassanzadeh

Department of Computer Science, University of Toronto, Toronto, ON M5S 3G4, Canada

Ehsan Zamiri

Department of Computer Engineering, Ferdowsi University of Mashad, Mashad, Iran

Permanent link to this paper:

https://civilica.com/doc/44492

National scientific document ID:

ACCSI12_105

Indexing date: 23 Dey 1386 (Solar Hijri calendar)

How to cite this paper:

To reference this paper in your own research, you can use the following entry in the references section:

Hassanzadeh, Oktie and Zamiri, Ehsan, 1385, Language Identification from Text Using HMMs, 12th Annual Conference of Computer Society of Iran, Tehran, https://civilica.com/doc/44492

Within the body text, wherever a statement or result from this paper is mentioned, the following details are given in parentheses after it.
First citation: (1385, Hassanzadeh, Oktie; Ehsan Zamiri)
Subsequent citations: (1385, Hassanzadeh; Zamiri)