Recognition of Persian digits from zero to nine using acoustic images based on Mel cepstrum coefficients and a neural network

Publication year: 1400
Document type: Conference paper
Language: English

The full text of this paper is available as a 10-page PDF.

National scientific document ID:

ICTBC04_024

Indexing date: 5 Shahrivar 1400

Abstract:

In this article, a database of the Persian digits zero to nine is first recorded and collected from the voices of 50 men and women in the environment. In the proposed method, the preprocessed signal is first divided into frames, an improved window is applied to each frame, and the result is passed to the Fourier transform block. The Fourier spectrum is then fed to a Gaussian filter bank, the output power spectrum of the filter bank is passed through a root function, and a cosine transform is applied to compress the components, yielding the Mel cepstrum coefficients. Finally, the acoustic image is formed as a matrix containing the temporal and frequency features of the speech signal by applying a two-dimensional inverse Fourier transform to the Mel cepstrum coefficient matrix. For classification and testing, the extracted features are used to train a perceptron neural network with two hidden layers using an improved algorithm, and the recognition rate is reported. Tests on signals with different noises show that the proposed method improves the recognition rate for noisy signals; without noise, the recognition rate of the proposed algorithm is 98.85%.
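
The feature-extraction chain described in the abstract (framing, windowing, Fourier transform, Gaussian filter bank, root compression, cosine transform, two-dimensional inverse Fourier transform) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the "improved window", the filter widths, the root exponent, or the frame sizes, so the Hamming window, the mel-spaced Gaussian filters, the cube root, the use of the magnitude of the inverse transform, and all function names and default parameters below are assumptions chosen for illustration. The sketch uses only the Python standard library and naive transforms to stay self-contained.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def frame_signal(signal, frame_len, hop):
    """Split into overlapping frames and apply a Hamming window
    (assumed stand-in for the unspecified 'improved window')."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    return [[signal[start + n] * window[n] for n in range(frame_len)]
            for start in range(0, len(signal) - frame_len + 1, hop)]


def power_spectrum(frame):
    """One-sided power spectrum computed with a naive DFT."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def gaussian_filter_bank(n_filters, n_bins, sample_rate):
    """Gaussian filters with centres equally spaced on the mel scale.
    The width (half the centre spacing) is an assumption."""
    nyquist = sample_rate / 2.0
    step = hz_to_mel(nyquist) / (n_filters + 1)
    bin_mels = [hz_to_mel(k * nyquist / (n_bins - 1)) for k in range(n_bins)]
    return [[math.exp(-0.5 * ((m - (j + 1) * step) / (step / 2.0)) ** 2)
             for m in bin_mels]
            for j in range(n_filters)]


def dct(values, n_coeffs):
    """Unnormalised DCT-II, keeping the first n_coeffs coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(n_coeffs)]


def inverse_dft2_magnitude(matrix):
    """Magnitude of the naive 2-D inverse DFT of a real matrix.
    Taking the magnitude to obtain a real image is an assumption."""
    rows, cols = len(matrix), len(matrix[0])
    image = []
    for u in range(rows):
        out_row = []
        for v in range(cols):
            acc = sum(matrix[r][c]
                      * cmath.exp(2j * math.pi * (u * r / rows + v * c / cols))
                      for r in range(rows) for c in range(cols))
            out_row.append(abs(acc) / (rows * cols))
        image.append(out_row)
    return image


def acoustic_image(signal, sample_rate, frame_len=256, hop=128,
                   n_filters=20, n_coeffs=12, root=1.0 / 3.0):
    """Frames -> power spectrum -> Gaussian bank -> root -> DCT -> 2-D IDFT."""
    bank = gaussian_filter_bank(n_filters, frame_len // 2 + 1, sample_rate)
    cepstra = []
    for frame in frame_signal(signal, frame_len, hop):
        spectrum = power_spectrum(frame)
        energies = [sum(w * p for w, p in zip(filt, spectrum)) for filt in bank]
        # Root compression replaces the logarithm of conventional MFCCs.
        cepstra.append(dct([e ** root for e in energies], n_coeffs))
    return inverse_dft2_magnitude(cepstra)


# Usage: a synthetic 440 Hz tone sampled at 8 kHz (640 samples).
tone = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(640)]
image = acoustic_image(tone, 8000)
print(len(image), len(image[0]))  # 4 frames x 12 coefficients
```

Each row of the resulting matrix corresponds to one frame and each column to one cepstral index, so the matrix could be flattened and passed to a classifier such as the two-hidden-layer perceptron mentioned in the abstract. In practice an FFT library would replace the naive transforms used here.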

Authors

Seyed Mehdi Hoseini

Department of Computer Science, University of Mazandaran, Babolsar, Iran