Speech Recognition System Based on Machine Learning in Persian Language

Publication year: 1401 (Solar Hijri)
Published in: Computational Algorithms and Numerical Dimensions, Volume 1, Issue 2
Unique COI code: JR_CAND-1-2_003
Article language: English

Authors

Shahed Mohammadi

Department of Computer Science and Systems Engineering, Ayandegan Institute of Higher Education, Tonekabon, Iran.

Niloufar Hemati

Department of Computer Science, Islamic Azad University Central Tehran Branch, Tehran, Iran.

Neda Mohammadi

Department of Industrial Engineering, Sadra University, Tehran, Iran.

Abstract

Speech recognition has become an integral part of daily life, and demand for systems equipped with this technology has increased dramatically in the past few years. This research aims to locate two selected Persian words in any given audio file. For this purpose, two standard, native datasets were prepared, one for training and the other for testing. Both datasets were converted into images of audio waveforms. Using an object detection technique, the model extracts several bounding boxes from each test audio file; each box image then passes through a CNN classifier, which returns a corresponding label. Finally, a threshold is set so that only high-confidence boxes are displayed as output. The results showed 93% accuracy for the CNN classifier and 50% accuracy when testing the model with object detection.
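
The final stages described above (candidate boxes, a per-box label, and a threshold on the score) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a 1-D sliding window in place of the image-based object detector and a toy amplitude scorer in place of the CNN, and every name in it (`Detection`, `propose_boxes`, `detect_words`, `toy_classifier`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Detection:
    start: int    # index of the first sample in the box
    end: int      # index one past the last sample in the box
    label: str    # label returned by the classifier
    score: float  # classifier confidence, assumed to lie in [0, 1]


def propose_boxes(num_samples: int, width: int, stride: int) -> List[Tuple[int, int]]:
    """Hypothetical stand-in for the object detector: fixed-width sliding windows."""
    if width <= 0 or stride <= 0 or num_samples < width:
        return []
    return [(s, s + width) for s in range(0, num_samples - width + 1, stride)]


def detect_words(
    samples: Sequence[float],
    classify: Callable[[Sequence[float]], Tuple[str, float]],
    width: int,
    stride: int,
    threshold: float,
) -> List[Detection]:
    """Classify every candidate box and keep those scoring at or above the threshold."""
    kept = []
    for start, end in propose_boxes(len(samples), width, stride):
        label, score = classify(samples[start:end])
        if score >= threshold:
            kept.append(Detection(start, end, label, score))
    return kept


def toy_classifier(window: Sequence[float]) -> Tuple[str, float]:
    """Toy scorer (not a CNN): mean absolute amplitude, always labelled 'word_a'."""
    return "word_a", sum(abs(x) for x in window) / len(window)


# Silence, a loud burst, then silence again.
signal = [0.0] * 4 + [1.0] * 4 + [0.0] * 4
hits = detect_words(signal, toy_classifier, width=4, stride=4, threshold=0.5)
print(hits)
```

In this toy run only the middle window, covering the loud burst, passes the threshold. Raising the threshold trades missed words for fewer false boxes, which is the role the threshold plays in the abstract.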

Keywords

Speech recognition, Signal processing, Object detection, Neural network, Deep learning
