Speech Emotion Recognition based on Improved SOAR Model
Published in: Journal of Computing and Security, Volume: 11, Issue: 1
Publication year: 1403
Document type: Journal article
Language: English
The file of this article is available as a 10-page PDF.
National scientific document ID:
JR_JCSE-11-1_004
Indexing date: 2 Bahman 1403
Abstract:
In recent years, emotion recognition has attracted researchers' attention as a new method for human-computer interaction. Automatic speech emotion recognition has become a practical way to increase engagement in most industries, and emotion recognition based on audio information is expected to yield better accuracy. This article presents an efficient method for recognizing emotional states from speech signals, based on a model called SOAR that combines deep learning with a cognitive model. The model is implemented in two main steps. The first step reads the video, converts it to images, and preprocesses it. The second step uses a combination of a convolutional neural network (CNN) and learning automata (LA) to classify and detect the rate of facial emotional recognition. CNN was chosen because no dimension is removed from the speech signal, and taking the temporal information in dynamic speech into account leads to more efficient and more accurate classification. In addition, LA adjusts the backpropagation error calculation during CNN training, which increases the efficiency of the proposed model and allows the working-memory part of the SOAR model to be implemented. The experiments use the audio databases of two multimodal emotion recognition datasets, eNTERFACE'05 and SAVEE. In the best case, the recognition accuracy of the presented model on the eNTERFACE'05 and SAVEE databases is 85.3% and 84.5%, respectively.
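The abstract does not say how the learning automata adjust CNN training, so the following is only an illustrative sketch under stated assumptions, not the authors' method. It assumes a linear reward-inaction automaton that picks a learning rate for each gradient step and is rewarded when the loss decreases. A one-parameter quadratic loss stands in for the CNN, and the names `LearningAutomaton` and `train_toy`, the candidate rates, and the reward rule are all hypothetical.

```python
import random


class LearningAutomaton:
    """Linear reward-inaction automaton over a finite set of actions."""

    def __init__(self, actions, reward_step=0.1, seed=0):
        self.actions = list(actions)
        # Start with a uniform probability over all actions.
        self.probs = [1.0 / len(self.actions)] * len(self.actions)
        self.reward_step = reward_step
        self.rng = random.Random(seed)

    def choose(self):
        """Sample an action index according to the current probabilities."""
        return self.rng.choices(range(len(self.actions)), weights=self.probs)[0]

    def update(self, chosen, rewarded):
        """Shift probability toward a rewarded action; do nothing on penalty."""
        if not rewarded:
            return
        a = self.reward_step
        self.probs = [
            p + a * (1.0 - p) if i == chosen else (1.0 - a) * p
            for i, p in enumerate(self.probs)
        ]


def train_toy(steps=200, seed=0):
    """Gradient descent on (w - 3)^2 with an automaton choosing the step size.

    Assumption: the quadratic loss is a stand-in for a CNN training loss.
    """
    automaton = LearningAutomaton([0.01, 0.1, 1.5], seed=seed)
    w = 0.0
    loss = (w - 3.0) ** 2
    for _ in range(steps):
        idx = automaton.choose()
        lr = automaton.actions[idx]
        grad = 2.0 * (w - 3.0)
        w_new = w - lr * grad
        new_loss = (w_new - 3.0) ** 2
        improved = new_loss < loss
        automaton.update(idx, improved)
        if improved:
            # Accept the step only when it lowers the loss.
            w, loss = w_new, new_loss
    return automaton, loss


if __name__ == "__main__":
    automaton, final_loss = train_toy()
    print("action probabilities:", automaton.probs)
    print("final loss:", final_loss)
```

In this toy setting the rate 1.5 always overshoots and is never rewarded, so its selection probability shrinks while the stable rates gain weight. A real system would replace the quadratic with the network's training loss and might use a different reward signal.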
Keywords:
Speech emotion recognition, Convolutional Neural Network (CNN), Learning Automata, Improved SOAR model
Authors
Matin Ramzani Shahrestani
Department of Computer Engineering, Rasht Branch, Islamic Azad University, Rasht, Iran.
Sara Motamed
Department of Computer Engineering, Fouman and Shaft Branch, Islamic Azad University, Fouman, Iran.
Mohammadreza Yamaghani
Department of Computer Engineering, Lahijan Branch, Islamic Azad University, Lahijan, Iran.