Speech Emotion Recognition based on Improved SOAR Model

Matin Ramzani Shahrestani; Sara Motamed; Mohammadreza Yamaghani

Published in: Journal of Computing and Security, Volume 11, Issue 1

Year of publication: 1403 (Solar Hijri)

Document type: Journal article

Language: English

The full text of this paper is available as a 10-page PDF.

Permanent link to this paper:

https://civilica.com/doc/2158608

National scientific document ID:

JR_JCSE-11-1_004

Indexing date: 2 Bahman 1403 (Solar Hijri)

Abstract:

In recent years, emotion recognition has attracted researchers' attention as a new method for human-computer interaction. Automatic speech emotion recognition has become a practical way to increase engagement in most industries, and emotion recognition based on audio information is expected to yield better accuracy. This article presents an efficient method for recognizing emotional states from speech signals, based on a model that combines deep learning with the cognitive model SOAR. The implementation has two main steps. The first step reads the video, converts it to images, and preprocesses it. The second step uses a combination of a convolutional neural network (CNN) and learning automata (LA) to classify and detect the rate of facial emotion recognition. A CNN was chosen because it removes no dimension from the speech signal, and taking the temporal information in dynamic speech into account leads to more efficient and better classification. In addition, LA adjusts the backpropagation error calculation during CNN training, which increases the efficiency of the proposed model and allows the working memory part of the SOAR model to be implemented. The experiments use the audio data of two databases from the field of multimodal emotion recognition, eNTERFACE'05 and SAVEE. In the best case, the proposed model reaches a recognition accuracy of 85.3% on eNTERFACE'05 and 84.5% on SAVEE.
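
The abstract says that learning automata adjust CNN training but gives no update rule. As a rough illustration of the general idea only, the sketch below uses a standard linear reward-inaction automaton to pick a learning rate for plain gradient descent on a toy quadratic loss. It is not the authors' method: the candidate rates, the reward criterion (a loss drop of more than 10%), the toy objective, and all names such as `LinearRewardInaction` and `train_toy` are assumptions made for this example.

```python
import random


class LinearRewardInaction:
    """Learning automaton with the linear reward-inaction (L_RI) update."""

    def __init__(self, n_actions, reward_step=0.1, seed=0):
        self.probs = [1.0 / n_actions] * n_actions
        self.reward_step = reward_step
        self.rng = random.Random(seed)

    def choose(self):
        # Sample an action index according to the current probabilities.
        indices = range(len(self.probs))
        return self.rng.choices(indices, weights=self.probs)[0]

    def update(self, action, rewarded):
        # L_RI: move probability mass toward the chosen action on reward,
        # and leave the probability vector unchanged on penalty.
        if not rewarded:
            return
        a = self.reward_step
        for i in range(len(self.probs)):
            if i == action:
                self.probs[i] += a * (1.0 - self.probs[i])
            else:
                self.probs[i] *= 1.0 - a


def train_toy(steps=200, seed=0):
    """Minimise f(w) = (w - 3)^2 with a learning rate chosen by the automaton."""
    # Hypothetical candidates: too small, reasonable, and divergent.
    rates = [0.001, 0.1, 1.5]
    automaton = LinearRewardInaction(len(rates), seed=seed)
    w = 0.0
    loss = (w - 3.0) ** 2
    for _ in range(steps):
        k = automaton.choose()
        grad = 2.0 * (w - 3.0)
        w_new = w - rates[k] * grad
        new_loss = (w_new - 3.0) ** 2
        # Assumed reward criterion: the loss must drop by more than 10%.
        automaton.update(k, new_loss < 0.9 * loss)
        # Accept the step only if it does not make the loss worse.
        if new_loss < loss:
            w, loss = w_new, new_loss
    return w, loss, automaton.probs


if __name__ == "__main__":
    w, loss, probs = train_toy()
    print("w:", w, "loss:", loss, "action probabilities:", probs)
```

In this toy setting only the middle rate ever meets the reward criterion, so the automaton concentrates its probability on it. In a CNN, the same loop would wrap a real optimizer step, and the reward signal would have to come from the training or validation loss.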

Keywords:

Speech emotion recognition, Convolutional Neural Network (CNN), Learning Automata, Improved SOAR model

Authors

Matin Ramzani Shahrestani

Department of Computer Engineering, Rasht Branch, Islamic Azad University, Rasht, Iran.

Sara Motamed

Department of Computer Engineering, Fouman and Shaft Branch, Islamic Azad University, Fouman, Iran.

Mohammadreza Yamaghani

Department of Computer Engineering, Lahijan Branch, Islamic Azad University, Lahijan, Iran.
