Bulletin of the Abai KazNPU, the series of "Physical and Mathematical Sciences"

RECOGNITION OF SPEECH EMOTIONS USING MACHINE LEARNING

Published June 2022
Al-Farabi Kazakh National University, Almaty, Kazakhstan
Abstract

With the development of multimedia image recognition technologies, which allow large amounts of multimedia information to be extracted from video and audio sources and analyzed, the use of machine learning, and deep learning in particular, has grown sharply across a wide range of problems. Speech emotion recognition (or classification) is one of the most complex topics in data science. In this work, we used an architecture based on a multilayer perceptron (MLP) classifier: mel-frequency cepstral coefficients (MFCCs), chromagrams, and mel-scale spectrograms are extracted from audio files and used as input to a neural network for emotion identification, with samples taken from the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). A neural network model was developed to recognize four emotions (calm, anger, fear, disgust). This model classifies speech emotions with an accuracy of 83.33%.
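
The pipeline outlined in the abstract could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the libraries, so librosa and scikit-learn are assumed here, and the number of MFCCs, the hidden layer size, the train/test split, and the helper names (`label_from_filename`, `extract_features`, `train`) are all hypothetical. The emotion codes follow the public RAVDESS file naming convention, in which the third dash-separated field of each file name identifies the emotion.

```python
from pathlib import Path

# RAVDESS file names have seven dash-separated fields; the third is the emotion code.
# Only the four emotions named in the abstract are kept.
EMOTIONS = {"02": "calm", "05": "anger", "06": "fear", "07": "disgust"}


def label_from_filename(path):
    """Return the emotion label for a RAVDESS file, or None if it is not one of the four."""
    fields = Path(path).stem.split("-")
    if len(fields) != 7:
        return None
    return EMOTIONS.get(fields[2])


def extract_features(path, n_mfcc=40):
    """Time-averaged MFCC, chromagram and mel-spectrogram features as one vector."""
    # Imported here so the label helper above works without the audio stack installed.
    import numpy as np
    import librosa

    y, sr = librosa.load(path, sr=None)  # keep the file's native sampling rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)
    chroma = librosa.feature.chroma_stft(y=y, sr=sr).mean(axis=1)
    mel = librosa.feature.melspectrogram(y=y, sr=sr).mean(axis=1)
    return np.concatenate([mfcc, chroma, mel])


def train(data_dir):
    """Fit an MLP on the selected files and return (model, held-out accuracy)."""
    import numpy as np
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    X, y = [], []
    for wav in sorted(Path(data_dir).rglob("*.wav")):
        label = label_from_filename(wav)
        if label is not None:
            X.append(extract_features(wav))
            y.append(label)

    # Assumed split and hyperparameters; the abstract does not report them.
    X_train, X_test, y_train, y_test = train_test_split(
        np.array(X), y, test_size=0.25, random_state=0, stratify=y
    )
    model = MLPClassifier(hidden_layer_sizes=(300,), max_iter=500, random_state=0)
    model.fit(X_train, y_train)
    return model, accuracy_score(y_test, model.predict(X_test))
```

Calling `train` on a local copy of the RAVDESS audio folder would return a fitted classifier and its accuracy on the held-out files. The figure obtained this way depends on the assumed settings and need not match the 83.33% reported above.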


How to Cite

[1]
Ералханова, А., Есенбай, М., Мухтарова, А., Жексебай, Д. and Кожагулов, Е. 2022. RECOGNITION OF SPEECH EMOTIONS USING MACHINE LEARNING. Bulletin of the Abai KazNPU, the series of "Physical and Mathematical Sciences". 78, 2 (Jun. 2022), 102–108. DOI:https://doi.org/10.51889/2022-2.1728-7901.13.