With the development of multimedia recognition technologies, which allow large amounts of information to be extracted and analyzed from video and audio sources, the use of deep learning to solve a wide range of problems has grown sharply. Speech emotion recognition (or classification) is one of the most challenging tasks in data science. In this work, we used an MLP-classifier-based architecture that extracts mel-frequency cepstral coefficients (MFCCs), chromagrams, and mel-scale spectrograms from audio files and uses them as input to a neural network for emotion identification, using samples from the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). A neural network model was developed to recognize four emotions (calm, anger, fear, disgust). This model classifies speech emotions with an accuracy of 83.33%.
RECOGNITION OF SPEECH EMOTIONS USING MACHINE LEARNING
Published June 2022
Language: Russian
How to Cite
Ералханова, А., Есенбай, М., Мухтарова, А., Жексебай, Д. and Кожагулов, Е. 2022. RECOGNITION OF SPEECH EMOTIONS USING MACHINE LEARNING. Bulletin of Abai KazNPU. Series of Physical and Mathematical Sciences. 78, 2 (Jun. 2022), 102–108. DOI: https://doi.org/10.51889/2022-2.1728-7901.13.