In this article, we explore two neural architectures for gender determination and speaker identification using Mel-frequency cepstral coefficient (MFCC) features, which do not cover voice-related characteristics. One of our goals is to compare two neural architectures, the multilayer perceptron (MLP) and the convolutional neural network (CNN), on both tasks under different settings and to learn gender- and speaker-specific features automatically. Experimental results show that models using z-score normalization and a Gramian matrix transformation give better results than models using only min-max normalization of the MFCCs. In terms of training time, the MLP requires more training epochs to converge than the CNN. Further experimental results show that MLPs outperform CNNs on both tasks in terms of generalization error.
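As a rough illustration of the preprocessing variants named in the abstract, the sketch below applies min-max normalization, z-score normalization, and a Gram matrix to an MFCC-shaped array. It is not the authors' pipeline: the function names, the per-coefficient normalization axis, the reading of "Gramian matrix transformation" as the product of the feature matrix with its transpose, and the random placeholder data (13 coefficients by 200 frames) are all assumptions made for this example.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero for constant rows


def min_max_normalize(mfcc):
    """Scale each coefficient row to the range [0, 1]."""
    lo = mfcc.min(axis=1, keepdims=True)
    hi = mfcc.max(axis=1, keepdims=True)
    return (mfcc - lo) / np.maximum(hi - lo, EPS)


def z_score_normalize(mfcc):
    """Give each coefficient row zero mean and unit variance."""
    mean = mfcc.mean(axis=1, keepdims=True)
    std = mfcc.std(axis=1, keepdims=True)
    return (mfcc - mean) / np.maximum(std, EPS)


def gram_matrix(features):
    """Inner products between coefficient rows (assumed reading of 'Gramian')."""
    return features @ features.T


# Placeholder data standing in for real MFCCs: 13 coefficients x 200 frames.
rng = np.random.default_rng(0)
mfcc = rng.normal(size=(13, 200))

mm = min_max_normalize(mfcc)   # same shape as the input, values in [0, 1]
z = z_score_normalize(mfcc)    # same shape as the input, standardized rows
gram = gram_matrix(z)          # fixed 13 x 13 matrix, independent of frame count
```

Under this reading, the Gram matrix has a fixed size regardless of utterance length, which is one plausible reason such a transformation is convenient as input to an MLP or CNN.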
NEURAL ARCHITECTURES FOR GENDER DETERMINATION AND SPEAKER IDENTIFICATION
Published June 2024
Abstract
Language
Russian
How to Cite
Мекебаев, Н., Даркенбаев, Д. and Алтыбай, А. 2024. NEURAL ARCHITECTURES FOR GENDER DETERMINATION AND SPEAKER IDENTIFICATION. Bulletin of Abai KazNPU. Series of Physical and mathematical sciences. 86, 2 (Jun. 2024), 222–234. DOI: https://doi.org/10.51889/2959-5894.2024.86.2.021.