NEURAL ARCHITECTURES FOR GENDER DETERMINATION AND SPEAKER IDENTIFICATION

N. Mekebayev; D. Darkenbayev; A. Altybay

doi:10.51889/2959-5894.2024.86.2.021

Vol. 86 No. 2 (2024)

Published June 2024

N. Mekebayev

Kazakh National Women's Teacher Training University, Almaty, Kazakhstan

D. Darkenbayev

Al-Farabi Kazakh National University, Almaty, Kazakhstan

A. Altybay

Al-Farabi Kazakh National University, Almaty, Kazakhstan

Abstract

In this article, we explore two neural architectures for gender determination and speaker identification using Mel-frequency cepstral coefficient (MFCC) features, which do not capture voice-related characteristics. One of our goals is to compare two neural architectures, the multilayer perceptron (MLP) and the convolutional neural network (CNN), on both tasks under different settings and to learn gender- and speaker-specific features automatically. Experimental results show that models using z-score normalization and the Gramian matrix transformation perform better than models using only min-max normalization of the MFCCs. In terms of training time, the MLP takes longer to converge than the CNN. Other experimental results show that MLPs outperform CNNs on both tasks in terms of generalization error.
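
The three feature transformations named in the abstract can be illustrated with a minimal sketch. This page does not give the paper's exact definitions, so the code below makes assumptions: statistics are computed per coefficient across frames, and the Gramian is taken as G = XᵀX over the coefficient columns. The function names and the toy `frames` values are hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
import math


def min_max(features):
    """Scale each coefficient (column) to [0, 1] across frames."""
    cols = list(zip(*features))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in features
    ]


def z_score(features):
    """Standardize each coefficient to zero mean and unit variance across frames."""
    cols = list(zip(*features))
    n = len(features)
    mean = [sum(c) / n for c in cols]
    # Population standard deviation (divide by n), an assumption of this sketch.
    std = [math.sqrt(sum((v - m) ** 2 for v in c) / n) for c, m in zip(cols, mean)]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, mean, std)]
        for row in features
    ]


def gram_matrix(features):
    """Assumed form G = X^T X: inner products between coefficient columns."""
    cols = list(zip(*features))
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]


# Toy stand-in for MFCC frames: 4 frames x 3 coefficients (made-up values).
frames = [
    [1.0, 10.0, -2.0],
    [2.0, 20.0, 0.0],
    [3.0, 30.0, 2.0],
    [4.0, 40.0, 4.0],
]

scaled = min_max(frames)               # baseline: min-max normalization only
standardized = z_score(frames)         # z-score normalization
gram = gram_matrix(standardized)       # square matrix, one row per coefficient
```

Under these assumptions the Gramian has a fixed size (coefficients by coefficients) regardless of the number of frames, which is one plausible reason such a transformation is convenient as network input; the paper itself should be consulted for the actual construction.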

PDF (Russian)

Keywords

MLP; CNN; ASR; NN; gender determination; speaker identification.

Language

Russian

How to Cite

[1]

Mekebayev N., Darkenbayev D. and Altybay A. 2024. NEURAL ARCHITECTURES FOR GENDER DETERMINATION AND SPEAKER IDENTIFICATION. Bulletin of Abai KazNPU. Series of Physical and mathematical sciences. 86, 2 (Jun. 2024), 222–234. DOI:https://doi.org/10.51889/2959-5894.2024.86.2.021.
