IMPROVING CLASSIFICATION ACCURACY ON IMBALANCED DATA USING A HYBRID MODEL

A. Skakova; G. Astaubayeva; S. Issabayeva; E. Abdykerimova; A. Tastanbek

doi:10.51889/2959-5894.2026.94.2.024

Vol. 94 No. 2 (2026)

Published June 2026

A. Skakova

Nur-Mubarak Egyptian University of Islamic Culture, Almaty, Kazakhstan

G. Astaubayeva

Narxoz University

S. Issabayeva

Kazakh National Academy of Arts named after Temirbek Zhurgenov

E. Abdykerimova

Caspian University of Technology and Engineering named after Sh. Yessenov

A. Tastanbek

Turan University

Abstract

As data volumes grow rapidly, class imbalance has become one of the key challenges in classification tasks, significantly reducing the accuracy and generalization ability of machine learning models. This study aims to improve classification accuracy on imbalanced datasets by developing and applying a hybrid model that combines data preprocessing techniques with ensemble learning methods. To this end, existing approaches to handling class imbalance were analyzed, including resampling techniques (oversampling and undersampling), cost-sensitive learning, and modern ensemble strategies.

The research methodology integrates synthetic data generation with gradient boosting and random forest algorithms. This approach increases sensitivity to the minority class while keeping the model robust against overfitting. The proposed hybrid model was evaluated on several open-source and applied datasets with varying degrees of class imbalance. Performance was assessed using metrics suited to imbalanced data, including F1-score and balanced accuracy, among others.
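As an illustration of the synthetic data generation step, the following is a minimal sketch of SMOTE-style interpolation (SMOTE appears in the keywords). It is not the authors' implementation: the function name `smote_like_oversample`, the parameters `n_new`, `k`, and `seed`, and the toy minority points are all assumptions made for this example. Each synthetic point is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours.

```python
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    Hypothetical helper for illustration; requires at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


# Toy minority-class points (illustrative values, not data from the study).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_like_oversample(minority, n_new=6)
print(len(new_points))
```

In a hybrid setup of the kind described, such synthetic points would typically be added to the training split only, before fitting the ensemble classifiers, so that the evaluation data keep their original class distribution.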

The results demonstrate a statistically significant improvement in classification performance over baseline models, especially in detecting the minority class. The scientific significance of the study lies in the development of a reproducible approach to improving classification effectiveness under class imbalance, thereby expanding the applicability of machine learning methods in domains such as healthcare, finance, and risk analysis.
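To show why metrics such as F1-score and balanced accuracy are preferred over plain accuracy on imbalanced data, here is a small sketch using their standard definitions. The labels are a made-up toy example (95 negatives, 5 positives), not results from the study, and the function names are chosen for this illustration only.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def balanced_accuracy(y_true, y_pred, positive=1):
    """Mean of the recall on the positive class and on the negative class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    recall_pos = tp / (tp + fn) if (tp + fn) else 0.0
    recall_neg = tn / (tn + fp) if (tn + fp) else 0.0
    return (recall_pos + recall_neg) / 2


# Toy case: a classifier that always predicts the majority class.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
f1 = f1_score(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred)
print(accuracy, f1, bal_acc)  # 0.95 0.0 0.5
```

The always-majority classifier reaches 95% accuracy while never detecting a minority sample; F1-score and balanced accuracy expose this failure, which plain accuracy hides.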

Keywords

imbalanced data; classification; hybrid model; machine learning; SMOTE; gradient boosting; F1-score; ROC-AUC.

Language

Russian

How to Cite

Skakova A., Astaubayeva G., Issabayeva S., Abdykerimova E. and Tastanbek A. 2026. IMPROVING CLASSIFICATION ACCURACY ON IMBALANCED DATA USING A HYBRID MODEL. Bulletin of Abai KazNPU. Series of Physical and Mathematical Sciences. 94, 2 (Jun. 2026). DOI: https://doi.org/10.51889/2959-5894.2026.94.2.024.
