KEYWORD EXTRACTION FROM KAZAKH TEXT WITH MACHINE LEARNING ALGORITHMS

A.A. Abibullayeva; G.N. Kazbekova; N.M. Zhunissov

doi:10.51889/2959-5894.2024.85.1.010

Vol. 85 No. 1 (2024)

Published March 2024

A.A. Abibullayeva

Khoja Ahmet Yassawi International Kazakh-Turkish University, Turkestan, Kazakhstan

G.N. Kazbekova

Khoja Ahmet Yassawi International Kazakh-Turkish University, Turkestan, Kazakhstan

N.M. Zhunissov

Khoja Ahmet Yassawi International Kazakh-Turkish University, Turkestan, Kazakhstan

G.N. Kazbekova

Head of the Department of Computer Engineering

Khoja Ahmet Yassawi International Kazakh-Turkish University, Turkestan, Kazakhstan

Abstract

Browsing the internet has become an everyday activity for computer users. Because thousands of news items are published online every day, it is difficult to retrieve and summarize the relevant documents effectively. Keyword or keyphrase extraction techniques are therefore used to convey the main content of a web page, so that readers can reach the information they are looking for easily and quickly. In this article, two machine learning algorithms, Random Forest and XGBoost (Extreme Gradient Boosting), were tested. Results were obtained on the 500N-KPCrowd dataset, an English-language news dataset widely used in the literature, and compared with the results obtained on the Kazakh-language datasets. For the Kazakh dataset, the best F₁ score was 0.97, the highest result reported in the literature. For the 500N-KPCrowd dataset, the best F₁ score was 0.70.
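
The abstract reports results as F₁ scores. As a reading aid, the following is a minimal sketch of how F₁ can be computed for keyword extraction. It assumes a set-based, exact-match comparison between predicted and reference keywords. The abstract does not state the evaluation protocol actually used, and the function name `f1_score` and the sample keyword lists are hypothetical.

```python
def f1_score(predicted, gold):
    """Set-based F1 between predicted and reference keywords.

    Assumption: exact match after lowercasing and trimming whitespace.
    The article's own evaluation protocol may differ.
    """
    pred = {k.strip().lower() for k in predicted}
    ref = {k.strip().lower() for k in gold}
    if not pred or not ref:
        return 0.0
    true_positives = len(pred & ref)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(pred)  # share of predictions that are correct
    recall = true_positives / len(ref)      # share of reference keywords recovered
    return 2 * precision * recall / (precision + recall)


# Hypothetical example data, not taken from the article.
predicted = ["keyword extraction", "machine learning", "internet"]
gold = ["keyword extraction", "machine learning", "random forest", "xgboost"]
score = f1_score(predicted, gold)
print(round(score, 3))  # 0.571
```

In this example two of three predictions are correct (precision 2/3) and two of four reference keywords are recovered (recall 1/2), giving F₁ = 4/7.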

Keywords

keyword extraction; machine learning; Random Forest; XGBoost; statistical features; graphical features.

Language

English

How to Cite

Abibullayeva, A., Kazbekova, G. and Zhunissov, N. 2024. KEYWORD EXTRACTION FROM KAZAKH TEXT WITH MACHINE LEARNING ALGORITHMS. Bulletin of Abai KazNPU. Series of Physical and Mathematical sciences. 85, 1 (Mar. 2024), 106–113. DOI: https://doi.org/10.51889/2959-5894.2024.85.1.010.
