Bulletin of the Abai KazNPU, the series of "Physical and Mathematical Sciences"

STEMMING OF KAZAKH LANGUAGE

Published March 2021
Suleyman Demirel University, Kaskelen,
Abstract

Natural language processing is now widely used, for example in machine translation, search engines, and text topic
identification. Such applications require text preprocessing, because the quality of preprocessing can affect system
accuracy. Text can be preprocessed in several ways; one approach is to identify the root of each word. An advantage of
this approach is that it can save computer memory, because a root shared by many word forms is stored only once. This
paper describes stemming systems, which identify the root of a word. The literature review covers stemming algorithms
for the Russian, Uzbek, and Turkish languages. The authors then propose a stemming system that identifies the roots of
Kazakh words and describe how it works. To test the system, words from various parts of speech were entered. The
proposed system can identify the roots of nouns, verbs, adjectives, and numerals. The system's responses are shown in
Table 1. The figures in the paper show which kinds of suffixes and endings can be attached to the root of a Kazakh
word, although not all combinations are shown. The conclusion offers advice on how to develop a stemming system.
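
The abstract does not specify the algorithm, so the following is only a minimal sketch of the general idea of root identification by suffix stripping, not the authors' system. It assumes a tiny, hypothetical suffix list (a few Kazakh plural and case endings), a hypothetical minimum root length, and a hypothetical function name `stem`; a real stemmer would need far fuller suffix tables and rules for verbs, adjectives, and numerals.

```python
# Hypothetical, deliberately incomplete suffix lists (illustration only).
PLURAL = ("лар", "лер", "дар", "дер", "тар", "тер")
CASE = ("дан", "ден", "тан", "тен", "нан", "нен",   # ablative
        "да", "де", "та", "те",                     # locative
        "ға", "ге", "қа", "ке")                     # dative

# Try longer suffixes first so "дан" is preferred over a shorter match.
SUFFIXES = sorted(PLURAL + CASE, key=len, reverse=True)

# Assumed lower bound on root length, to limit over-stripping.
MIN_ROOT_LEN = 3


def stem(word):
    """Repeatedly strip a known suffix from the end of the word."""
    word = word.lower()
    stripped = True
    while stripped:
        stripped = False
        for suffix in SUFFIXES:
            if word.endswith(suffix) and len(word) - len(suffix) >= MIN_ROOT_LEN:
                word = word[:-len(suffix)]
                stripped = True
                break
    return word


# Several word forms collapse to a smaller set of roots,
# so each shared root is stored only once.
words = ["кітап", "кітаптар", "кітаптарда", "мектептерге", "балалар"]
roots = {stem(w) for w in words}
```

Here the five word forms reduce to three stored roots, which illustrates the memory argument made above. A purely list-based approach like this one can over-strip words whose root happens to end in a suffix-like string, which is one reason practical systems add ordering rules and exception lists.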

Language

Eng

How to Cite

[1]
Bogdanchikov, A., Baimuratov, O. and Ayazbayev, D. 2021. STEMMING OF KAZAKH LANGUAGE. Bulletin of the Abai KazNPU, the series of "Physical and Mathematical Sciences". 73, 1 (Mar. 2021), 169–173. DOI:https://doi.org/10.51889/2021-1.1728-7901.24.