AN ON-DEVICE KAZAKH LANGUAGE AGENT FOR HUMAN–ROBOT INTERACTION

Service and assistive robots deployed in Kazakhstan must understand commands in Kazakh, an agglutinative, lower-resource language that general-purpose large language models handle poorly at the small scales that fit on a robot’s on-board computer. We present Farabi-0.6B, a 596-million-parameter Kazakh-centric (Kazakh/Russian/English) language agent obtained by continued pre-training and supervised fine-tuning of Qwen3-0.6B, and study its use as the language-understanding, retrieval, and action-selection core of an intelligent human–robot interface that is small enough to run locally on edge hardware. We make three contributions. First, we describe the interface architecture: a Kazakh command is mapped to an intent; the agent decides whether to invoke a robot skill, request a missing argument, abstain, or seek confirmation; and informational answers are grounded in a retrieved manual. Second, we evaluate the language core on a purpose-built, by-construction simulated robot-command benchmark (68 Kazakh/Russian/English commands across five decision categories) and on standardized Kazakh benchmarks. When the model acts, it maps commands to robot skills with 82% skill-selection and 79% slot-filling accuracy, and the agentic fine-tuning improves genuine clarification (+20 pp) and out-of-scope abstention (+20 pp) over a pre-agentic baseline; on standardized Kazakh tasks it far exceeds the same-size base model (e.g., Belebele-kk 34.0 vs 25.5; FLORES en→kk chrF 37.4 vs 0.0). Third, we characterize edge feasibility: the 0.6B model needs 0.30–1.19 GB for weights, sustains 21 tok/s on a CPU, and is projected to reach 57–228 tok/s on named embedded accelerators, comfortably above conversational real time.
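The weight-memory range quoted above can be checked with simple arithmetic. The sketch below is illustrative only: the abstract does not say which numeric precisions produce the 0.30 GB and 1.19 GB endpoints, so the 4-bit, 8-bit, and 16-bit storage formats used here are assumptions, as is the use of decimal gigabytes.

```python
PARAMS = 596_000_000  # parameter count stated in the abstract

# Assumed storage precisions in bytes per weight; the abstract does not name them.
BYTES_PER_WEIGHT = {"4-bit": 0.5, "8-bit": 1.0, "16-bit": 2.0}


def weight_gb(params: int, bytes_per_weight: float) -> float:
    """Weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_weight / 1e9


# Footprint per assumed precision, rounded to two decimals.
footprints = {
    name: round(weight_gb(PARAMS, size), 2)
    for name, size in BYTES_PER_WEIGHT.items()
}

for name, gb in footprints.items():
    print(f"{name}: {gb:.2f} GB")
```

Under these assumptions the 4-bit and 16-bit footprints coincide with the two ends of the reported range. The figures cover weights only; activations and the key-value cache would add to the runtime total.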
We also report two weaknesses with direct safety implications: the model does not seek confirmation before irreversible actions (100% violation rate) and under-routes informational queries to retrieval (10%). Both motivate an explicit safety-gating layer in the interface rather than reliance on the model alone.
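One way to picture such a safety-gating layer is as a deterministic check that sits between the model's proposed decision and the robot. The following is a minimal sketch, not the authors' implementation: the decision labels, the skill names (`move_to`, `delete_map`, `dispose_item`), the required-argument table, and the `gate` function are all hypothetical and chosen only to illustrate the idea.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class Decision(Enum):
    # Hypothetical labels for the kinds of decision named in the abstract.
    INVOKE = "invoke_skill"
    ASK_ARGUMENT = "request_missing_argument"
    ABSTAIN = "abstain"
    CONFIRM = "seek_confirmation"
    RETRIEVE = "answer_from_manual"


@dataclass
class Proposal:
    """What the language model proposes for one user command."""
    decision: Decision
    skill: Optional[str] = None
    args: Dict[str, str] = field(default_factory=dict)


# Hypothetical skill registry: required arguments per skill.
REQUIRED_ARGS = {
    "move_to": ["location"],
    "delete_map": ["map_id"],
    "dispose_item": ["item"],
}
# Hypothetical set of skills whose effects cannot be undone.
IRREVERSIBLE_SKILLS = {"delete_map", "dispose_item"}


def gate(proposal: Proposal, user_confirmed: bool = False) -> Proposal:
    """Override the model's proposal whenever a hard safety rule applies."""
    if proposal.decision is not Decision.INVOKE:
        return proposal  # only skill invocations are gated here
    if proposal.skill not in REQUIRED_ARGS:
        return Proposal(Decision.ABSTAIN)  # unknown skill: never execute
    missing = [a for a in REQUIRED_ARGS[proposal.skill] if a not in proposal.args]
    if missing:
        return Proposal(Decision.ASK_ARGUMENT, proposal.skill, proposal.args)
    if proposal.skill in IRREVERSIBLE_SKILLS and not user_confirmed:
        # Enforced by rule, independent of what the model decided.
        return Proposal(Decision.CONFIRM, proposal.skill, proposal.args)
    return proposal


risky = Proposal(Decision.INVOKE, "delete_map", {"map_id": "kitchen"})
print(gate(risky).decision)                       # confirmation is forced
print(gate(risky, user_confirmed=True).decision)  # proceeds once confirmed
```

Because the confirmation rule lives outside the model, it holds even when the model itself never asks for confirmation, which is the failure mode reported above.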

Keywords

human-robot interaction; Kazakh language; small language model; edge AI; retrieval-augmented generation; tool calling; agentic control; low-resource NLP

Language

English

How to Cite

Kadyrbek, N.K., Mansurova, M.E., Mosavi, A. and Toiganbaeva, N. 2026. AN ON-DEVICE KAZAKH LANGUAGE AGENT FOR HUMAN–ROBOT INTERACTION. Bulletin of Abai KazNPU. Series of Physical and Mathematical sciences. 94, 2 (Jun. 2026). DOI: https://doi.org/10.51889/2959-5894.2026.94.2.017.
