This paper addresses a critical limitation of Retrieval-Augmented Generation (RAG) systems in domain-specific code question-answering: imprecise context retrieval actively misleads small language models, degrading accuracy below the baseline. We propose an entity-aware hybrid retrieval architecture that extracts the target code entity (class or function name) directly from each question and applies it as a hard filter on retrieved chunks. The system integrates semantic vector search (ChromaDB) with a kNN-based knowledge graph (NetworkX, 53,585 nodes), and the two result lists are fused via Weighted Reciprocal Rank Fusion (WRRF). A two-stage quality-aware fallback mechanism rejects low-relevance context before generation. Evaluation on the srsRANBench dataset (1,502 multiple-choice questions over 500,000+ lines of srsRAN C++ code) shows the Graph-Enhanced system achieving 65.51% accuracy vs. 63.65% for Vector-only, with a ROUGE-L of 0.1820, the best among all systems. Error analysis reveals that the knowledge graph rescues 87 vector-only failures. The "RAG hurts" phenomenon, in which retrieved context misleads a 3B-parameter LLM, is systematically analyzed and shown to be mitigated by entity-aware filtering.
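The fusion and filtering steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the weights, the smoothing constant k = 60, and the chunk and entity identifiers are all assumptions made for illustration. Each retriever contributes weight / (k + rank) to a chunk's score, and the fused list is then restricted to chunks tagged with the entity extracted from the question.

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked lists of chunk IDs with Weighted Reciprocal Rank Fusion.

    rankings: dict mapping retriever name -> list of chunk IDs, best first.
    weights:  dict mapping retriever name -> weight (assumed values).
    k:        smoothing constant; 60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for name, ranked_ids in rankings.items():
        w = weights.get(name, 1.0)
        for rank, chunk_id in enumerate(ranked_ids, start=1):
            scores[chunk_id] += w / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores, key=lambda c: (-scores[c], c))


def entity_filter(chunk_ids, chunk_entities, target_entity):
    """Hard filter: keep only chunks tagged with the target code entity."""
    return [c for c in chunk_ids
            if target_entity in chunk_entities.get(c, set())]


# Hypothetical retriever outputs and entity tags, for illustration only.
rankings = {"vector": ["a", "b", "c"], "graph": ["c", "a", "d"]}
weights = {"vector": 0.6, "graph": 0.4}
chunk_entities = {"a": {"rlc_am"}, "b": {"pdcp_entity"},
                  "c": {"rlc_am"}, "d": {"mac_sch"}}

fused = weighted_rrf(rankings, weights)
context = entity_filter(fused, chunk_entities, "rlc_am")
print(fused)    # ['a', 'c', 'b', 'd']
print(context)  # ['a', 'c']
```

In a sketch like this, an empty filtered list would be a natural point for a fallback to reject the retrieved context, though the paper's two-stage mechanism may work differently.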
HYBRID RAG WITH KNOWLEDGE GRAPH FOR 5G/O-RAN CODE DOCUMENTATION
Published June 2026
Abstract
Language
English
How to Cite
[1]
Marlambekov, D., Kassymbek, N., Nurakhov, Y., Mukhanbet, A. and Mukhambetzhanov, S. 2026. HYBRID RAG WITH KNOWLEDGE GRAPH FOR 5G/O-RAN CODE DOCUMENTATION. Bulletin of Abai KazNPU. Series of Physical and Mathematical sciences. 94, 2 (Jun. 2026). DOI:https://doi.org/10.51889/2959-5894.2026.94.2.021.