# **LinearRAG: Linear Graph Retrieval-Augmented Generation on Large-scale Corpora**

> A relation-free graph construction method for efficient GraphRAG. It eliminates LLM token costs during graph construction, making GraphRAG faster and more efficient than ever.

[arXiv:2510.10114](https://arxiv.org/abs/2510.10114) · [HuggingFace](https://huggingface.co/datasets/Zly0523/linear-rag) · GitHub

---

## 🚀 **Highlights**

- ✅ **Relation-free Graph Construction**: Eliminates unstable relation extraction, using only lightweight entity recognition to build the graph.
- 🔥 **Zero Token Consumption**: Completes graph construction and retrieval without any LLM calls.
- 📊 **Strong Results**: Outperforms previous RAG methods on widely used benchmarks.

Framework Overview
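
To make the relation-free idea in the Highlights concrete, here is a minimal sketch of a graph that links passages only through the entities they mention, with no relation extraction and no LLM calls. It is an illustration under stated assumptions, not the repository's implementation: `extract_entities`, `build_entity_passage_graph`, and `linked_passages` are hypothetical names, and a capitalized-word regex stands in for the spaCy entity recognizer so the snippet runs without downloading a model.

```python
import re
from collections import defaultdict


def extract_entities(text):
    """Stand-in for an NER model (assumption): runs of capitalized words count as entities."""
    pattern = r"[A-Z][a-z]+(?: [A-Z][a-z]+)*"
    return {match.group(0).lower() for match in re.finditer(pattern, text)}


def build_entity_passage_graph(passages):
    """Build a bipartite entity <-> passage graph without extracting any relations."""
    entity_to_passages = defaultdict(set)
    passage_to_entities = {}
    for pid, text in enumerate(passages):
        entities = extract_entities(text)
        passage_to_entities[pid] = entities
        for entity in entities:
            entity_to_passages[entity].add(pid)
    return dict(entity_to_passages), passage_to_entities


def linked_passages(pid, entity_to_passages, passage_to_entities):
    """Return the passages that share at least one entity with passage `pid`."""
    linked = set()
    for entity in passage_to_entities[pid]:
        linked |= entity_to_passages[entity]
    linked.discard(pid)
    return linked


passages = [
    "Marie Curie was born in Warsaw.",
    "Warsaw is the capital of Poland.",
    "Pierre Curie shared the Nobel Prize with Marie Curie.",
]
entity_to_passages, passage_to_entities = build_entity_passage_graph(passages)
print(sorted(linked_passages(0, entity_to_passages, passage_to_entities)))  # [1, 2]
```

Graph construction here is a single pass over the passages, so its cost grows with corpus size rather than with the number of candidate entity pairs. In practice the regex would be replaced by a real recognizer such as the spaCy model installed in the Usage section.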

---

## 🎉 **News**

- [2025-10-26] We have released the code and [dataset](https://huggingface.co/datasets/Zly0523/linear-rag).
- [2025-10-11] We have released the paper on [arXiv](https://arxiv.org/abs/2510.10114).

---

## 🛠️ **Usage**

### 1️⃣ Install Dependencies

**Step 1: Install Python packages**

```bash
pip install -r requirements.txt
```

**Step 2: Download spaCy language model**

```bash
python -m spacy download en_core_web_trf
```

> **Note:** For the `medical` dataset, you need to install the scientific/biomedical spaCy model:

```bash
pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.5.3/en_core_sci_scibert-0.5.3.tar.gz
```

**Step 3: Set up your OpenAI API key**

```bash
export OPENAI_API_KEY="your-api-key-here"
export OPENAI_BASE_URL="your-base-url-here"
```

**Step 4: Download Datasets**

Download the datasets from HuggingFace and place them in the `dataset/` folder:

```bash
git clone https://huggingface.co/datasets/Zly0523/linear-rag
cp -r linear-rag/dataset/* dataset/
```

**Step 5: Prepare Embedding Model**

Make sure the embedding model is available at:

```
model/all-mpnet-base-v2/
```

### 2️⃣ Quick Start Example

```bash
SPACY_MODEL="en_core_web_trf"
EMBEDDING_MODEL="model/bge-large-en-v1.5"
DATASET_NAME="2wikimultihop"
LLM_MODEL="gpt-4o-mini"
MAX_WORKERS=16

python run.py \
    --spacy_model "${SPACY_MODEL}" \
    --embedding_model "${EMBEDDING_MODEL}" \
    --dataset_name "${DATASET_NAME}" \
    --llm_model "${LLM_MODEL}" \
    --max_workers "${MAX_WORKERS}"
```

## 🎯 **Performance**

**Main results of end-to-end performance**

**Efficiency and performance comparison.**

## 📖 Citation

If you find this work helpful, please consider citing us:

```bibtex
@article{zhuang2025linearrag,
  title={LinearRAG: Linear Graph Retrieval Augmented Generation on Large-scale Corpora},
  author={Zhuang, Luyao and Chen, Shengyuan and Xiao, Yilin and Zhou, Huachi and Zhang, Yujing and Chen, Hao and Zhang, Qinggang and Huang, Xiao},
  journal={arXiv preprint arXiv:2510.10114},
  year={2025}
}
```

## 📬 Contact

✉️ Email: zhuangluyao523@gmail.com