From 7a4ad471c9026c7882504b1c8b730045b4bb74af Mon Sep 17 00:00:00 2001
From: CHEN SHENGYUAN
Date: Thu, 18 Dec 2025 15:35:33 +0800
Subject: enable vectorized retrieval with sparse matrix operations

---
 readme.md | 29 +++++++----------------------
 1 file changed, 7 insertions(+), 22 deletions(-)

diff --git a/readme.md b/readme.md
index 00cb8a2..9dbb618 100644
--- a/readme.md
+++ b/readme.md
@@ -1,4 +1,4 @@
-# **LinearRAG: Linear Graph Retrieval-Augmented Generation on Large-scale Corpora**
+# **LinearRAG: Linear Graph Retrieval-Augmented Generation on Large-scale Corpora**
 
 > A relation-free graph construction method for efficient GraphRAG. It eliminates LLM token costs during graph construction, making GraphRAG faster and more efficient than ever.
 
@@ -17,19 +17,16 @@
 ---
 ## 🚀 **Highlights**
-
-- ✅ **Context-Preserving**: Relation-free graph construction, relying on lightweight entity recognition and semantic linking to achieve comprehensive contextual comprehension.
+- ✅ **Context-Preserving**: Relation-free graph construction, relying on lightweight entity recognition and semantic linking to achieve comprehensive contextual comprehension.
 - ✅ **Complex Reasoning**: Enables deep retrieval via semantic bridging, achieving multi-hop reasoning in a single retrieval pass without requiring explicit relational graphs.
 - ✅ **High Scalability**: Zero LLM token consumption, faster processing speed, and linear time/space complexity.
-
+
 [image: Framework Overview]
 ---
-
 ## 🎉 **News**
-
 - **[2025-10-27]** We release **[LinearRAG](https://github.com/DEEP-PolyU/LinearRAG)**, a relation-free graph construction method for efficient GraphRAG.
 - **[2025-06-06]** We release **[GraphRAG-Bench](https://github.com/GraphRAG-Bench/GraphRAG-Benchmark.git)**, the benchmark for evaluating GraphRAG models.
 - **[2025-01-21]** We release the **[GraphRAG survey](https://github.com/DEEP-PolyU/Awesome-GraphRAG)**.
@@ -38,7 +35,7 @@
 
 ## 🛠️ **Usage**
 
-### 1️⃣ Install Dependencies
+### 1️⃣ Install Dependencies
 
 **Step 1: Install Python packages**
 
@@ -53,7 +50,6 @@ python -m spacy download en_core_web_trf
 ```
 
 > **Note:** For the `medical` dataset, you need to install the scientific/biomedical Spacy model:
-
 ```bash
 pip install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.5.3/en_core_sci_scibert-0.5.3.tar.gz
 ```
@@ -82,6 +78,7 @@ Make sure the embedding model is available at:
 model/all-mpnet-base-v2/
 ```
 
+
 ### 2️⃣ Quick Start Example
 
 ```bash
@@ -97,6 +94,7 @@ python run.py \
     --dataset_name ${DATASET_NAME} \
     --llm_model ${LLM_MODEL} \
     --max_workers ${MAX_WORKERS}
+    --use_vectorized_retrieval # optional, use vectorized matrix-based retrieval for GPU acceleration if Strong GPU is available, otherwise use BFS iteration.
 ```
 
 ## 🎯 **Performance**
@@ -105,26 +103,17 @@ python run.py \
 [image: framework]
 **Main results of end-to-end performance**
-
 [image: framework]
-
-
-
-![framework](figure/efficiency_result.png)
-
-![framework](figure/efficiency_result.png)
-
 **Efficiency and performance comparison.**
-
+
 ## 📖 Citation
 If you find this work helpful, please consider citing us:
-
 ```bibtex
 @article{zhuang2025linearrag,
   title={LinearRAG: Linear Graph Retrieval Augmented Generation on Large-scale Corpora},
@@ -133,9 +122,5 @@ If you find this work helpful, please consider citing us:
   year={2025}
 }
 ```
-
-This project is licensed under the GNU General Public License v3.0 ([License](LICENSE.TXT)).
-
 ## 📬 Contact
-
 ✉️ Email: zhuangluyao523@gmail.com
--
cgit v1.2.3
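
The `--use_vectorized_retrieval` flag added in this patch switches between matrix-based retrieval and BFS iteration. The sketch below is not LinearRAG's code; it only illustrates, on a hypothetical toy graph, why the two can be interchangeable: one hop of BFS-style score propagation is the same computation as one sparse matrix-vector product. All names here (`EDGES`, `propagate_bfs`, `propagate_matrix`, `build_coo`) are assumptions made for illustration, and the sparse product is written in plain Python so the example stays self-contained.

```python
from collections import defaultdict

# Hypothetical toy graph: EDGES[u] = {v: weight}. Every node must appear as a key.
EDGES = {
    "a": {"b": 0.5, "c": 0.5},
    "b": {"d": 1.0},
    "c": {"d": 1.0},
    "d": {},
}


def propagate_bfs(edges, seeds, hops):
    """Spread seed scores hop by hop, expanding one frontier node at a time."""
    total = defaultdict(float, seeds)
    frontier = dict(seeds)
    for _ in range(hops):
        nxt = defaultdict(float)
        for node, score in frontier.items():
            for nbr, weight in edges[node].items():
                nxt[nbr] += score * weight
        for node, score in nxt.items():
            total[node] += score
        frontier = nxt
    return dict(total)


def build_coo(edges):
    """Store the graph as sparse (row, col, weight) triples plus a node index."""
    index = {node: i for i, node in enumerate(sorted(edges))}
    triples = [
        (index[u], index[v], w) for u, nbrs in edges.items() for v, w in nbrs.items()
    ]
    return index, triples


def propagate_matrix(edges, seeds, hops):
    """Same propagation, expressed as repeated sparse products y = A^T x."""
    index, triples = build_coo(edges)
    x = [0.0] * len(index)
    for node, score in seeds.items():
        x[index[node]] = score
    total = list(x)
    for _ in range(hops):
        y = [0.0] * len(x)
        for row, col, weight in triples:  # only non-zero entries are touched
            y[col] += weight * x[row]
        total = [t + s for t, s in zip(total, y)]
        x = y
    return {node: total[i] for node, i in index.items()}
```

Calling `propagate_bfs(EDGES, {"a": 1.0}, 2)` and `propagate_matrix(EDGES, {"a": 1.0}, 2)` yields the same scores. On real data the inner loop over triples would typically be handed to a sparse linear algebra library, which is where GPU acceleration can pay off. Separately, as the diff shows it, the `--max_workers ${MAX_WORKERS}` line has no trailing backslash, so a shell would not treat the following `--use_vectorized_retrieval` line as part of the same command unless ` \` is appended.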