1 file changed, 36 insertions, 0 deletions
diff --git a/CLAUDE.md b/CLAUDE.md
index b7b4ccd..34394d1 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -349,3 +349,39 @@ pkill -f "vllm.entrypoints"
 
 For questions about this codebase, refer to the experiment plan at:
 `/u/yurenh2/.claude/plans/effervescent-mapping-ocean.md`
+
+---
+
+## Future Improvements (To Try)
+
+### Retrieval Quality Improvements
+
+**Problem Identified**: Retrieval currently uses the raw user message as the query (e.g., "shortest palindrome"), which matches poorly against preference descriptions (e.g., "break down code with explanations"). The reranker scores task content rather than preference applicability.
+
+**Proposed Solutions**:
+
+#### 1. Query Transformation
+Instead of using the raw user message as the retrieval query, construct preference-oriented queries:
+- Option A: Prompt an LLM with "what user preferences might apply to this task?" and use its output as the query
+- Option B: Append task-type keywords to the query (e.g., "code explanation preferences for: shortest palindrome")
+- Option C: Multi-query retrieval, with one query for task content and one for task type/category
+
+#### 2. Global vs Conditional Preferences
+Separate preferences into two tiers:
+- **Global preferences**: High-frequency, always-applicable (e.g., "always use numbered steps", "use Python for code")
+  - Always include in context, no retrieval needed
+  - Identify via frequency analysis or an explicit "When general" condition
+- **Conditional preferences**: Context-specific (e.g., "when debugging, focus on specific issue")
+  - Only these need retrieval based on task context
+  - Reduces the retrieval burden and ensures universal preferences are never missed
+
+**Implementation Notes**:
+- Can be tested as an ablation after the current experiments complete
+- Evaluate by: reduction in enforcement rate, and retrieval recall over the preferences that were actually enforced
+
+---
+
+## RAG Improvement Ideas
+
+See [docs/rag_improvement_ideas.md](docs/rag_improvement_ideas.md) for detailed brainstorming on how to improve RAG retrieval quality and reduce the timeout rate.
+