Add RAG rewrite, 60-session experiment scripts, and analysis tools

- RAG rewrite adapter and vector preference pipeline in personalized_llm
- 60-session experiment queue scripts (reflection, rag, rag_vector, rag_rewrite)
- Vector-preference correlation analysis and visualization scripts
- Local reward model batch processing improvements
- Updated CLAUDE.md with full experiment documentation and notes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
author: YurenHao0426 <blackhao0426@gmail.com> 2026-02-10 20:16:36 +0000
committer: YurenHao0426 <blackhao0426@gmail.com> 2026-02-10 20:16:36 +0000
commit: 5626080ca4c4219aec4888d6b9406d0d3349fb55 (patch)
tree: 86287d9fd5833e11ccd78566992540f2664fd195 /CLAUDE.md
parent: a2036838807428424bbbaff507a6563749a83145 (diff)
1 files changed, 36 insertions, 0 deletions
diff --git a/CLAUDE.md b/CLAUDE.md
index b7b4ccd..34394d1 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -349,3 +349,39 @@ pkill -f "vllm.entrypoints"
 
 For questions about this codebase, refer to the experiment plan at:
 `/u/yurenh2/.claude/plans/effervescent-mapping-ocean.md`
+
+---
+
+## Future Improvements (To Try)
+
+### Retrieval Quality Improvements
+
+**Problem Identified**: Retrieval currently uses the raw user message as the query (e.g., "shortest palindrome"), which matches poorly against preference descriptions (e.g., "break down code with explanations"). The reranker scores similarity to the task content, not whether a preference applies to the task.
+
+**Proposed Solutions**:
+
+#### 1. Query Transformation
+Instead of using the raw user message as the retrieval query, construct preference-oriented queries:
+- Option A: Use an LLM to generate the query by asking "what user preferences might apply to this task?"
+- Option B: Append task-type keywords to the query (e.g., "code explanation preferences for: shortest palindrome")
+- Option C: Multi-query retrieval, with one query for task content and one for task type/category
+
+#### 4. Global vs Conditional Preferences  
+Separate preferences into two tiers:
+- **Global preferences**: High-frequency, always-applicable (e.g., "always use numbered steps", "use Python for code")
+  - Always include in context, no retrieval needed
+  - Identify via frequency analysis or explicit "When general" condition
+- **Conditional preferences**: Context-specific (e.g., "when debugging, focus on specific issue")
+  - Only these need retrieval based on task context
+  - Reduces the retrieval burden and ensures universal preferences are never missed
+
+**Implementation Notes**:
+- Can be tested as ablation after current experiments complete
+- Evaluate by: reduction in enforcement rate, and retrieval recall over the preferences that were actually enforced
+
+---
+
+## RAG Improvement Ideas
+
+See [docs/rag_improvement_ideas.md](docs/rag_improvement_ideas.md) for detailed brainstorming on how to improve RAG retrieval quality and reduce timeout rate.
+
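The query-transformation idea in the diff above (Options B and C) could be prototyped along the following lines. This is only a sketch under stated assumptions: `TASK_TYPE_KEYWORDS`, `infer_task_type`, `build_queries`, and `multi_query_retrieve` are hypothetical names that do not come from the repository, the keyword map is a stand-in for whatever task taxonomy the experiments use, and `retrieve` is any callable that returns `(preference, score)` pairs.

```python
from typing import Callable, Iterable

# Hypothetical keyword -> task-type map (an assumption for illustration;
# CLAUDE.md does not define a task taxonomy).
TASK_TYPE_KEYWORDS = {
    "code": ("palindrome", "function", "bug", "implement", "algorithm"),
    "math": ("integral", "prove", "equation", "derivative"),
}


def infer_task_type(message: str) -> str:
    """Guess a coarse task type from keywords; fall back to 'general'."""
    text = message.lower()
    for task_type, keywords in TASK_TYPE_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return task_type
    return "general"


def build_queries(message: str) -> list[str]:
    """Return one task-content query and one preference-oriented query."""
    task_type = infer_task_type(message)
    return [
        message,  # Option C: query on task content
        f"{task_type} explanation preferences for: {message}",  # Option B
    ]


def multi_query_retrieve(
    queries: Iterable[str],
    retrieve: Callable[[str], list[tuple[str, float]]],
    top_k: int = 5,
) -> list[str]:
    """Run every query and keep each preference's best score across queries."""
    best: dict[str, float] = {}
    for query in queries:
        for preference, score in retrieve(query):
            if score > best.get(preference, float("-inf")):
                best[preference] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [preference for preference, _ in ranked[:top_k]]
```

Taking the maximum score per preference is one simple way to merge the two result lists; other fusion rules (for example, summing ranks) would fit the same interface.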
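The two-tier split proposed in the diff (global vs conditional preferences) might look like the sketch below. It is illustrative only: the `Preference` fields, the `min_rate` threshold of 0.5, and the function names are assumptions, not the pipeline's actual data model. The only detail taken from the notes is that an explicit "When general" condition or a high frequency marks a preference as global.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preference:
    text: str
    condition: str           # e.g. "When general" or "when debugging"
    enforced_count: int = 0  # hypothetical: sessions in which it was enforced


def is_global(pref: Preference, total_sessions: int, min_rate: float = 0.5) -> bool:
    """Global if explicitly marked 'When general' or enforced often enough."""
    if pref.condition.strip().lower() == "when general":
        return True
    if total_sessions <= 0:
        return False
    return pref.enforced_count / total_sessions >= min_rate


def split_tiers(
    prefs: list[Preference], total_sessions: int, min_rate: float = 0.5
) -> tuple[list[Preference], list[Preference]]:
    """Partition preferences into (global, conditional), preserving order."""
    global_prefs: list[Preference] = []
    conditional_prefs: list[Preference] = []
    for pref in prefs:
        if is_global(pref, total_sessions, min_rate):
            global_prefs.append(pref)
        else:
            conditional_prefs.append(pref)
    return global_prefs, conditional_prefs


def build_context(
    global_prefs: list[Preference], retrieved: list[Preference]
) -> list[str]:
    """Always include global preferences, then retrieved conditional ones."""
    seen: set[str] = set()
    context: list[str] = []
    for pref in [*global_prefs, *retrieved]:
        if pref.text not in seen:
            seen.add(pref.text)
            context.append(pref.text)
    return context
```

With this split, only the conditional list would be indexed for retrieval, while the global list is prepended to every prompt.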