From 5626080ca4c4219aec4888d6b9406d0d3349fb55 Mon Sep 17 00:00:00 2001
From: YurenHao0426
Date: Tue, 10 Feb 2026 20:16:36 +0000
Subject: Add RAG rewrite, 60-session experiment scripts, and analysis tools

- RAG rewrite adapter and vector preference pipeline in personalized_llm
- 60-session experiment queue scripts (reflection, rag, rag_vector, rag_rewrite)
- Vector-preference correlation analysis and visualization scripts
- Local reward model batch processing improvements
- Updated CLAUDE.md with full experiment documentation and notes

Co-Authored-By: Claude Opus 4.6
---
 CLAUDE.md | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

(limited to 'CLAUDE.md')

diff --git a/CLAUDE.md b/CLAUDE.md
index b7b4ccd..34394d1 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -349,3 +349,39 @@ pkill -f "vllm.entrypoints"
 For questions about this codebase, refer to the experiment plan at:
 `/u/yurenh2/.claude/plans/effervescent-mapping-ocean.md`
+
+---
+
+## Future Improvements (To Try)
+
+### Retrieval Quality Improvements
+
+**Problem Identified**: Current retrieval uses raw user message as query (e.g., "shortest palindrome"), but this doesn't match well with preference descriptions (e.g., "break down code with explanations"). Reranker matches task content, not preference applicability.
+
+**Proposed Solutions**:
+
+#### 1. Query Transformation
+Instead of using raw user message as retrieval query, construct preference-oriented queries:
+- Option A: Use LLM to generate "what user preferences might apply to this task?"
+- Option B: Append task-type keywords to query (e.g., "code explanation preferences for: shortest palindrome")
+- Option C: Multi-query retrieval - one for task content, one for task type/category
+
+#### 4. Global vs Conditional Preferences
+Separate preferences into two tiers:
+- **Global preferences**: High-frequency, always-applicable (e.g., "always use numbered steps", "use Python for code")
+  - Always include in context, no retrieval needed
+  - Identify via frequency analysis or explicit "When general" condition
+- **Conditional preferences**: Context-specific (e.g., "when debugging, focus on specific issue")
+  - Only these need retrieval based on task context
+  - Reduces retrieval burden and ensures universal preferences never missed
+
+**Implementation Notes**:
+- Can be tested as ablation after current experiments complete
+- Evaluate by: enforcement rate reduction, retrieval recall of actually-enforced preferences
+
+---
+
+## RAG Improvement Ideas
+
+See [docs/rag_improvement_ideas.md](docs/rag_improvement_ideas.md) for detailed brainstorming on how to improve RAG retrieval quality and reduce timeout rate.
+
-- 
cgit v1.2.3
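
The two-tier split proposed in the patch (global vs. conditional preferences) could be prototyped roughly as follows. This is a minimal sketch under stated assumptions, not code from the repository: the `Preference` dataclass, its `condition` field, `is_global`, `retrieve`, and `build_context` are all hypothetical names, and the word-overlap scorer is only a stand-in for whatever retriever and reranker the real pipeline uses. It illustrates one property from the notes: preferences marked with an explicit "When general" condition bypass retrieval and are always included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Preference:
    text: str
    condition: str  # hypothetical field, e.g. "When general" or "when debugging"


def is_global(pref: Preference) -> bool:
    """Assume an explicit 'When general' condition marks the global tier."""
    return pref.condition.strip().lower() in {"general", "when general"}


def retrieve(query: str, candidates: list[Preference], top_k: int) -> list[Preference]:
    """Stand-in retriever: rank by word overlap between query and condition."""
    query_words = set(query.lower().split())

    def score(pref: Preference) -> int:
        return len(query_words & set(pref.condition.lower().split()))

    # sorted() is stable, so ties keep their original order.
    ranked = sorted(candidates, key=score, reverse=True)
    # Drop candidates with no overlap at all, then keep the top_k best.
    return [p for p in ranked if score(p) > 0][:top_k]


def build_context(query: str, prefs: list[Preference], top_k: int = 2) -> list[Preference]:
    """Always include global preferences; retrieve only among conditional ones."""
    global_prefs = [p for p in prefs if is_global(p)]
    conditional = [p for p in prefs if not is_global(p)]
    return global_prefs + retrieve(query, conditional, top_k)
```

With a preference store containing one "When general" entry and one "when debugging" entry, a query such as "shortest palindrome" would return only the global entry, while a query mentioning debugging would return both. A frequency-based rule for promoting preferences to the global tier, which the notes also mention, is not shown here.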