| author | YurenHao0426 <blackhao0426@gmail.com> | 2026-02-10 20:16:36 +0000 |
|---|---|---|
| committer | YurenHao0426 <blackhao0426@gmail.com> | 2026-02-10 20:16:36 +0000 |
| commit | 5626080ca4c4219aec4888d6b9406d0d3349fb55 (patch) | |
| tree | 86287d9fd5833e11ccd78566992540f2664fd195 /CLAUDE.md | |
| parent | a2036838807428424bbbaff507a6563749a83145 (diff) | |
Add RAG rewrite, 60-session experiment scripts, and analysis tools
- RAG rewrite adapter and vector preference pipeline in personalized_llm
- 60-session experiment queue scripts (reflection, rag, rag_vector, rag_rewrite)
- Vector-preference correlation analysis and visualization scripts
- Local reward model batch processing improvements
- Updated CLAUDE.md with full experiment documentation and notes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Diffstat (limited to 'CLAUDE.md')
| -rw-r--r-- | CLAUDE.md | 36 |
1 files changed, 36 insertions, 0 deletions
@@ -349,3 +349,39 @@ pkill -f "vllm.entrypoints"
 For questions about this codebase, refer to the experiment plan at: `/u/yurenh2/.claude/plans/effervescent-mapping-ocean.md`
+
+---
+
+## Future Improvements (To Try)
+
+### Retrieval Quality Improvements
+
+**Problem Identified**: Current retrieval uses raw user message as query (e.g., "shortest palindrome"), but this doesn't match well with preference descriptions (e.g., "break down code with explanations"). Reranker matches task content, not preference applicability.
+
+**Proposed Solutions**:
+
+#### 1. Query Transformation
+Instead of using raw user message as retrieval query, construct preference-oriented queries:
+- Option A: Use LLM to generate "what user preferences might apply to this task?"
+- Option B: Append task-type keywords to query (e.g., "code explanation preferences for: shortest palindrome")
+- Option C: Multi-query retrieval - one for task content, one for task type/category
+
+#### 4. Global vs Conditional Preferences
+Separate preferences into two tiers:
+- **Global preferences**: High-frequency, always-applicable (e.g., "always use numbered steps", "use Python for code")
+  - Always include in context, no retrieval needed
+  - Identify via frequency analysis or explicit "When general" condition
+- **Conditional preferences**: Context-specific (e.g., "when debugging, focus on specific issue")
+  - Only these need retrieval based on task context
+  - Reduces retrieval burden and ensures universal preferences never missed
+
+**Implementation Notes**:
+- Can be tested as ablation after current experiments complete
+- Evaluate by: enforcement rate reduction, retrieval recall of actually-enforced preferences
+
+---
+
+## RAG Improvement Ideas
+
+See [docs/rag_improvement_ideas.md](docs/rag_improvement_ideas.md) for detailed brainstorming on how to improve RAG retrieval quality and reduce timeout rate.
+
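
The ideas in the added section can be illustrated with a small sketch. It is not code from this commit. `Preference`, `build_queries`, `split_preferences`, and `retrieval_recall` are hypothetical names, and the real preference schema in `personalized_llm` may differ. It assumes each preference carries a free-text condition, that a condition of "general" (or "When general") marks a global preference, and that the task type is already known to the caller.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Preference:
    """Hypothetical record; the actual schema may differ."""
    text: str
    condition: str  # assumed free text, e.g. "general" or "debugging"


def build_queries(user_message: str, task_type: str) -> list[str]:
    """Build preference-oriented retrieval queries (Options B and C)."""
    return [
        f"{task_type} preferences for: {user_message}",  # Option B: keyword-augmented
        user_message,                                    # Option C: task content
        f"preferences for {task_type} tasks",            # Option C: task type/category
    ]


def is_global(pref: Preference) -> bool:
    """Assumption: an explicit 'general' condition marks a global preference."""
    return pref.condition.strip().lower() in {"general", "when general"}


def split_preferences(
    prefs: list[Preference],
) -> tuple[list[Preference], list[Preference]]:
    """Separate always-included global preferences from conditional ones."""
    global_prefs = [p for p in prefs if is_global(p)]
    conditional = [p for p in prefs if not is_global(p)]
    return global_prefs, conditional


def retrieval_recall(retrieved: set[str], enforced: set[str]) -> float:
    """Fraction of actually-enforced preferences that retrieval surfaced.

    With nothing enforced, recall is reported as 1.0 (a convention chosen here).
    """
    if not enforced:
        return 1.0
    return len(retrieved & enforced) / len(enforced)


prefs = [
    Preference("always use numbered steps", "general"),
    Preference("when debugging, focus on specific issue", "debugging"),
]
global_prefs, conditional = split_preferences(prefs)
queries = build_queries("shortest palindrome", "code explanation")
```

Under these assumptions only `conditional` would be passed to the retriever, once per entry in `queries`, while `global_prefs` would be added to the context unconditionally. Option A (LLM-generated queries) and frequency-based detection of global preferences are left out because they depend on components not shown in this diff.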
