|
Complete rewrite with current data (60 profiles × 60 sessions):
- Updated: all numbers; removed stale references
- Removed: duplicate paragraph
- Added: user vector role analysis (timeout rate: RAG 44.3% → RAG+Vec 26.4%)
- Added: E/T decomposition (79% from enforcements, not negative)
- Added: why Vanilla performs well discussion
- Updated: user-vector geometry (ρ=0.040, dual-vector separation)
- Updated: limitations (keyword reward, no GRPO, 60 profiles)
- Updated: future directions (ablation underway, LLM judge ready)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|