| Age | Commit message | Author |
|
- fig_method_comparison: normalized improvement vs reflection + learning curve
- fig_vector_analysis: vector growth + cumulative head-to-head advantage
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
- learning_and_vectors.png: learning curve, vector growth, cumulative advantage, efficiency
- method_comparison_bars.png: success/effort/timeout bar charts
- vector_similarity_60s.png: PCA, pref-vector correlation (r=0.046, p=0.054), heatmap
- vector_similarity_30s.png: same for 30 sessions
- vector_analysis.png: norm distribution + session range bars
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
- Efficiency: +8.4% success/token vs reflection
- Late-session performance: 54.1% vs 51.8%
- Head-to-head, quick resolution, zero-enforcement, profile improvement stats
- Comprehensive report story summary
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
- E/T difference: 79% from slightly more enforcements, 20% from fewer turns
- Neither component individually significant
- rag_vector achieves results in fewer turns with lower user effort
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
- Detect agent repetition bugs (7.1% rag_vector, 3.8% reflection)
- After cleanup: timeout rate significantly lower (p=0.046)
- User effort significantly lower (p=0.021)
- Paired t-test and Wilcoxon results with effect sizes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
- RAG rewrite adapter and vector preference pipeline in personalized_llm
- 60-session experiment queue scripts (reflection, rag, rag_vector, rag_rewrite)
- Vector-preference correlation analysis and visualization scripts
- Local reward model batch processing improvements
- Updated CLAUDE.md with full experiment documentation and notes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
|
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
|
Add Python wrappers for:
- Qwen3/Nemotron embedding models
- BGE/Qwen3 rerankers
- vLLM/Llama/Qwen LLM backends
- GPT-4o/LLM-based preference extractors
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|
|
- Add collaborativeagents subproject with adapters, agents, and evaluation modules
- Update .gitignore to exclude large binary files (.whl, .tar), wandb logs, and results
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
|