| Age | Commit message | Author | |
|---|---|---|---|
| 2026-02-11 | Add revised experimental setup LaTeX section | YurenHao0426 | |
| | Key corrections: 3 datasets (math-hard, math-500, bigcodebench), not math-hard only; 60 profiles × 60 sessions, not 200 profiles × 60 turns; user simulator: Llama-3.3-70B-Instruct (not 3.1); GPU layout: agent on GPU 2, embed/reranker on GPU 3; added reward model description; fixed incomplete sentence. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> | | |
