| Age | Commit message | Author |
|---|---|---|
| 2026-02-11 | Add query transformation, global preferences, and hyperparameter table | YurenHao0426 |
| 2026-02-11 | Fix z_long definition to match code (zero-init + REINFORCE, not mean) | YurenHao0426 |
| 2026-02-11 | Rewrite reward section to describe keyword heuristic (matches experiments) | YurenHao0426 |
| 2026-02-11 | Add revised abstract LaTeX | YurenHao0426 |
| 2026-02-11 | Add revised introduction LaTeX section | YurenHao0426 |
| 2026-02-11 | Add revised conclusion LaTeX section | YurenHao0426 |
| 2026-02-11 | Add preference format compliance paragraph to discussion | YurenHao0426 |
| 2026-02-11 | Add revised discussion & limitations LaTeX section | YurenHao0426 |
| 2026-02-11 | Add revised results LaTeX section with actual data | YurenHao0426 |
| 2026-02-11 | Add revised experimental setup LaTeX section | YurenHao0426 |
| 2026-02-11 | Add revised reward modeling LaTeX section matching code implementation | YurenHao0426 |
| 2026-02-10 | Add RAG rewrite, 60-session experiment scripts, and analysis tools | YurenHao0426 |
