| Branch | Commit message | Author | Age |
|---|---|---|---|
| main | Add 2-player PPO training log (500k episodes, 60.4% vs greedy) | YurenHao0426 | 101 min. |

| Age | Commit message | Author |
|---|---|---|
| 101 min. | Add 2-player PPO training log (500k episodes, 60.4% vs greedy) (HEAD, main) | YurenHao0426 |
| 4 hours | Raise entropy floor to 0.02, increase eval games to 2000 | haoyuren |
| 5 hours | Change default eval_every from 10000 to 2500 | haoyuren |
| 5 hours | Use auto-calibrated collect_batch in Colab notebook | haoyuren |
| 5 hours | Add training curve plots to Colab notebook | haoyuren |
| 5 hours | Add entropy annealing to escape greedy local minimum after warmup | haoyuren |
| 5 hours | Auto-calibrate collect_batch when not specified | haoyuren |
| 5 hours | Fix total_mem → total_memory in Colab GPU check | haoyuren |
| 5 hours | Fix invalid notebook cell schema (markdown with execution_count) | haoyuren |
| 5 hours | Batched game collection for ~7x training speedup | haoyuren |
| [...] | | |
