Age  Commit message  Author
2026-02-22  Add 2-player PPO training log (500k episodes, 60.4% vs greedy) [HEAD, main]  YurenHao0426
2026-02-22  Raise entropy floor to 0.02, increase eval games to 2000  haoyuren
2026-02-22  Change default eval_every from 10000 to 2500  haoyuren
2026-02-22  Use auto-calibrated collect_batch in Colab notebook  haoyuren
2026-02-22  Add training curve plots to Colab notebook  haoyuren
2026-02-22  Add entropy annealing to escape greedy local minimum after warmup  haoyuren
2026-02-22  Auto-calibrate collect_batch when not specified  haoyuren
2026-02-22  Fix total_mem → total_memory in Colab GPU check  haoyuren
2026-02-22  Fix invalid notebook cell schema (markdown with execution_count)  haoyuren
2026-02-22  Batched game collection for ~7x training speedup  haoyuren
2026-02-22  Update README and Colab notebook for current rules and features  haoyuren
2026-02-22  Separate CPU collect / GPU train, add training CSV log  haoyuren
2026-02-22  Fix SWAP inheritance, stalemate logic, add greedy warmup  haoyuren
2026-02-22  Improve versus UI: suit colors, AI highlighting, draw tell  haoyuren
2026-02-22  Update rules: free draw/pass, remove Q in 2-player games  haoyuren
2026-02-22  Add tqdm progress bar, fix Colab username  haoyuren
2026-02-22  Add Colab GPU training notebook  haoyuren
2026-02-22  Initial commit: Blazing Eights RL agent  haoyuren