path: root/.gitignore
author: haoyuren <13851610112@163.com> 2026-02-22 11:28:45 -0600
committer: haoyuren <13851610112@163.com> 2026-02-22 11:28:45 -0600
commit: 3887054e02e622ca2cb7878bc0dec63d28c7f223
tree: 1a341f7562abb41cfc25badde73879a4e914b1ee /.gitignore
parent: 1cb5eb34ead9b4efc1032ec74c6ccc439f007c18
Fix SWAP inheritance, stalemate logic, add greedy warmup
- SWAP now inherits previous card's suit/rank for matching
- Observation encodes effective top card when SWAP is on top
- Fix stalemate: only hard passes (can't draw) count, draw+pass resets
- Add behavioral cloning warmup: pre-train on greedy policy before PPO
- 2p win rate vs greedy random: 60.5%

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
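
The SWAP and stalemate rules described above can be illustrated with a minimal sketch. This is not the code from this commit: `Card`, `effective_top`, `Turn`, and `StalemateCounter` are hypothetical names, and the sketch assumes that stacked SWAP cards resolve to the nearest non-SWAP card beneath them, that a normal play also resets the counter, and that the stalemate threshold is a configurable `limit`.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional, Sequence


@dataclass(frozen=True)
class Card:
    # Hypothetical card model: a SWAP card carries no suit/rank of its own.
    suit: Optional[str]
    rank: Optional[str]
    is_swap: bool = False


def effective_top(discard: Sequence[Card]) -> Optional[Card]:
    """Return the card whose suit/rank the next play must match.

    A SWAP on top inherits from the card beneath it. Walking past
    several stacked SWAPs is an assumption of this sketch.
    """
    for card in reversed(discard):
        if not card.is_swap:
            return card
    return None  # empty pile, or only SWAP cards


class Turn(Enum):
    PLAY = auto()       # a card was played
    DRAW_PASS = auto()  # player drew a card, then passed
    HARD_PASS = auto()  # player could not draw and had to pass


class StalemateCounter:
    """Counts consecutive hard passes; any other turn resets the count."""

    def __init__(self, limit: int) -> None:
        self.limit = limit  # assumed threshold, e.g. the number of players
        self.hard_passes = 0

    def record(self, turn: Turn) -> bool:
        """Record a turn and return True once the stalemate limit is hit."""
        if turn is Turn.HARD_PASS:
            self.hard_passes += 1
        else:
            # draw+pass resets (per the commit message); resetting on a
            # normal play is an assumption.
            self.hard_passes = 0
        return self.hard_passes >= self.limit
```

With `limit=2`, the sequence hard pass, draw+pass, hard pass, hard pass reports a stalemate only on the final turn, because the draw+pass in the middle resets the count.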
Diffstat (limited to '.gitignore')
0 files changed, 0 insertions, 0 deletions