path: root/src/training
Age       Commit message                                                      Author
14 hours  Fix NLL double-shift bug and head weight init                       YurenHao0426
15 hours  Fix init state: add logit_bias so A≈1 at init (dense connectivity)  YurenHao0426
15 hours  Initial implementation: DAGFormer Phase 1                           YurenHao0426