path: root/experiments/online_schedule_timing.py
author: YurenHao0426 <Blackhao0426@gmail.com> 2026-04-07 23:00:54 -0500
committer: YurenHao0426 <Blackhao0426@gmail.com> 2026-04-07 23:00:54 -0500
commit: 31ddecc9eb646b15c4ac5960c7de9346c8f7be68
tree: eb3d7784aa24dbcd0aca348c0239df609ba3fbf5 /experiments/online_schedule_timing.py
parent: ede7cca3e4f9048e3fc6d99077f8842e9b598ff4
Protocol diagnostic (a): use max per-block growth, not max/min ratio

Old metric: max(||h||) / max(||h_0||, eps). False-positives on ViT-style
architectures because the cls token at layer 0 (right after patch_embed)
has anomalously small magnitude (~0.3-1.5), inflating the ratio even on
healthy BP-trained ViTs.

New metric: max_l(||h_{l+1}|| / ||h_l||), the largest single-block
residual amplification. Architecture-invariant.

Calibration:
- BP-trained, late training: <5x per block
- BP ViT, early epochs (cls token resolving): 13-25x max
- DFA-trained ResMLP/ViT: 100-4000x per block

Threshold raised from 10 to 50 to sit cleanly between
healthy-early-training (max 25) and failure-regime (min 100).

Re-verifications:
- smoke test (BP/DFA/EP): all 3 verdicts unchanged
- random init (3 seeds): trustworthy on all 3
- 5-method audit table single-seed: identical verdicts
- decision-utility ablation: identical (still 0/5 by S1, 3/5 by S_full)
- temporal evolution 3-seed: (b) now fires first at ep 3-4, (a) at
  ep 8-11. Both well before training ends. The 'protocol fires ~92
  epochs early' story still holds.
- ViT temporal evolution: BP no longer false-fires; DFA fires (a) ep 1,
  (b) ep 3. Protocol works on the second architecture.
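
To make the difference between the two metrics concrete, here is a minimal sketch that computes both from a list of per-layer hidden-state norms. It is not the code from experiments/online_schedule_timing.py. The function names, the eps guard in the new metric's denominator, the strict ">" comparison, and the example norm values are all assumptions chosen for illustration. Only the two formulas and the thresholds 10 and 50 come from the commit message.

```python
from typing import Sequence

EPS = 1e-8        # assumed guard against division by zero
THRESHOLD = 50.0  # new threshold from the commit message (old one was 10)


def old_ratio(norms: Sequence[float], eps: float = EPS) -> float:
    """Old metric: max(||h||) / max(||h_0||, eps)."""
    return max(norms) / max(norms[0], eps)


def max_block_growth(norms: Sequence[float], eps: float = EPS) -> float:
    """New metric: max over l of ||h_{l+1}|| / ||h_l||."""
    if len(norms) < 2:
        raise ValueError("need norms for at least two layers")
    # Pair each layer norm with the next one and take the largest ratio.
    return max(nxt / max(cur, eps) for cur, nxt in zip(norms, norms[1:]))


def diagnostic_a_fires(norms: Sequence[float],
                       threshold: float = THRESHOLD) -> bool:
    """Assumed decision rule: fire when the growth exceeds the threshold."""
    return max_block_growth(norms) > threshold


# Hypothetical norms: a small layer-0 value followed by stable layers.
small_layer0 = [0.5, 6.0, 7.0, 8.0, 9.0]
# Hypothetical norms: large amplification in every block.
exploding = [1.0, 150.0, 30000.0]

print(old_ratio(small_layer0))         # exceeds the old threshold of 10
print(max_block_growth(small_layer0))  # stays below the new threshold of 50
print(diagnostic_a_fires(small_layer0), diagnostic_a_fires(exploding))
```

On the first hypothetical input the old ratio crosses the old threshold of 10 purely because layer 0 is small, while the per-block growth stays under 50. The second input crosses the new threshold.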
Diffstat (limited to 'experiments/online_schedule_timing.py')
0 files changed, 0 insertions, 0 deletions