| author | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-07 22:48:18 -0500 |
|---|---|---|
| committer | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-07 22:48:18 -0500 |
| commit | c2e145e162444b31ac5c66a90daa6bc0a1cda591 (patch) | |
| tree | ad38ef6b7295ecdb3de67e4474d89548e9ca80ba /results/ep_synthetic/ep_a0.0_L4_s5000.json | |
| parent | 3a520b203f4f0c75b37b2d5c34d461718729ea02 (diff) | |
Add random-init sanity check: protocol does not flag untrained networks
A randomly initialized ResMLP (3 seeds) gives chance accuracy (~10%), but the
protocol verdict is 'trustworthy' on all 3 seeds:
- residual norms ~8.7 across all layers (no growth, bounded)
- BP gradient norms ~8e-3 (healthy, well above 1e-7 floor)
- cross-batch stability 0.08-0.18 (in the BP/EP range)
This is the answer to the likely reviewer question: 'is your protocol just
flagging anything that doesn't perform well?' Answer: no. Random init is
at chance and the protocol passes it. The walked-back trained methods are
walked back because of the *measurements*, not because of the accuracy.
Notable: random-init gradient norms (8e-3) are actually HIGHER than BP-trained
ones (4e-4); BP training reduces the gradient magnitude as loss decreases.
So the protocol distinguishes 3 distinct regimes: (1) untrained healthy,
(2) trained-and-still-healthy (BP/EP), (3) trained-into-pathology (DFA/SB/CB).
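
A minimal sketch of how a verdict of this kind could be computed from the three
measurements above, without ever looking at accuracy. This is not the
repository's implementation: the function name `protocol_verdict`, the
growth-ratio test, and the constants `MAX_GROWTH` and `STABILITY_RANGE` are
hypothetical. Only the 1e-7 gradient floor comes from this message.

```python
from typing import Sequence

GRAD_FLOOR = 1e-7               # gradient-norm floor quoted in the commit message
MAX_GROWTH = 10.0               # hypothetical cap on residual-norm growth across layers
STABILITY_RANGE = (0.05, 0.25)  # hypothetical stand-in for "the BP/EP range"


def protocol_verdict(
    residual_norms: Sequence[float],
    grad_norms: Sequence[float],
    stability: float,
) -> str:
    """Return 'trustworthy' or 'flagged' from measurements only (no accuracy input)."""
    # Residual norms should stay bounded: compare the largest to the first layer's.
    bounded = max(residual_norms) / residual_norms[0] <= MAX_GROWTH
    # Every layer's gradient norm should sit above the vanishing-gradient floor.
    grads_alive = min(grad_norms) > GRAD_FLOOR
    # Cross-batch stability should fall inside the assumed reference range.
    lo, hi = STABILITY_RANGE
    stable = lo <= stability <= hi
    return "trustworthy" if (bounded and grads_alive and stable) else "flagged"


# Values shaped like the random-init numbers reported above (layer count is illustrative).
random_init = protocol_verdict([8.7, 8.7, 8.7, 8.7], [8e-3] * 4, 0.12)
# A made-up pathological case: residual norms blow up and gradients vanish.
pathological = protocol_verdict([8.7, 90.0, 900.0, 9000.0], [1e-9] * 4, 0.12)
```

Because accuracy is not an argument, a chance-level network with healthy
measurements passes under this sketch, which is the property the sanity check
is meant to demonstrate.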
Diffstat (limited to 'results/ep_synthetic/ep_a0.0_L4_s5000.json')
0 files changed, 0 insertions, 0 deletions
