|
Trains vanilla DFA (no penalty) for max_epoch epochs and saves checkpoints
+ Bs at specified early epochs (default: 1, 2, 3, 4, 5). Logs per-layer
||h_l|| and ||g_l|| at each epoch so we can see when ||g_L|| crosses the
1e-7 floor.
Codex round 19's #3 critical experiment for disambiguating:
Hypothesis A: deep alignment was always there in vanilla DFA but hidden
by the post-collapse measurement degeneracy
Hypothesis B: deep alignment was created by the penalty intervention
Test: measure deep-layer cos at vanilla checkpoints from ep 1-3 (when
||g_L|| should still be in the meaningful regime).
If cos > 0 at ep 1-2 vanilla -> hypothesis A
If cos ~ 0 at ep 1-2 vanilla -> hypothesis B
|