| field | value | date |
|---|---|---|
| author | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-08 00:47:38 -0500 |
| committer | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-08 00:47:38 -0500 |
| commit | df9f69bc9172b3473be144ff8a17370bc7a68e64 | |
| tree | d669bf8c1aa90ac7a3357805f88f870045c260eb /results | |
| parent | 5b7f83ae5240c78013c084cf2e24ce5a5f572c42 | |
MAJOR: penalized DFA deep-layer cosine is +0.17, NOT zero
Direct deep-block credit measurement on penalized DFA s42 checkpoint
(lam=1e-2, 30 epochs, just trained):
per-layer cos(e_T B^T, BP grad) — TRAINING Bs, no eps clamp:
  l0: +0.316 (±0.188)  ||g||=9.18e-7  ||a||=4.53
  l1: +0.169 (±0.087)  ||g||=8.87e-7  ||a||=4.57
  l2: +0.151 (±0.084)  ||g||=8.77e-7  ||a||=4.50
  l3: +0.165 (±0.099)  ||g||=8.73e-7  ||a||=4.64
  l4: +0.166 (±0.098)  ||g||=8.69e-7  ||a||=4.64
layer-mean: +0.193
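
A minimal sketch of how a per-layer cosine of this kind could be computed, assuming the DFA credit signal for a layer is the output error `e` projected through that layer's fixed feedback matrix `B`, and that the ± figure is the standard deviation across samples. The shapes, the random stand-in tensors, and the helper name `rowwise_cosine` are illustrative assumptions, not the measurement script used for this commit.

```python
import numpy as np

def rowwise_cosine(a, g):
    """Cosine between matching rows of a and g.

    No epsilon clamp is applied to the norms, so a row with a
    vanishing norm yields an unstable or undefined value.
    """
    dot = np.sum(a * g, axis=1)
    return dot / (np.linalg.norm(a, axis=1) * np.linalg.norm(g, axis=1))

# Hypothetical sizes and random stand-ins for the real tensors.
rng = np.random.default_rng(0)
batch, n_out, width = 64, 10, 128
e = rng.standard_normal((batch, n_out))   # output error, one row per sample
B = rng.standard_normal((width, n_out))   # fixed feedback matrix of one layer
g = rng.standard_normal((batch, width))   # stand-in for the BP gradient at that layer

a = e @ B.T                               # DFA credit signal, shape (batch, width)
cos = rowwise_cosine(a, g)

print(f"cos mean={cos.mean():+.3f} (±{cos.std():.3f})")
print(f"||g||={np.linalg.norm(g, axis=1).mean():.3g}")
print(f"||a||={np.linalg.norm(a, axis=1).mean():.3g}")
```

With random stand-ins the mean cosine sits near zero; the point of the sketch is only the shape of the computation and the absence of a clamp in the denominator.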
Compare to vanilla DFA (existing measurement, scale-broken regime):
l0: +0.42 l1-4: ~0 (essentially zero)
CRITICAL INTERPRETATION: The penalty doesn't just fix scale, it ALSO
restores deep-layer direction quality from ~0 to ~0.17. This contradicts
the prior 'two failure modes' framing where I assumed direction would
remain broken even after scale fix. The honest story is:
- vanilla DFA: scale catastrophic, BP grad at floor, cosine measurement
  DEGENERATE (cos ~0 is noise dominance, not 'no alignment')
- penalized DFA: scale fixed, BP grad healthy, cosine measurement
  INTERPRETABLE, and the value is +0.17 on deep layers (partially
  aligned, much less than BP's self-cosine of 1.0)
- the +0.17 alignment explains why penalized DFA reaches 0.36 test
  accuracy (about 60% of BP's 0.61): partial credit gives partial
  training, not zero training
The 'second failure mode' claim is wrong. There's ONE unified failure
mode (scale + measurement degeneracy), and the penalty rescues BOTH.
The remaining gap to BP is 'partial credit quality', not a separate
failure mode.
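
The summary figures in this message can be rechecked from the numbers it quotes. The sketch below uses only values that appear in the message and in the committed JSON (the per-layer cosines, `final_test_acc`, and BP's 0.61); the variable names are illustrative.

```python
# Per-layer cosines as listed in the commit message.
cosines = {"l0": 0.316, "l1": 0.169, "l2": 0.151, "l3": 0.165, "l4": 0.166}

# Mean over all five layers, and over the deep layers l1..l4 only.
layer_mean = sum(cosines.values()) / len(cosines)
deep = [v for k, v in cosines.items() if k != "l0"]
deep_mean = sum(deep) / len(deep)

# Penalized DFA test accuracy relative to the BP figure quoted in the message.
dfa_acc, bp_acc = 0.3593, 0.61
acc_ratio = dfa_acc / bp_acc

print(f"layer mean: {layer_mean:+.3f}")
print(f"deep mean:  {deep_mean:+.3f}")
print(f"acc ratio:  {acc_ratio:.1%}")
```

The all-layer mean reproduces the reported +0.193, the deep-layer mean lands close to the +0.17 in the subject line, and the accuracy ratio comes out just under 60%.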
Diffstat (limited to 'results')
| -rw-r--r-- | results/dfa_pen_short/dfa_pen_lam0.01_s42.json | 43 |
1 files changed, 43 insertions, 0 deletions
diff --git a/results/dfa_pen_short/dfa_pen_lam0.01_s42.json b/results/dfa_pen_short/dfa_pen_lam0.01_s42.json
new file mode 100644
index 0000000..1be1de6
--- /dev/null
+++ b/results/dfa_pen_short/dfa_pen_lam0.01_s42.json
@@ -0,0 +1,43 @@
+{
+  "config": {
+    "seed": 42,
+    "epochs": 30,
+    "lr": 0.001,
+    "wd": 0.01,
+    "lam": 0.01,
+    "output_dir": "results/dfa_pen_short"
+  },
+  "final_test_acc": 0.3593,
+  "log": [
+    {
+      "epoch": 0,
+      "h_L_norm": 8.893179893493652,
+      "g_2_norm": 0.0009934091940522194,
+      "acc_eval": 0.115234375
+    },
+    {
+      "epoch": 1,
+      "h_L_norm": 788.7547607421875,
+      "g_2_norm": 1.1413892934797332e-05,
+      "acc_eval": 0.3359375
+    },
+    {
+      "epoch": 10,
+      "h_L_norm": 7251.46533203125,
+      "g_2_norm": 2.3037373466650024e-06,
+      "acc_eval": 0.3662109375
+    },
+    {
+      "epoch": 20,
+      "h_L_norm": 11374.17578125,
+      "g_2_norm": 1.813692506402731e-06,
+      "acc_eval": 0.373046875
+    },
+    {
+      "epoch": 30,
+      "h_L_norm": 12158.3349609375,
+      "g_2_norm": 1.7568806924828095e-06,
+      "acc_eval": 0.375
+    }
+  ]
+}
\ No newline at end of file
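
A small sketch of how the logged trajectory in this file could be summarised. The log entries are copied inline from the diff so the example is self-contained; the helper name `summarize` and the choice of summary statistics are assumptions made for illustration, and no interpretation of `h_L_norm` or `g_2_norm` beyond their logged values is implied.

```python
import json

# Log entries copied from results/dfa_pen_short/dfa_pen_lam0.01_s42.json.
RESULT = json.loads("""
{
  "final_test_acc": 0.3593,
  "log": [
    {"epoch": 0,  "h_L_norm": 8.893179893493652, "g_2_norm": 0.0009934091940522194,  "acc_eval": 0.115234375},
    {"epoch": 1,  "h_L_norm": 788.7547607421875, "g_2_norm": 1.1413892934797332e-05, "acc_eval": 0.3359375},
    {"epoch": 10, "h_L_norm": 7251.46533203125,  "g_2_norm": 2.3037373466650024e-06, "acc_eval": 0.3662109375},
    {"epoch": 20, "h_L_norm": 11374.17578125,    "g_2_norm": 1.813692506402731e-06,  "acc_eval": 0.373046875},
    {"epoch": 30, "h_L_norm": 12158.3349609375,  "g_2_norm": 1.7568806924828095e-06, "acc_eval": 0.375}
  ]
}
""")

def summarize(result):
    """Compare the first and last log entries and report the best eval accuracy."""
    first, last = result["log"][0], result["log"][-1]
    return {
        "h_L_growth": last["h_L_norm"] / first["h_L_norm"],
        "g_2_shrink": first["g_2_norm"] / last["g_2_norm"],
        "best_acc_eval": max(row["acc_eval"] for row in result["log"]),
        "final_test_acc": result["final_test_acc"],
    }

summary = summarize(RESULT)
for key, value in summary.items():
    print(f"{key}: {value:.4g}")
```

To run the same summary against the committed file, the inline string could be replaced with `json.load` on the path shown in the diff header.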
