path: root/experiments/bp_support_sparsity.py
author: YurenHao0426 <Blackhao0426@gmail.com> 2026-04-07 23:03:05 -0500
committer: YurenHao0426 <Blackhao0426@gmail.com> 2026-04-07 23:03:05 -0500
commit: 4172195ca318387e20e3576ab40187d4d2f08ebe
tree: e6cb78aba41069e9bf9ec4cd29f5d3a47d2586af /experiments/bp_support_sparsity.py
parent: 31ddecc9eb646b15c4ac5960c7de9346c8f7be68
Cross-architecture temporal validation: 3 archs x 3 seeds x 2 methods

ResMLP (4-block d=256, with out_ln, CIFAR-10):
  s42:  DFA (a) ep 8,  (b) ep 4, acc 0.308
  s123: DFA (a) ep 11, (b) ep 4, acc 0.320
  s456: DFA (a) ep 8,  (b) ep 3, acc 0.300

ViT-Mini (4-block d=128, cls token + terminal LN, CIFAR-10):
  s42:  DFA (a) ep 1, (b) ep 3, acc 0.256
  s123: DFA (a) ep 1, (b) ep 2, acc 0.202
  s456: DFA (a) ep 1, (b) ep 3, acc 0.253

StudentNet (4-block d=128, NO terminal LN, synthetic alpha=1.0):
  s42:  DFA (a) ep 18, (b) NEVER, acc 0.332
  s123: DFA (a) ep 14, (b) NEVER, acc 0.314
  s456: DFA (a) ep 25, (b) NEVER, acc 0.336

BP: never fires on any seed x any architecture (9/9 sanity passes).

Key cross-architecture finding: diagnostic (b) is specifically the
LN-driven failure mode. Without out_ln, the BP grad never crosses the
1e-7 floor, even though (a) still fires (the residual stream still
grows, just without the LN-cancellation pathology that drives the BP
grad to the floor).

This is the causal architectural control: (b) specifically tests 'is
terminal-LN gradient cancellation active?' and (a) tests 'is the
residual stream growing without bound?'. They are linked but separable.

This is the §3 cross-architecture validation evidence.
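
For readers without the experiment script at hand, the two diagnostics can be sketched roughly as below. This is an illustration under stated assumptions, not the code in this repository. Only the 1e-7 gradient floor comes from the message above. The function names, the `growth_factor` threshold used for (a), the 1-indexed epoch convention, and the per-epoch input lists are all hypothetical.

```python
from typing import Iterable, Optional, Sequence

GRAD_FLOOR = 1e-7  # floor named in the commit message


def first_epoch(flags: Iterable[bool]) -> Optional[int]:
    """Return the first 1-indexed epoch whose flag is True, else None (NEVER)."""
    for epoch, fired in enumerate(flags, start=1):
        if fired:
            return epoch
    return None


def diagnostic_a(resid_norms: Sequence[float],
                 growth_factor: float = 10.0) -> Optional[int]:
    """(a) 'is the residual stream growing without bound?'

    Assumption: fires once the per-epoch residual norm exceeds
    growth_factor times its first-epoch value. The real criterion
    is not given in the commit message.
    """
    if not resid_norms:
        return None
    base = resid_norms[0]
    return first_epoch(n > growth_factor * base for n in resid_norms)


def diagnostic_b(bp_grad_norms: Sequence[float],
                 floor: float = GRAD_FLOOR) -> Optional[int]:
    """(b) 'is terminal-LN gradient cancellation active?'

    Fires at the first epoch where the BP gradient norm drops
    below the floor.
    """
    return first_epoch(g < floor for g in bp_grad_norms)
```

With made-up inputs, `diagnostic_a([1.0, 2.0, 12.0, 50.0])` returns 3 and `diagnostic_b([1e-3, 1e-4])` returns `None`, which corresponds to the NEVER entries reported for StudentNet.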
Diffstat (limited to 'experiments/bp_support_sparsity.py')
0 files changed, 0 insertions, 0 deletions