| author | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-07 23:03:05 -0500 |
|---|---|---|
| committer | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-07 23:03:05 -0500 |
| commit | 4172195ca318387e20e3576ab40187d4d2f08ebe (patch) | |
| tree | e6cb78aba41069e9bf9ec4cd29f5d3a47d2586af /protocol | |
| parent | 31ddecc9eb646b15c4ac5960c7de9346c8f7be68 (diff) | |
Cross-architecture temporal validation: 3 archs x 3 seeds x 2 methods
ResMLP (4-block d=256, with out_ln, CIFAR-10):
s42: DFA (a) ep 8, (b) ep 4, acc 0.308
s123: DFA (a) ep 11, (b) ep 4, acc 0.320
s456: DFA (a) ep 8, (b) ep 3, acc 0.300
ViT-Mini (4-block d=128, cls token + terminal LN, CIFAR-10):
s42: DFA (a) ep 1, (b) ep 3, acc 0.256
s123: DFA (a) ep 1, (b) ep 2, acc 0.202
s456: DFA (a) ep 1, (b) ep 3, acc 0.253
StudentNet (4-block d=128, NO terminal LN, synthetic alpha=1.0):
s42: DFA (a) ep 18, (b) NEVER, acc 0.332
s123: DFA (a) ep 14, (b) NEVER, acc 0.314
s456: DFA (a) ep 25, (b) NEVER, acc 0.336
BP: never fires on any seed x any architecture (9/9 sanity passes).
Key cross-architecture finding: diagnostic (b) is specifically the LN-
driven failure mode. Without out_ln, the BP grad never crosses the 1e-7
floor, even though (a) still fires (the residual stream still grows, just
without the LN-cancellation pathology that drives the BP grad to the
floor). This is the causal architectural control: (b) specifically tests
'is terminal-LN gradient cancellation active?' and (a) tests 'is the
residual stream growing without bound?'. They are linked but separable.
This is the §3 cross-architecture validation evidence.
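The two diagnostics described above can be sketched as "first epoch at which a condition holds" checks. This is a minimal illustration, not the repository's implementation: the 1e-7 floor for (b) comes from the commit message, but the growth criterion used for (a), the `HIDDEN_GROWTH_RATIO` constant, the helper name `first_fire_epoch`, and the 1-based epoch numbering are all assumptions made for the example.

```python
from typing import Callable, Optional, Sequence

BP_GRAD_FLOOR = 1e-7        # floor quoted in the commit message for diagnostic (b)
HIDDEN_GROWTH_RATIO = 10.0  # hypothetical threshold for diagnostic (a); not from the source


def first_fire_epoch(
    values: Sequence[float], fires: Callable[[float], bool]
) -> Optional[int]:
    """Return the first 1-based epoch whose value satisfies `fires`, or None.

    None plays the role of "NEVER" in the results listed above.
    """
    for epoch, value in enumerate(values, start=1):
        if fires(value):
            return epoch
    return None


def diagnostic_a(hidden_norms: Sequence[float]) -> Optional[int]:
    """(a) residual-stream growth: norm exceeds an assumed multiple of its first value."""
    if not hidden_norms:
        return None
    limit = HIDDEN_GROWTH_RATIO * hidden_norms[0]
    return first_fire_epoch(hidden_norms, lambda h: h >= limit)


def diagnostic_b(bp_grad_norms: Sequence[float]) -> Optional[int]:
    """(b) terminal-LN cancellation: BP gradient norm falls below the floor."""
    return first_fire_epoch(bp_grad_norms, lambda g: g < BP_GRAD_FLOOR)
```

Under these assumptions, a run without terminal LN would show `diagnostic_a` returning an epoch while `diagnostic_b` returns `None`, which is the separation the commit message reports for StudentNet.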
Diffstat (limited to 'protocol')
| -rw-r--r-- | protocol/examples/temporal_diagnostic_evolution.py | 10 |
1 files changed, 8 insertions, 2 deletions
diff --git a/protocol/examples/temporal_diagnostic_evolution.py b/protocol/examples/temporal_diagnostic_evolution.py
index 35cf720..af0da84 100644
--- a/protocol/examples/temporal_diagnostic_evolution.py
+++ b/protocol/examples/temporal_diagnostic_evolution.py
@@ -64,7 +64,7 @@ def main():
     import argparse
     p = argparse.ArgumentParser()
     p.add_argument("--seed", type=int, default=42)
-    p.add_argument("--arch", type=str, default="resmlp", choices=["resmlp", "vit"])
+    p.add_argument("--arch", type=str, default="resmlp", choices=["resmlp", "vit", "no_outln"])
     args = p.parse_args()
     if args.arch == "resmlp":
         snapshot_path = os.path.join(
@@ -72,12 +72,18 @@ def main():
         )
         h_key = "hidden_norms"
         g_key = "bp_grad_norms_per_sample_med"
-    else:
+    elif args.arch == "vit":
         snapshot_path = os.path.join(
             REPO_ROOT, f"results/snapshot_vit_v1/snapshot_vit_s{args.seed}.json"
         )
         h_key = "hidden_norms_cls"
         g_key = "bp_grad_per_sample_l2_med"
+    else: # no_outln (synthetic studentnet without terminal LN)
+        snapshot_path = os.path.join(
+            REPO_ROOT, f"results/snapshot_no_outln_v1/snapshot_noLN_s{args.seed}.json"
+        )
+        h_key = "hidden_norms"
+        g_key = "bp_grad_per_sample_l2_med"
     if not os.path.exists(snapshot_path):
         print(f"snapshot not found: {snapshot_path}")
         return
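As the `if`/`elif`/`else` chain gains a third branch, one possible alternative is a lookup table keyed by architecture name. The sketch below is an illustration only, not part of the commit: the names `ARCH_KEYS`, `build_parser`, and `keys_for` are hypothetical, and the snapshot paths are deliberately left out because the ResMLP path is not visible in the diff. The key strings themselves are copied from the diff.

```python
import argparse

# JSON key names per architecture, copied from the diff above.
# ARCH_KEYS is a hypothetical name introduced for this sketch.
ARCH_KEYS = {
    "resmlp": {"h_key": "hidden_norms", "g_key": "bp_grad_norms_per_sample_med"},
    "vit": {"h_key": "hidden_norms_cls", "g_key": "bp_grad_per_sample_l2_med"},
    "no_outln": {"h_key": "hidden_norms", "g_key": "bp_grad_per_sample_l2_med"},
}


def build_parser() -> argparse.ArgumentParser:
    """Build the CLI parser; --arch choices are derived from ARCH_KEYS."""
    p = argparse.ArgumentParser()
    p.add_argument("--seed", type=int, default=42)
    p.add_argument("--arch", type=str, default="resmlp", choices=sorted(ARCH_KEYS))
    return p


def keys_for(arch: str) -> tuple:
    """Return (h_key, g_key) for a supported architecture name."""
    cfg = ARCH_KEYS[arch]
    return cfg["h_key"], cfg["g_key"]
```

Deriving `choices` from the table means a new architecture is registered in one place, so the accepted `--arch` values and the key lookup cannot drift apart.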
