author: YurenHao0426 <Blackhao0426@gmail.com> 2026-04-08 18:32:23 -0500
committer: YurenHao0426 <Blackhao0426@gmail.com> 2026-04-08 18:32:23 -0500
commit: 0c1d102c57d86d914eb1122dd59f329667db60d8
tree: 04d8e676817fa9f243d466686efdfcb2883bff99 /paper/figures/render_fig5_cross_arch.py
parent: 2b4581723d0c5ed562528fac6b0a789adf95e3c5
paper v2.31.9: relabel "StudentNet" → "no-terminal-LN ResMLP"

The §3 ¶3 / §5 ¶3 / Figure 5 / §7 mentions of "StudentNet" as a
cross-architecture validation case were a misleading rebrand of the
no-terminal-LN ResMLP-d256 ablation. Verified by tracing the data:

  results/protocol_audit/temporal_evolution_s{42,123,456}.json
    final_acc     0.332/0.313/0.336 (matches no-outln 3-seed 0.327±0.012)
    first_fire_a  {18, 14, 25}
    first_fire_b  None / None / None

The actual synth StudentNet (results/snapshot_synth_v1, d=128 alpha=1.0)
has max-per-block growth ~6.88 over 80 epochs and never reaches the 50×
threshold, so diagnostic (a) does NOT fire on the real synth StudentNet
at all. Calling the no-outln data "StudentNet" double-counted the same
architecture under two names (the same-backbone causal control AND the
cross-arch generalization test).

Relabeled to "no-terminal-LN ResMLP" everywhere it appeared:
- §3 ¶3 paragraph 1 cross-arch list
- §3 ¶3 paragraph 2 (now with explicit per-seed first-fire epochs
  {18,14,25})
- §5 paragraph (the conclusion)
- §7 conclusion (cross-arch list)
- Figure 5 caption
- Figure 5 row label (with re-rendered PDF)

The remaining cross-arch generalization claim is now: ViT-Mini fires
both diagnostics, ResMLP at d=256/d=512 fires both, no-terminal-LN
ResMLP and BatchNorm CNN fire only (a). That leaves three real
architecture classes, with the no-LN ablation being the same-backbone
control rather than a separate architecture. The cross-arch story is
slightly weaker ("3 architecture classes" not "4") but truthful and
self-consistent.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

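The per-seed figures quoted in the commit message can be sanity-checked with a few lines of arithmetic. This is only a minimal sketch: it assumes the reported ±0.012 is a sample standard deviation (n - 1) over the three seeds, and that diagnostic (a) "fires" when growth reaches the 50× threshold. The variable names are illustrative and are not taken from the repository.

```python
import statistics

# Per-seed final accuracies quoted in the commit message (seeds 42, 123, 456).
final_acc = [0.332, 0.313, 0.336]

mean_acc = statistics.mean(final_acc)
# Assumption: the reported spread is the sample standard deviation (n - 1).
std_acc = statistics.stdev(final_acc)
print(f"{mean_acc:.3f} ± {std_acc:.3f}")  # prints "0.327 ± 0.012"

# Assumption: diagnostic (a) fires once per-block growth reaches the threshold.
GROWTH_THRESHOLD = 50.0          # the "50×" threshold from the commit message
synth_studentnet_growth = 6.88   # approximate max-per-block growth over 80 epochs
fires_a = synth_studentnet_growth >= GROWTH_THRESHOLD
print(fires_a)  # prints "False"
```

Under these assumptions the three seeds reproduce the quoted no-outln statistic, and the synth StudentNet growth stays well below the threshold.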
Diffstat (limited to 'paper/figures/render_fig5_cross_arch.py')
-rw-r--r--  paper/figures/render_fig5_cross_arch.py  6
1 files changed, 3 insertions, 3 deletions
diff --git a/paper/figures/render_fig5_cross_arch.py b/paper/figures/render_fig5_cross_arch.py
index 9ad9ce2..9d52e09 100644
--- a/paper/figures/render_fig5_cross_arch.py
+++ b/paper/figures/render_fig5_cross_arch.py
@@ -10,8 +10,8 @@ REPO_ROOT = "/home/yurenh2/fa"
# Verdict matrix: arch x diagnostic
# 0 = ok (BP), 1 = ok-non-LN-arch, 2 = walk-back
# Columns: (a) per-block growth, (b) ||g_L|| floor, (c) drift stability, (d) frozen baseline
-# Rows: ResMLP-d256, ResMLP-d512, ViT-Mini, StudentNet (no LN), CNN (BN, no LN)
-arches = ["ResMLP $d{=}256$\n(terminal LN)", "ResMLP $d{=}512$\n(terminal LN)", "ViT-Mini\n(cls + LN)", "StudentNet\n(no terminal LN)", "CNN BatchNorm\n(no terminal LN)"]
+# Rows: ResMLP-d256, ResMLP-d512, ViT-Mini, no-terminal-LN ResMLP-d256, CNN (BN, no LN)
+arches = ["ResMLP $d{=}256$\n(terminal LN)", "ResMLP $d{=}512$\n(terminal LN)", "ViT-Mini\n(cls + LN)", "ResMLP $d{=}256$\n(no terminal LN)", "CNN BatchNorm\n(no terminal LN)"]
diags = ["(a) scale", "(b) ${\\|g\\|}$ floor", "(c) drift", "(d) frozen"]
# DFA verdicts on each
@@ -20,7 +20,7 @@ dfa = np.array([
[1, 1, 0, 1], # ResMLP d256: (a) fires, (b) fires, (c) noise sub-mode, (d) fires
[1, 1, 0, 1], # ResMLP d512: same pattern
[1, 1, 0, 1], # ViT-Mini: same pattern
- [1, 0, 0, 0], # StudentNet: only (a) fires; (b) NEVER
+ [1, 0, 0, 0], # ResMLP no-LN: only (a) fires; (b) NEVER
[1, 0, 0, 0], # CNN BN: only (a) fires; (b) NEVER (the killer (b)-is-LN-specific finding)
])
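
The cross-architecture claim in the commit message can be read straight off the verdict matrix. The following is a hedged sketch, not part of the script: it copies the `dfa` rows from the diff into plain lists, shortens the row labels, and assumes that a value of 1 means "the diagnostic fires", as the per-row comments suggest. The names `fires`, `both_ab`, and `only_a` are hypothetical.

```python
# Row labels shortened from the script; values copied from the diff.
arches = [
    "ResMLP d256 (terminal LN)",
    "ResMLP d512 (terminal LN)",
    "ViT-Mini",
    "ResMLP d256 (no terminal LN)",
    "CNN BatchNorm (no terminal LN)",
]
diags = ["a", "b", "c", "d"]
dfa = [
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]

# Assumption: 1 means the diagnostic fires for that architecture.
fires = {
    arch: [d for d, v in zip(diags, row) if v == 1]
    for arch, row in zip(arches, dfa)
}

# Architectures on which both (a) and (b) fire.
both_ab = [arch for arch, f in fires.items() if "a" in f and "b" in f]
# Architectures on which (a) is the only diagnostic that fires.
only_a = [arch for arch, f in fires.items() if f == ["a"]]

print("both (a) and (b):", both_ab)
print("only (a):", only_a)
```

With the matrix as committed, the two rows without a terminal LN are the ones where only (a) fires, which matches the relabeled claim.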