path: root/experiments/gelu_ablation.py
author    YurenHao0426 <Blackhao0426@gmail.com>  2026-04-08 18:32:23 -0500
committer YurenHao0426 <Blackhao0426@gmail.com>  2026-04-08 18:32:23 -0500
commit    0c1d102c57d86d914eb1122dd59f329667db60d8
tree      04d8e676817fa9f243d466686efdfcb2883bff99 /experiments/gelu_ablation.py
parent    2b4581723d0c5ed562528fac6b0a789adf95e3c5
paper v2.31.9: relabel "StudentNet" → "no-terminal-LN ResMLP"

The §3 ¶3 / §5 ¶3 / Figure 5 / §7 mentions of "StudentNet" as a
cross-architecture validation case were a misleading rebrand of the
no-terminal-LN ResMLP-d256 ablation. Verified by tracing the data:

  results/protocol_audit/temporal_evolution_s{42,123,456}.json
    final_acc     0.332 / 0.313 / 0.336  (matches no-outln 3-seed 0.327±0.012)
    first_fire_a  {18, 14, 25}
    first_fire_b  None / None / None

The actual synth StudentNet (results/snapshot_synth_v1, d=128 alpha=1.0)
has max-per-block growth ~6.88 over 80 epochs and never reaches the 50×
threshold, so diagnostic (a) does NOT fire on the real synth StudentNet
at all. Calling the no-outln data "StudentNet" double-counted the same
architecture under two names (the same-backbone causal control AND the
cross-arch generalization test).

Relabeled to "no-terminal-LN ResMLP" everywhere it appeared:
  - §3 ¶3 paragraph 1 cross-arch list
  - §3 ¶3 paragraph 2 (now with explicit per-seed first-fire epochs
    {18,14,25})
  - §5 paragraph (the conclusion)
  - §7 conclusion (cross-arch list)
  - Figure 5 caption
  - Figure 5 row label (with re-rendered PDF)

The remaining cross-arch generalization claim is now: ViT-Mini fires
both diagnostics, ResMLP at d=256/d=512 fires both, and no-terminal-LN
ResMLP and BatchNorm CNN fire only (a). That leaves three real
architecture classes, with the no-LN ablation being the same-backbone
control rather than a separate architecture. The cross-arch story is
slightly weaker ("3 architecture classes" not "4") but truthful and
self-consistent.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

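The 3-seed agreement cited in the message can be re-derived from the three per-seed final_acc values it quotes. The sketch below is only an illustration of that arithmetic, not code from the repository: the variable names are hypothetical, and it assumes the ± figure is a sample standard deviation (n-1 denominator), which is the convention that reproduces 0.012.

```python
import statistics

# Per-seed final_acc values quoted in the commit message for
# temporal_evolution_s{42,123,456}.json (seed -> accuracy).
final_acc = {42: 0.332, 123: 0.313, 456: 0.336}

values = list(final_acc.values())

# Assumption: the reported "±" is the sample standard deviation (n-1).
mean_acc = statistics.mean(values)
std_acc = statistics.stdev(values)

print(f"{mean_acc:.3f} ± {std_acc:.3f}")  # prints "0.327 ± 0.012"
```

With the population standard deviation (statistics.pstdev) the spread rounds to 0.010 instead, so the quoted 0.327±0.012 is consistent with the sample convention.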
Diffstat (limited to 'experiments/gelu_ablation.py')
0 files changed, 0 insertions, 0 deletions