| Age | Commit message | Author |
|
The §3 ¶3 / §5 ¶3 / Figure 5 / §7 mentions of "StudentNet" as a
cross-architecture validation case were a misleading rebrand of the
no-terminal-LN ResMLP-d256 ablation. Verified by tracing the data:
results/protocol_audit/temporal_evolution_s{42,123,456}.json
final_acc 0.332/0.313/0.336 (matches no-outln 3-seed 0.327±0.012)
first_fire_a {18, 14, 25}
first_fire_b None / None / None
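The 3-seed summary quoted above can be re-derived from the per-seed values. A minimal sketch, assuming the reported ± is the sample standard deviation (ddof=1); the dictionary and the `summarize` helper are illustrative names and are not taken from the repo:

```python
from statistics import mean, stdev

# Per-seed final_acc values quoted in this message (seeds 42, 123, 456).
final_acc = {42: 0.332, 123: 0.313, 456: 0.336}

def summarize(values):
    """Return (mean, sample std) rounded to 3 decimals.

    Assumption: the +/- in the message is a sample std (ddof=1).
    """
    vals = list(values)
    return round(mean(vals), 3), round(stdev(vals), 3)

acc_mean, acc_std = summarize(final_acc.values())
print(f"{acc_mean:.3f} ± {acc_std:.3f}")  # 0.327 ± 0.012
```

Under that assumption the result agrees with the no-outln 3-seed figure; a population std (ddof=0) would give 0.010 instead.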
The actual synth StudentNet (results/snapshot_synth_v1, d=128 alpha=1.0)
has max-per-block growth ~6.88 over 80 epochs and never reaches the
50× threshold, so diagnostic (a) does NOT fire on the real synth
StudentNet at all. Calling the no-outln data "StudentNet" double-counted
the same architecture under two names (the same-backbone
causal control AND the cross-arch generalization test).
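For reference, the "never reaches the 50× threshold" argument can be sketched as a first-fire check. This is a hypothetical helper, not the audit code: it assumes diagnostic (a) fires at the first epoch whose max-per-block growth ratio is at least the threshold, with 0-based epochs, and both trajectories below are synthetic:

```python
from typing import Optional, Sequence

GROWTH_THRESHOLD = 50.0  # the 50x threshold named in this message

def first_fire_epoch(growth: Sequence[float],
                     threshold: float = GROWTH_THRESHOLD) -> Optional[int]:
    """Return the first epoch whose growth ratio reaches the threshold.

    Returns None if the threshold is never reached (assumed semantics).
    """
    for epoch, ratio in enumerate(growth):
        if ratio >= threshold:
            return epoch
    return None

# Synthetic trajectory peaking at 6.88 after 80 epochs: never fires.
slow = [1.0 + 5.88 * e / 79 for e in range(80)]
# Synthetic trajectory that crosses 50x partway through training.
steep = [2.8 * e for e in range(80)]

print(first_fire_epoch(slow))   # None
print(first_fire_epoch(steep))  # 18
```

A trajectory whose maximum is about 6.88 returns None under any definition of this shape, which is the point made above about the real synth StudentNet.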
Relabeled to "no-terminal-LN ResMLP" everywhere it appeared:
- §3 ¶3 paragraph 1 cross-arch list
- §3 ¶3 paragraph 2 (now with explicit per-seed first-fire epochs {18,14,25})
- §5 paragraph (the conclusion)
- §7 conclusion (cross-arch list)
- Figure 5 caption
- Figure 5 row label (with re-rendered PDF)
The remaining cross-arch generalization claim is now: ViT-Mini fires
both diagnostics, ResMLP at d=256/d=512 fires both, no-terminal-LN
ResMLP and BatchNorm CNN fire only (a) — three real architecture
classes, with the no-LN ablation being the same-backbone control rather
than a separate architecture. The cross-arch story is slightly weaker
("3 architecture classes" not "4") but truthful and self-consistent.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
|
label overlap (fig4)
Per user feedback:
- fig4_penalty_rescue.pdf (Figure 3 in paper): was figsize=(13, 3.5), aspect 3.7:1,
which rendered as a thin strip with squeezed subplot content. Increased height
to figsize=(13, 6.0), aspect 2.2:1. The taller panels now show axis
labels and legends readably.
- fig5_cross_arch_summary.pdf (Figure 4 in paper): the 'Key finding' italic text
annotation at y=-1.0 in axes transform was overlapping with the multiline
architecture y-tick labels at the bottom of the second subplot. Moved to
y=-1.55 and increased figsize height from 3.5 to 4.2 so the lower annotation
still fits in bbox_inches='tight' crop.
- Also bumped the \includegraphics width from 0.92\linewidth to \linewidth for both
figures so they use the full text width.
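The aspect ratios quoted for fig4 follow directly from the figsize tuples. A small check, where `aspect` is an illustrative helper and not part of the plotting scripts:

```python
def aspect(figsize):
    """Width-to-height ratio of a (width, height) figsize, to 1 decimal."""
    width, height = figsize
    return round(width / height, 1)

old_fig4 = (13, 3.5)  # figsize before this commit
new_fig4 = (13, 6.0)  # figsize after this commit

print(aspect(old_fig4))  # 3.7
print(aspect(new_fig4))  # 2.2
```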
Main content still exactly 9 pages within E&D budget.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
|
Tables filled with real values:
Table 1: 5-method audit (3-seed mean ± std for acc, headline Γ, verdict)
Table 2: 4-condition mode 2 validation (cos and ρ values from existing
checkpoint measurements)
Table 3: protocol thresholds (50×, 1e-7, 0.30, 2pp)
Figures generated from existing data:
fig2_decision_utility.pdf: 5×7 verdict heatmap from
results/protocol_audit/ablation_decision_utility.json
fig4_penalty_rescue.pdf: 3-panel — trajectory + cos/ρ bars + 2×2 acc
from snapshot_evolution_v2 + dfa_residual_penalty + bp_with_penalty
fig5_cross_arch_summary.pdf: 5×4 BP/DFA verdict matrix across
architectures
Compiles to 8 pages with all tables/figures rendered. §1-§7 main body
still has only paragraph topic sentences (TODO: per-section prose
filling via codex). Figure numbering is wrong (codex put figures in
section order, not numerical order; needs fixing).
|