path: root/protocol/__init__.py
author: YurenHao0426 <Blackhao0426@gmail.com> 2026-04-08 20:10:14 -0500
committer: YurenHao0426 <Blackhao0426@gmail.com> 2026-04-08 20:10:14 -0500
commit: a18765a553ca454de49fc6462e231f05367ce580
tree: 681ee0ca2e8e0079516bcb7ceb5a51a70c9c8eab /protocol/__init__.py
parent: 0dc3831b588bfac613df47e56e633c8c0597497b
paper v2.34.3: fix Figure 4 cross-arch verdict matrix (data + layout)

User flagged Figure 4 issues. Found three problems:

1. **Row 4 (no-terminal-LN ResMLP) (d) frozen** was encoded as 0 (passes)
   but the actual data is no-outln DFA acc 0.327 ± 0.012 (3-seed) vs
   frozen baseline 0.349 ± 0.002 → margin -2.2 pp, beyond the 2 pp
   threshold → (d) FIRES. Updated to 1 (WB).

2. **Row 5 (CNN BN) cells (c) and (d)** were encoded as 0 (passes) but
   the CNN audit (results/protocol_audit/audit_cnn_3seed.json) only
   measured (a) and (b); there is no CNN frozen baseline and no CNN
   stability run. Showing them as ✓ was misleading. Added a third color
   (gray, "—") for "not measured" and marked CNN (c)+(d) accordingly.

3. **Layout** had massive empty vertical space below the panels with
   the key-finding text floating far below. Compressed:
   - figsize (11, 4.2) → (11, 3.2) [tighter aspect ratio]
   - Key-finding text moved from axes-coordinates y=-1.55 (way below
     plot) to figure-coordinates y=-0.05 (directly under panels)
   - BP panel title clarified: "BP-trained: protocol passes" →
     "BP-trained: protocol passes everywhere"

Also marked ViT-Mini (c) and no-LN ResMLP (c) as "not measured" since
neither has a saved cross_batch_stability value (the audit_cnn,
audit_d512, snapshot_vit_v1, and snapshot_no_outln_v1 files don't
include this diagnostic).

New verdict matrix:

                       (a)   (b)   (c)   (d)
  ResMLP-d256 LN       WB    WB    ✓     WB
  ResMLP-d512 LN       WB    WB    ✓     WB
  ViT-Mini             WB    WB    —     WB
  ResMLP-d256 no-LN    WB    ✓     —     WB    ← row 4 (d) was wrong
  CNN BN               WB    ✓     —     —     ← row 5 (c)+(d) were misleading

Key finding "(b) only fires on terminal-LN architectures" is unchanged
and now visually clearer (rows 1-3 have WB in (b), rows 4-5 have ✓).

Page impact: total page count 19 → 18 (the more compact figure
reclaimed an entire page). §1-§7 main content still fits on 9 pages.

Updated docstring with full data sources for each row.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
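
The row-4 (d) arithmetic and the three-state cell encoding described in the message can be sketched as follows. This is only an illustration, not the repository's code: the names `Verdict`, `frozen_margin_pp`, and `verdict_d` are hypothetical, the numeric code 2 for "not measured" is an assumption (the message only says a third color was added), and the strict "more than 2 pp below the frozen baseline" comparison is inferred from the wording "beyond the 2 pp threshold". The matrix values are copied from the message.

```python
from enum import IntEnum


class Verdict(IntEnum):
    PASS = 0          # shown as a check mark in the figure
    WB = 1            # diagnostic fires
    NOT_MEASURED = 2  # assumed code for the gray "not measured" cells


THRESHOLD_PP = 2.0  # threshold quoted in the commit message


def frozen_margin_pp(acc, frozen_acc):
    """Accuracy margin over the frozen baseline, in percentage points."""
    return (acc - frozen_acc) * 100.0


def verdict_d(acc, frozen_acc):
    """Assumed rule for (d): fire when acc trails the frozen baseline by > 2 pp."""
    if acc is None or frozen_acc is None:
        return Verdict.NOT_MEASURED
    if frozen_margin_pp(acc, frozen_acc) < -THRESHOLD_PP:
        return Verdict.WB
    return Verdict.PASS


P, W, N = Verdict.PASS, Verdict.WB, Verdict.NOT_MEASURED

# New verdict matrix from the commit message; columns are (a), (b), (c), (d).
MATRIX = {
    "ResMLP-d256 LN":    [W, W, P, W],
    "ResMLP-d512 LN":    [W, W, P, W],
    "ViT-Mini":          [W, W, N, W],
    "ResMLP-d256 no-LN": [W, P, N, W],
    "CNN BN":            [W, P, N, N],
}

# Rows where diagnostic (b) fires, in matrix order.
rows_firing_b = [name for name, row in MATRIX.items() if row[1] == Verdict.WB]

if __name__ == "__main__":
    print(round(frozen_margin_pp(0.327, 0.349), 1), verdict_d(0.327, 0.349).name)
    print(rows_firing_b)
```

With the row-4 numbers from the message (0.327 vs 0.349), the sketch reproduces the -2.2 pp margin and the WB verdict, and a missing baseline, as in the CNN row, maps to "not measured" rather than to a pass.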
Diffstat (limited to 'protocol/__init__.py')
0 files changed, 0 insertions, 0 deletions