path: root/paper/figures/render_fig_cos_acc_dissociation.py
author YurenHao0426 <Blackhao0426@gmail.com> 2026-04-08 20:17:43 -0500
committer YurenHao0426 <Blackhao0426@gmail.com> 2026-04-08 20:17:43 -0500
commit d1c22697a99c894f07db972acb5a1a9229b0276a
tree b1725645a071c5b7bb2f9dd2c7830df605b9aad7 /paper/figures/render_fig_cos_acc_dissociation.py
parent a18765a553ca454de49fc6462e231f05367ce580
paper v2.35: add Figure 2 - cross-method cos-vs-accuracy dissociation

User said "you don't need to worry about page count for now", which freed up the page budget for substantive additions. Highest-yield substantive addition: a figure for the §4 ¶4 cross-method dissociation, which the user previously flagged as the paper's strongest new observation but which is currently text-only.

New figure: paper/figures/fig_cos_acc_dissociation.pdf
- Parallel-coordinates / slope-chart style
- 4 columns: deep cos | accuracy | |nudging| | training-loss decrease
- 3 lines: SB+pen (blue), CB+pen (red), DFA+pen (gray)
- Each metric normalized to [0, 1] with raw values annotated
- Shaded "cos: CB top" region on the left vs labeled "accuracy / nudging / training-loss: SB top" on the right
- The X-pattern between cos and accuracy makes the dissociation visually immediate: SB rises from middle (cos) to top (functional), CB falls from top (cos) to tied with DFA (functional)

Inserted between §4 ¶4 (Mode 2 mechanism) and §5 (intervention). Referenced from the §4 ¶4 functional measurements paragraph as "Figure 2".

Why this figure takes over the burden of proof from the prose-only argument: the X-pattern can be read at a glance, whereas the prose requires parsing a paragraph. Reviewers will see "deep cosine ranks differently from 3 functional metrics" without needing to track the numbers.

Important design choice: did NOT include deep ρ in the figure, even though it's in §4 ¶2, because ρ ranks CB > SB > DFA (same as cos), not the SB > CB > DFA pattern of the functional metrics. ρ groups with cos as a "directional alignment" metric, while the functional triad (accuracy, nudging, training-loss) groups around forward-state usefulness. The figure caption notes this distinction implicitly by listing only the three functional metrics.

Page impact: total 18 → 19 pages, main content §1-§7 now spans p1-p10 (was p1-p9). Per user's relaxed constraint, page count is no longer the binding constraint.

Adding the figure shifts the figure numbering: cos_acc_dissoc is now Figure 2, temporal_cross_arch becomes Figure 3, penalty_rescue → Figure 4, cross_arch_summary → Figure 5. All figure references use \ref{} so they auto-update.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
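
The ranking claim in the message can be checked against the three-seed values hard-coded in the script in the diff. A minimal sketch, assuming those values are listed in the order SB+pen, CB+pen, DFA+pen as in the script; the helper `rank_order` is hypothetical and not part of the commit:

```python
# Values copied from render_fig_cos_acc_dissociation.py
# (assumed order: SB+pen, CB+pen, DFA+pen).
methods = ["SB+pen", "CB+pen", "DFA+pen"]
metrics = {
    "deep cos": [0.322, 0.679, 0.151],
    "accuracy": [0.453, 0.360, 0.360],
    "|nudging|": [1.929e-3, 4.264e-4, 4.978e-5],
    "loss decrease": [0.447, 0.121, 0.095],
}


def rank_order(values, names):
    """Return names sorted from highest to lowest value (hypothetical helper).

    Ties keep their listed order, so a tie is not a real ranking.
    """
    pairs = sorted(zip(values, names), key=lambda p: p[0], reverse=True)
    return [name for _, name in pairs]


orders = {m: rank_order(v, methods) for m, v in metrics.items()}
top = {m: order[0] for m, order in orders.items()}

for m, order in orders.items():
    print(f"{m:>14}: {', '.join(order)}")
```

Only the top-ranked method per metric is used as the check here, because CB+pen and DFA+pen are tied on accuracy at the reported precision.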
Diffstat (limited to 'paper/figures/render_fig_cos_acc_dissociation.py')
-rw-r--r-- paper/figures/render_fig_cos_acc_dissociation.py 93
1 files changed, 93 insertions, 0 deletions
diff --git a/paper/figures/render_fig_cos_acc_dissociation.py b/paper/figures/render_fig_cos_acc_dissociation.py
new file mode 100644
index 0000000..fff7f65
--- /dev/null
+++ b/paper/figures/render_fig_cos_acc_dissociation.py
@@ -0,0 +1,93 @@
+"""Render Figure: cos-vs-accuracy cross-method dissociation.
+
+Shows the v2.33 finding: under matched penalty rescue (lam=1e-2, 30ep, 3 seeds)
+on the audited 4-block d=256 ResMLP, three independent functional metrics
+(headline accuracy, single-step nudging, integrated training-loss decrease)
+all rank SB ≫ CB ≈ DFA, while deep cosine ranks CB > SB > DFA — the only
+ordering that disagrees with the functional ranking.
+
+Sources (all 3-seed):
+ results/round38_sb_penalty_30ep_s{42,123,456}/results_cifar10.json
+ results/round38_cb_penalty_30ep_s{42,123,456}/results_cifar10.json
+ results/round41_dfa_penalty_30ep{,_s{123,456}}/results_cifar10.json
+ results/nudging_test_3seed_summary.json
+ results/training_loss_decrease_3seed.json
+"""
+import os
+import matplotlib
+matplotlib.use("Agg")
+import matplotlib.pyplot as plt
+import numpy as np
+
+REPO_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", ".."))
+
+# Three-seed values from the saved JSONs (cross-checked against §4 ¶4 prose)
+methods = ["SB+pen", "CB+pen", "DFA+pen"]
+colors = {"SB+pen": "#1f77b4", "CB+pen": "#d62728", "DFA+pen": "#7f7f7f"}
+
+# Each entry: raw values per method, in the order of `methods`; stds are in metric_stds below
+# §4 ¶4 lists the three functional metrics as accuracy, nudging, training-loss
+# trajectory. Deep ρ is intentionally NOT shown here because ρ ranks CB > SB > DFA
+# (same as cos), not SB > CB > DFA — ρ groups with cos as a "directional alignment"
+# metric, while the functional triad below groups around forward-state usefulness.
+metrics = {
+    "deep cos": [0.322, 0.679, 0.151],
+    "accuracy": [0.453, 0.360, 0.360],
+    "|nudging|": [1.929e-3, 4.264e-4, 4.978e-5],
+    "loss decrease": [0.447, 0.121, 0.095],
+}
+metric_stds = {
+    "deep cos": [0.007, 0.008, 0.025],
+    "accuracy": [0.003, 0.003, 0.001],
+    "|nudging|": [0.113e-3, 0.024e-3, 0.0044e-3],
+    "loss decrease": [0.008, 0.003, 0.007],
+}
+
+# Normalize each metric to [0, 1] where 1 = max across the 3 methods.
+# This makes the parallel-coordinates lines comparable.
+metric_names = list(metrics.keys())
+norm = {}
+for m, vals in metrics.items():
+    mx = max(vals)
+    norm[m] = [v / mx for v in vals]
+
+fig, ax = plt.subplots(figsize=(6.0, 3.4))
+
+x = np.arange(len(metric_names))
+
+for i, method in enumerate(methods):
+    y = [norm[m][i] for m in metric_names]
+    ax.plot(x, y, "o-", color=colors[method], lw=2.2, markersize=9, label=method)
+    # Annotate each point with the raw value
+    for xi, yi, m in zip(x, y, metric_names):
+        raw = metrics[m][i]
+        if "nudg" in m:
+            label = f"{raw*1e3:.2f}e-3"
+        elif "cos" in m:
+            label = f"{raw:+.3f}"  # explicit sign for cosine values
+        else:
+            label = f"{raw:.3f}"
+        # Fixed offset: 8 points to the right of the marker, for every method
+        ax.annotate(label, (xi, yi), textcoords="offset points",
+                    xytext=(8, 0), fontsize=7, color=colors[method],
+                    ha="left", va="center")
+
+ax.set_xticks(x)
+ax.set_xticklabels(metric_names, fontsize=9)
+ax.set_ylabel("normalized score (max = 1 across the 3 methods)", fontsize=9)
+ax.set_ylim(-0.05, 1.18)
+ax.set_title("Cross-method functional dissociation (3 seeds, 30 ep, $\\lambda{=}10^{-2}$)\n"
+             "all 3 functional metrics rank SB $\\gg$ CB $\\approx$ DFA; deep cos is the only one that disagrees",
+             fontsize=9)
+ax.legend(loc="upper right", fontsize=8, framealpha=0.95)
+ax.grid(True, axis="y", alpha=0.3)
+
+# Visual guide: shade the "cos column disagrees" region
+ax.axvspan(-0.4, 0.4, color="#fff3e0", alpha=0.5, zorder=0)
+ax.text(0, 1.13, "cos: CB top", ha="center", fontsize=7, color="#cc4400", style="italic")
+ax.text(2.5, 1.13, "accuracy / nudging / training-loss decrease: SB top", ha="center", fontsize=7, color="#1f5f9f", style="italic")
+
+plt.tight_layout()
+out = os.path.join(REPO_ROOT, "paper/figures/fig_cos_acc_dissociation.pdf")
+plt.savefig(out, bbox_inches="tight", dpi=200)
+print(f"Saved {out}")
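
The script defines `metric_stds` but never draws it. One possible way to show the seed spread is sketched below, under the assumption that each standard deviation can simply be scaled by the same per-metric maximum used for the plotted values (this ignores the uncertainty of the maximum itself). The helper `normalize_with_std` is hypothetical and not part of the commit:

```python
# Values copied from render_fig_cos_acc_dissociation.py
# (assumed order: SB+pen, CB+pen, DFA+pen).
methods = ["SB+pen", "CB+pen", "DFA+pen"]
metrics = {
    "deep cos": [0.322, 0.679, 0.151],
    "accuracy": [0.453, 0.360, 0.360],
    "|nudging|": [1.929e-3, 4.264e-4, 4.978e-5],
    "loss decrease": [0.447, 0.121, 0.095],
}
metric_stds = {
    "deep cos": [0.007, 0.008, 0.025],
    "accuracy": [0.003, 0.003, 0.001],
    "|nudging|": [0.113e-3, 0.024e-3, 0.0044e-3],
    "loss decrease": [0.008, 0.003, 0.007],
}


def normalize_with_std(vals, stds):
    """Divide values and their stds by the per-metric maximum (hypothetical helper)."""
    mx = max(vals)
    return [v / mx for v in vals], [s / mx for s in stds]


norm, norm_std = {}, {}
for m in metrics:
    norm[m], norm_std[m] = normalize_with_std(metrics[m], metric_stds[m])

# Print one row per method: normalized value ± normalized std for each metric.
for i, method in enumerate(methods):
    cells = [f"{norm[m][i]:.3f}±{norm_std[m][i]:.3f}" for m in metrics]
    print(method, *cells)
```

The per-method lists in `norm_std` could then be passed as `yerr` to matplotlib's `ax.errorbar` in place of the plain `ax.plot` call, if error bars are wanted in the figure.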