author: YurenHao0426 <Blackhao0426@gmail.com> 2026-04-08 18:03:16 -0500
committer: YurenHao0426 <Blackhao0426@gmail.com> 2026-04-08 18:03:16 -0500
commit: 752dfb833b06a6fb974df892de560caf328ed1dd
tree: 4b29ea7737d7066c9ca33675f7061b89da2c55f2 /paper/figures/render_fig4_penalty_rescue.py
parent: 4aa9d89a28ad9d5a9cfe1ef685a5e100a648b4ed
paper v2.31: matched 30-epoch BP/DFA controls (was unsourced 0.609/0.308)

The §5 ¶3 BP-no-penalty value of 0.609 ± 0.004 and DFA-no-penalty value of
0.308 ± 0.014 turned out to be unsourced; they were carried over from a
hardcoded comment in experiments/bp_with_penalty_control.py ("BP-trainable
(3-seed mean): 0.609") that nobody had actually measured with a matched
30-epoch run.

Ran the missing matched controls under the same recipe as BP+pen (lam=0,
30 epochs, AdamW 1e-3, wd 0.01, cosine schedule, batch 128, 3 seeds
42/123/456):

  BP no-pen 30ep:  per-seed 0.5851, 0.5845, 0.5863 → 0.585 ± 0.001
                   (paper said 0.609 ± 0.004, off by 0.024)
  DFA no-pen 30ep: per-seed 0.3070, 0.2985, 0.2966 → 0.301 ± 0.005
                   (paper said 0.308 ± 0.014)

Also re-grounded DFA+penalty 30ep using the dfa_pen_short 3-seed run
(0.3593, 0.3610, 0.3604 → 0.360 ± 0.001), which is what the deep-cosine
+0.155 figure was computed on. The paper had 0.363 ± 0.001; that came from
the 100-epoch run, not the 30-epoch run, so it was an apples-to-oranges
comparison with BP+pen 30-ep.

Paper changes (§5 ¶3):
  BP penalty cost:           -8 pp → -5.5 pp
  DFA pen rescue:            +5.5 → +5.9 pp
  DFA+pen margin vs frozen:  +1.4 → +1.1 pp
  BP-to-DFA gap:             17 → 17.0 pp (unchanged)
  BP-to-SB gap:              7.7 → 7.7 pp (unchanged)

BP-to-DFA gap is still the lower-bound credit-quality cost claim; 17 pp
gap is unchanged in magnitude.

Also updated:
  - §5 ¶1 prose: 0.363 → 0.360, 0.308 → 0.301
  - §4 ¶4 prose: DFA+pen 0.363 → 0.360
  - Appendix J Table 9 caption: 0.363 → 0.360, +9.0 → +9.3 pp gap to SB
  - Appendix L paragraph: +5.5 → +5.9 pp DFA penalty rescue
  - Figure 3 panel C bar values + title pen-cost annotation
  - New results/matched_30ep_control_summary.json as auditable record

Page layout preserved: 9 main pages + refs p10, 18 total, 0 overfull.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
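
The arithmetic in this message can be re-derived from the quoted per-seed accuracies. The following is a minimal sketch, not code from the repository: the dictionary keys and the `pp` helper are hypothetical names, the BP+penalty value 0.530 is taken from the `with_pen` list in the render script, and the frozen baseline 0.349 from its `shallow` variable.

```python
from statistics import mean

# Per-seed test accuracies quoted in the commit message (seeds 42/123/456).
per_seed = {
    "bp_no_pen": [0.5851, 0.5845, 0.5863],
    "dfa_no_pen": [0.3070, 0.2985, 0.2966],
    "dfa_pen": [0.3593, 0.3610, 0.3604],
}

# Values without per-seed detail in the message (assumed from the render script).
BP_PEN = 0.530   # BP+penalty, 30 epochs (with_pen[0])
FROZEN = 0.349   # frozen baseline (shallow)

# Three-decimal means, matching the precision reported in the paper.
means = {name: round(mean(vals), 3) for name, vals in per_seed.items()}


def pp(a, b):
    """Return a - b in percentage points, rounded to one decimal."""
    return round((a - b) * 100, 1)


deltas = {
    "bp_penalty_cost": pp(BP_PEN, means["bp_no_pen"]),
    "dfa_penalty_rescue": pp(means["dfa_pen"], means["dfa_no_pen"]),
    "dfa_pen_margin_vs_frozen": pp(means["dfa_pen"], FROZEN),
    "bp_to_dfa_gap": pp(BP_PEN, means["dfa_pen"]),
}

for name, value in deltas.items():
    print(f"{name}: {value:+.1f} pp")
```

Computing the deltas from the rounded means mirrors how the paper reports them; working from unrounded means could shift a value in the last digit.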
Diffstat (limited to 'paper/figures/render_fig4_penalty_rescue.py')
-rw-r--r-- paper/figures/render_fig4_penalty_rescue.py | 6
1 files changed, 3 insertions, 3 deletions
diff --git a/paper/figures/render_fig4_penalty_rescue.py b/paper/figures/render_fig4_penalty_rescue.py
index ad21a12..614698c 100644
--- a/paper/figures/render_fig4_penalty_rescue.py
+++ b/paper/figures/render_fig4_penalty_rescue.py
@@ -32,8 +32,8 @@ rho_err = [0.005, 0.0, 0.011, 0.0, 0.0]
# Panel C: 2x2 capacity-cost control
methods = ["BP", "DFA"]
-no_pen = [0.609, 0.308]
-with_pen = [0.530, 0.363]
+no_pen = [0.585, 0.301]
+with_pen = [0.530, 0.360]
shallow = 0.349
fig, axes = plt.subplots(1, 3, figsize=(13, 6.0))
@@ -81,7 +81,7 @@ ax.axhline(shallow, color="black", ls="--", lw=1, label=f"frozen baseline {shall
ax.set_xticks(xpos)
ax.set_xticklabels(methods, fontsize=10)
ax.set_ylabel("test accuracy", fontsize=10)
-ax.set_title("(c) BP+penalty 2$\\times$2 control\n(BP-pen-cost $-8$pp; gap $17$pp $=$ credit quality)", fontsize=10)
+ax.set_title("(c) BP+penalty 2$\\times$2 control\n(BP-pen-cost $-5.5$pp; gap $17$pp $=$ credit quality)", fontsize=10)
ax.legend(loc="upper right", fontsize=8)
ax.grid(True, axis="y", alpha=0.3)
ax.set_ylim(0, 0.7)
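
The panel title in the second hunk still hardcodes the penalty cost and the gap, so it has to be edited by hand whenever `no_pen` or `with_pen` changes. One way to avoid that is to build the title from the lists. This is only a sketch under the assumption that index 0 is BP and index 1 is DFA, as in `methods`; `penalty_title` is a hypothetical helper, not part of the script.

```python
# Panel C values after this commit (copied from the diff above).
methods = ["BP", "DFA"]
no_pen = [0.585, 0.301]
with_pen = [0.530, 0.360]


def penalty_title(no_pen, with_pen):
    """Build the panel (c) title from the bar values instead of hardcoding it."""
    # Accuracy change for BP when the penalty is added, in percentage points.
    bp_cost_pp = (with_pen[0] - no_pen[0]) * 100
    # BP+penalty minus DFA+penalty, in percentage points.
    gap_pp = (with_pen[0] - with_pen[1]) * 100
    return (
        "(c) BP+penalty 2$\\times$2 control\n"
        f"(BP-pen-cost ${bp_cost_pp:+.1f}$pp; "
        f"gap ${gap_pp:.0f}$pp $=$ credit quality)"
    )


title = penalty_title(no_pen, with_pen)
print(title)
```

In the script this would be used as `ax.set_title(penalty_title(no_pen, with_pen), fontsize=10)`.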