| author | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-08 19:11:40 -0500 |
|---|---|---|
| committer | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-08 19:11:40 -0500 |
| commit | d022688ea9fcfcb81f900751ee92e35597ef19b8 (patch) | |
| tree | 192e620a4dc4915d37c590fdc2621abeaa15c8c3 /paper/figures | |
| parent | 6a057a379e58dc464f04e5208861699b01b5d477 (diff) | |
paper v2.32: BP+penalty multi-seeded (was single-seed s42)
The §5 ¶3 BP+penalty value (0.530, +18.1 pp margin) was single-seed s42.
Ran s123 and s456 to multi-seed it, matching the BP-no-pen 3-seed control.
3-seed BP+pen 30ep results (lam=0.01, AdamW lr=1e-3 wd=0.01, cosine, batch 128):
s42: 0.5303, +18.13 pp vs frozen
s123: 0.5262, +17.72 pp
s456: 0.5397, +19.07 pp
3-seed mean: 0.5321 ± 0.0057, +18.31 pp
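
The summary line above can be re-derived from the three per-seed accuracies. The following is a minimal sketch, not the script that produced the results file. It assumes the frozen baseline is the 0.349 that the figure script stores as `shallow`, and that the ± is a population standard deviation, which is the convention that reproduces 0.0057. All variable names are illustrative.

```python
from statistics import mean, pstdev, stdev

# Per-seed BP+penalty test accuracies quoted in this commit message.
acc = {"s42": 0.5303, "s123": 0.5262, "s456": 0.5397}

# Assumption: frozen baseline equals `shallow` in render_fig4_penalty_rescue.py.
FROZEN = 0.349

values = list(acc.values())
acc_mean = mean(values)
acc_pstd = pstdev(values)  # population std (divides by n)
acc_sstd = stdev(values)   # sample std (divides by n - 1), for comparison

# Margin over the frozen baseline, in percentage points.
margins_pp = {seed: round((a - FROZEN) * 100, 2) for seed, a in acc.items()}
mean_margin_pp = (acc_mean - FROZEN) * 100

print(f"mean = {acc_mean:.4f} +/- {acc_pstd:.4f} (population std)")
print(f"sample std = {acc_sstd:.4f}")
print(f"per-seed margins (pp): {margins_pp}")
print(f"mean margin = {mean_margin_pp:+.2f} pp")
```

With only three seeds the choice of estimator matters: the population form gives about 0.0057, while the sample form (n - 1 in the denominator) gives about 0.0069.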
Updates:
- §5 ¶3: BP+pen "0.530 (single seed)" → "0.532 ± 0.006" (3-seed)
- §5 ¶3: BP penalty cost -5.5 pp → -5.3 pp
- §5 ¶3: BP+pen margin +18.1 → +18.3 pp
- §5 ¶3: BP-to-DFA gap 17.0 → 17.2 pp
- §4 ¶4: BP+pen +18.1 → +18.3 pp comparison
- Figure 3 panel C bar values: BP with_pen 0.530 → 0.532
- Figure 3 panel C title: BP-pen-cost -5.5pp → -5.3pp
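
Each updated number in the list above follows from the Panel C values in `render_fig4_penalty_rescue.py`. The sketch below recomputes them for the old and new BP+penalty accuracy. The helper `derived_pp` is hypothetical and not part of the repository; the constants are copied from the diff.

```python
# Panel C constants as they appear in the figure script.
BP_NO_PEN = 0.585     # no_pen[0]
DFA_WITH_PEN = 0.360  # with_pen[1]
SHALLOW = 0.349       # frozen baseline

def derived_pp(bp_with_pen):
    """Return (penalty cost, margin over frozen, BP-to-DFA gap) in pp, rounded to 0.1."""
    cost = round((bp_with_pen - BP_NO_PEN) * 100, 1)
    margin = round((bp_with_pen - SHALLOW) * 100, 1)
    gap = round((bp_with_pen - DFA_WITH_PEN) * 100, 1)
    return cost, margin, gap

old = derived_pp(0.530)  # single-seed s42 value
new = derived_pp(0.532)  # 3-seed mean

print("old (cost, margin, gap):", old)
print("new (cost, margin, gap):", new)
```

The gap moves from 17.0 to 17.2 pp, which still rounds to the `gap $17$pp` wording in the panel title, so only the penalty-cost part of the title changes.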
The +18.3 pp 3-seed mean is essentially the same as the s42 single-seed
+18.13 pp, so the headline conclusion (BP+pen far above frozen baseline,
huge gap vs DFA+pen) is unchanged. This commit removes the last
single-seed value labeled as a key control.
New auditable file: results/bp_with_penalty_3seed_summary.json
Page layout preserved: 9 pages main, refs p10, 0 overfull boxes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Diffstat (limited to 'paper/figures')
| -rw-r--r-- | paper/figures/fig4_penalty_rescue.pdf | bin | 34199 -> 34201 bytes |
| -rw-r--r-- | paper/figures/render_fig4_penalty_rescue.py | 4 | |
2 files changed, 2 insertions, 2 deletions
diff --git a/paper/figures/fig4_penalty_rescue.pdf b/paper/figures/fig4_penalty_rescue.pdf
Binary files differ
index deef685..8e77fc7 100644
--- a/paper/figures/fig4_penalty_rescue.pdf
+++ b/paper/figures/fig4_penalty_rescue.pdf
diff --git a/paper/figures/render_fig4_penalty_rescue.py b/paper/figures/render_fig4_penalty_rescue.py
index 614698c..b0f46cd 100644
--- a/paper/figures/render_fig4_penalty_rescue.py
+++ b/paper/figures/render_fig4_penalty_rescue.py
@@ -33,7 +33,7 @@ rho_err = [0.005, 0.0, 0.011, 0.0, 0.0]
 # Panel C: 2x2 capacity-cost control
 methods = ["BP", "DFA"]
 no_pen = [0.585, 0.301]
-with_pen = [0.530, 0.360]
+with_pen = [0.532, 0.360]
 shallow = 0.349
 
 fig, axes = plt.subplots(1, 3, figsize=(13, 6.0))
@@ -81,7 +81,7 @@ ax.axhline(shallow, color="black", ls="--", lw=1, label=f"frozen baseline {shall
 ax.set_xticks(xpos)
 ax.set_xticklabels(methods, fontsize=10)
 ax.set_ylabel("test accuracy", fontsize=10)
-ax.set_title("(c) BP+penalty 2$\\times$2 control\n(BP-pen-cost $-5.5$pp; gap $17$pp $=$ credit quality)", fontsize=10)
+ax.set_title("(c) BP+penalty 2$\\times$2 control\n(BP-pen-cost $-5.3$pp; gap $17$pp $=$ credit quality)", fontsize=10)
 ax.legend(loc="upper right", fontsize=8)
 ax.grid(True, axis="y", alpha=0.3)
 ax.set_ylim(0, 0.7)
