| author | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-08 19:24:06 -0500 |
|---|---|---|
| committer | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-08 19:24:06 -0500 |
| commit | 2fa24acae8bb7f8c026db2f7fdade4a29b640d8d (patch) | |
| tree | 98bf266ac07a1d6974769262dff916553223612f /protocol/examples/threshold_d_sensitivity.py | |
| parent | cebc4c4a81809a982a16dd07da41487aa2f30322 (diff) | |
Sync experiment+protocol scripts with v2.32 corrected control values
The pre-v2.31 unsourced values BP=0.609 and DFA=0.308 (which v2.31 fixed
to 0.585 and 0.301 via matched 30-ep controls) were also hardcoded as
"compare to" comments in 5 helper scripts:
  experiments/bp_with_penalty_control.py
  experiments/dfa_residual_penalty_test.py
  experiments/resmlp_frozen_blocks_baseline.py
  protocol/examples/threshold_d_sensitivity.py
  protocol/examples/plot_penalty_rescue.py
These are non-paper-input scripts (their output goes to stdout, not to
the paper), so the stale values didn't cause numerical errors in the
paper itself. But the original v2.31 BP+pen=0.609 unsourced number bug
came from exactly this kind of hardcoded "for-comparison" comment that
was never measured. Updating them now to remove the same trap from
future runs.
Each script now references the matched 30-ep 3-seed values from
results/bp_no_penalty_30ep, results/dfa_no_penalty_30ep,
results/dfa_pen_short, and results/bp_with_penalty.
protocol/EVIDENCE_SUMMARY.md and PAPER_OUTLINE.md still have stale
numbers — these are project scratch documents and not user-facing.
Deferred to a separate sweep if needed.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Diffstat (limited to 'protocol/examples/threshold_d_sensitivity.py')
| -rw-r--r-- | protocol/examples/threshold_d_sensitivity.py | 14 |
1 files changed, 8 insertions, 6 deletions
diff --git a/protocol/examples/threshold_d_sensitivity.py b/protocol/examples/threshold_d_sensitivity.py
index d3f2c58..065efc7 100644
--- a/protocol/examples/threshold_d_sensitivity.py
+++ b/protocol/examples/threshold_d_sensitivity.py
@@ -22,13 +22,15 @@ REPO_ROOT = os.path.dirname(
 def main():
     # 3-seed mean accuracies on 4-block d=256 ResMLP CIFAR-10
+    # Updated v2.32 with matched 30-epoch controls
     conditions = [
-        ("BP-trainable", 0.609, 0.004),
-        ("DFA-shallow", 0.349, 0.002),
-        ("DFA-vanilla", 0.308, 0.014),
-        ("DFA-pen lam=1e-3", 0.372, None),  # 1 seed
-        ("DFA-pen lam=1e-2", 0.363, 0.0007),
-        ("DFA-frozen-rand", 0.349, 0.002),
+        ("BP-trainable 100ep", 0.6147, 0.004),  # protocol_audit
+        ("BP-trainable 30ep", 0.585, 0.001),  # results/bp_no_penalty_30ep
+        ("BP+pen 30ep lam=1e-2", 0.532, 0.006),  # results/bp_with_penalty
+        ("DFA-shallow", 0.349, 0.002),  # frozen baseline
+        ("DFA-vanilla 100ep", 0.306, 0.006),  # protocol_audit
+        ("DFA-vanilla 30ep", 0.301, 0.005),  # results/dfa_no_penalty_30ep
+        ("DFA+pen 30ep lam=1e-2", 0.360, 0.001),  # results/dfa_pen_short
     ]
     shallow_acc = 0.349
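The commit message traces the original bug to hardcoded "for-comparison" numbers that were never measured. One way to close that trap for good is to compute each `conditions` entry from the per-seed result files rather than retyping it. The sketch below is illustrative only and is not part of this commit: the helper names `summarize_runs` and `build_conditions`, the `seed_*.json` file pattern, and the `test_acc` key are all hypothetical, since the layout inside the `results/` directories is not shown here. It also assumes the reported spread is a sample standard deviation, which the commit does not state.

```python
import json
import statistics
from pathlib import Path


def summarize_runs(results_dir, pattern="seed_*.json", key="test_acc"):
    """Return (mean, sample_std, n_seeds) over per-seed result files.

    Assumption: each seed writes one JSON file holding its final accuracy
    under `key`. Pattern and key are hypothetical; adapt to the real layout.
    """
    accs = []
    for path in sorted(Path(results_dir).glob(pattern)):
        with open(path) as f:
            accs.append(float(json.load(f)[key]))
    if not accs:
        # Fail loudly instead of falling back to an unmeasured number.
        raise FileNotFoundError(
            f"no files matching {pattern!r} in {results_dir}"
        )
    # A single seed has no spread; report None, as the old table did.
    std = statistics.stdev(accs) if len(accs) > 1 else None
    return statistics.mean(accs), std, len(accs)


def build_conditions(sources):
    """Map (label, results_dir) pairs to (label, mean, std) tuples."""
    rows = []
    for label, results_dir in sources:
        mean, std, _ = summarize_runs(results_dir)
        rows.append((label, mean, std))
    return rows
```

Under those assumptions, a call such as `build_conditions([("BP-trainable 30ep", "results/bp_no_penalty_30ep")])` would produce a row in the same shape as the `conditions` list in the diff, and a missing results directory raises an error rather than leaving a stale value in place.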
