| author | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-08 18:16:55 -0500 |
|---|---|---|
| committer | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-08 18:16:55 -0500 |
| commit | 96bb72683e7356719f94dab15bfe3c8c4266fd88 (patch) | |
| tree | fc783e46013e38cfaa44c676a07d84a17857d67e /protocol/examples/threshold_sensitivity.py | |
| parent | 9c6d49bf201e5a9407ea02f6e7aa0d52e55f2038 (diff) | |
paper v2.31.3: §2 ¶3 per-block growth values were architecture mix-up
The paper claimed DFA/SB/CB had max-per-block growth of "237×, 12000×,
96×" on the 4-block d=256 ResMLP. Re-aggregating from the protocol audit
JSON (results/protocol_audit/audit_table_s42_s123_s456.json) gives:
  DFA d=256: max growth 2043, 979, 2545 → 3-seed mean ~1856 (≈1.9e3)
  SB  d=256: max growth 12781, 24126, 10467 → 3-seed mean ~15791 (≈1.6e4)
  CB  d=256: max growth 1820, 695, 1034 → 3-seed mean ~1183 (≈1.2e3)
The paper's "237" and "96" actually match the BatchNorm CNN audit
(audit_cnn_3seed.json gives DFA 214/235/263 → mean 237 and CB 108/90/91
→ mean 96), not the d=256 ResMLP. SB "12000" was close to ResMLP s42
single-seed (12781) but the other two values were apparently picked
from the wrong architecture. This was an architecture mix-up that
under-reported the d=256 ResMLP per-block growth by ~8× for DFA and
~12× for CB.
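The arithmetic above can be re-checked with a short sketch. It assumes nothing about the audit JSON schema (which is not shown here), so the per-seed values are hard-coded from this message rather than loaded from the files; the variable names are illustrative only.

```python
from statistics import mean

# Per-seed max-per-block growth values copied from this commit message.
# Hard-coded on purpose: the layout of the audit JSON files is not shown,
# so no loader is assumed here.
resmlp_d256 = {
    "DFA": [2043, 979, 2545],
    "SB": [12781, 24126, 10467],
    "CB": [1820, 695, 1034],
}
cnn = {
    "DFA": [214, 235, 263],
    "CB": [108, 90, 91],
}

# 3-seed means per method for each architecture.
resmlp_mean = {name: mean(vals) for name, vals in resmlp_d256.items()}
cnn_mean = {name: mean(vals) for name, vals in cnn.items()}

# Under-reporting factor: ResMLP mean divided by the CNN mean that the
# paper had mistakenly quoted for the ResMLP.
ratio = {name: resmlp_mean[name] / cnn_mean[name] for name in cnn_mean}

for name, value in resmlp_mean.items():
    print(f"ResMLP d=256 {name}: mean {value:.0f}")
for name, value in cnn_mean.items():
    print(f"BatchNorm CNN {name}: mean {value:.0f} (ratio {ratio[name]:.1f}x)")
```

With these inputs the ResMLP means round to 1856, 15791 and 1183, the CNN means round to 237 and 96, and the ratios come out at roughly 7.8 for DFA and 12.3 for CB, consistent with the ~8× and ~12× quoted above.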
Updated to the actual 3-seed mean values from the matched d=256 audit.
The DFA and CB numbers are now an order of magnitude larger and more
clearly "extreme" than the original mistaken values; SB rises from
12000 to ≈1.6e4.
The CNN per-block growth claim of "up to 237×" in §5 ¶3 (which says
"the BatchNorm CNN ... shows strong growth under DFA, with max-per-
block growth up to 237×") is correct and is left unchanged: 237 is the
right value for the CNN context.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Diffstat (limited to 'protocol/examples/threshold_sensitivity.py')
0 files changed, 0 insertions, 0 deletions
