From 02c3d2c80805daedb2b6c8e9d6e5f36c52d361a1 Mon Sep 17 00:00:00 2001 From: YurenHao0426 Date: Wed, 8 Apr 2026 06:11:25 -0500 Subject: =?UTF-8?q?Round=2036:=20upgrade=20(b)=20wording=20+=20add=20EP=20?= =?UTF-8?q?random-target=20neg=20control=20to=20=C2=A73?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two changes from round 36: 1. §3 paragraph 3: replace 'observational association' with full causal claim based on existing April 7 no-out_ln data (3 seeds, ResMLP-d256+terminal-LN removed, residual skip kept): ||h_L||=1.21e7 (Mode 1 (a) still fires) but ||g_L||=7.4e-4 (HEALTHY, ~10000x above floor — (b) eliminated). Final acc 0.327±0.013 indistinguishable from vanilla DFA's 0.308±0.014. Wording upgraded to 'terminal LayerNorm is necessary for Mode 1(b) in the audited residual ResMLP and ViT-Mini setting'. 2. §3 paragraph after random-target ablation: add EP under random targets smoke result (||h_L||=586 at ep 5 vs DFA's 14510 at ep 3, 25x gap). Random-target assay now cleanly separates fixed-feedback methods (explode) from EP (bounded). Cross-method negative control complete. - experiments/ep_baseline.py: add --random_targets flag + train_ep parameter - v2.5 paper compiles to 15 pages, main content 1-9 (right at E&D limit) Combined picture (rounds 32-36): - Mode 1 (a) localized to 'fixed-feedback local-credit objectives without scale control on architectures absorbing scale at output'. Falsified: residual skip (round 33), task signal (round 34), DFA-specific (round 35). EP is the working negative control (round 36). - Mode 1 (b) localized to terminal LayerNorm via the 1/||h|| Jacobian. Causally established by April 7 no_outln 3-seed data. Co-Authored-By: Claude Opus 4.6 (1M context) --- paper/main.pdf | Bin 465275 -> 467841 bytes 1 file changed, 0 insertions(+), 0 deletions(-) (limited to 'paper/main.pdf') diff --git a/paper/main.pdf b/paper/main.pdf index c8f6b89..b0a2973 100644 Binary files a/paper/main.pdf and b/paper/main.pdf differ -- cgit v1.2.3