From 2b4581723d0c5ed562528fac6b0a789adf95e3c5 Mon Sep 17 00:00:00 2001
From: YurenHao0426 <Blackhao0426@gmail.com>
Date: Wed, 8 Apr 2026 18:25:49 -0500
Subject: =?UTF-8?q?paper=20v2.31.8:=20Appendix=20I=20EP=20random-target=20?=
 =?UTF-8?q?=E2=80=96h=5FL=E2=80=96=20values=20from=20saved=20JSON?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Appendix I claimed EP random-target ‖h_L‖ "≈586 at 5 ep" and "≈2,085
at 100 ep" without a saved-JSON source. Re-measured on the saved
checkpoints with consistent methodology (model.eval(), n=2048 test
median), giving 557 (5 ep) and 2151 (100 ep). The discrepancy (about
5% at 5 ep, about 3% at 100 ep) is likely model.train() vs
model.eval() LN-batch-stats; the new values are reproducible.

Saved results/ep_random_h_L_summary.json as the source of truth.
The comparison against DFA's 14,510 at 3 ep still holds: the 5-ep EP
value is now "26×" smaller (was "25×" with the old EP value).

The fixed-feedback vs energy-based separation conclusion is unchanged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---
 paper/main.pdf                     | Bin 500998 -> 501174 bytes
 paper/main.tex                     |   2 +-
 results/ep_random_h_L_summary.json |  10 ++++++++++
 3 files changed, 11 insertions(+), 1 deletion(-)
 create mode 100644 results/ep_random_h_L_summary.json
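
Reviewer note: the figures quoted in the commit message can be
cross-checked against the saved JSON with a short script. The sketch
below is only an illustration, not part of the repository: the helper
name check_summary and the constants OLD_EP and DFA_H_L_3EP are
hypothetical, and the JSON is inlined from this patch rather than read
from disk.

```python
import json

# Contents of results/ep_random_h_L_summary.json as added by this patch
# (inlined here so the sketch runs without the repository checkout).
SUMMARY_JSON = """
{
  "EP_random_5ep": {
    "h_L_median": 556.95361328125,
    "h_L_mean": 554.086181640625
  },
  "EP_random_100ep": {
    "h_L_median": 2151.283935546875,
    "h_L_mean": 2126.281982421875
  }
}
"""

# Hypothetical constants: values quoted in the commit message and Appendix I.
DFA_H_L_3EP = 14510.0  # DFA random-target ||h_L|| at 3 epochs
OLD_EP = {"EP_random_5ep": 586.0, "EP_random_100ep": 2085.0}  # replaced values


def check_summary(summary_text=SUMMARY_JSON):
    """Return, per run, the rounded median, its change vs the old value,
    and how many times smaller it is than the DFA 3-epoch value."""
    summary = json.loads(summary_text)
    report = {}
    for key, entry in summary.items():
        new = entry["h_L_median"]
        old = OLD_EP[key]
        report[key] = {
            "rounded": round(new),
            "rel_change": (new - old) / old,
            "dfa_ratio": DFA_H_L_3EP / new,
        }
    return report


if __name__ == "__main__":
    for key, row in check_summary().items():
        print(
            f"{key}: median={row['rounded']}, "
            f"change vs old={row['rel_change']:+.1%}, "
            f"DFA/EP={row['dfa_ratio']:.1f}x"
        )
```

The "26x" figure in the paper text corresponds to the 5-epoch median;
the 100-epoch median gives a smaller ratio, so the comparison should be
read as 5-ep EP against 3-ep DFA.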

diff --git a/paper/main.pdf b/paper/main.pdf
index e744481..422ff34 100644
Binary files a/paper/main.pdf and b/paper/main.pdf differ
diff --git a/paper/main.tex b/paper/main.tex
index 0ff0e13..f40cab8 100644
--- a/paper/main.tex
+++ b/paper/main.tex
@@ -484,7 +484,7 @@ Credit Bridge  & $19{,}974$ & $3.2\times 10^{-6}$ & $0.092$ \\
 
 The cross-method version of the test rules out the explanation that the random-target growth is specific to DFA's particular feedback projection. State Bridge and Credit Bridge use bridge constructions with target normalization and stop-gradients, so any residual-stream growth they exhibit cannot be attributed to a simple absence of normalization. Their $\|g_L\|$ values at three epochs are still well above the $10^{-7}$ floor used by diagnostic~(b), so the gradient collapse part of Mode~1 does not yet appear at this horizon for SB/CB; the activation-growth part of Mode~1 is already present. At the full $100$-epoch trajectory of the same random-target protocol, both SB and CB also reach the (b) floor: SB converges to $\|h_L\|\approx 3.6\times 10^5$ and $\|g_L\|\approx 4\times 10^{-8}$, and CB converges to $\|h_L\|\approx 1.38\times 10^8$ and $\|g_L\|\approx 0$ (below the numerical clamp), with test accuracies $0.100$ and $0.085$ respectively, consistent with DFA's $1.67\times 10^8$ and $8.0\times 10^{-12}$ at the same horizon. We treat this as evidence that the local-credit growth incentive is not unique to DFA but is shared by the audited family of fixed-feedback methods.
 
-The cleanest negative control for the random-target assay is Equilibrium Propagation, which trains the same backbone with a contrastive nudged-vs-free local energy objective rather than a fixed feedback projection. We re-ran EP on the same ResMLP-d256 with i.i.d.\ random class targets, seed 42, identical hyperparameters: EP's $\|h_L\|$ stays at about $586$ at five epochs of training and converges to about $2{,}085$ over the full $100$-epoch trajectory, which is roughly $25\times$ smaller than DFA's $14{,}510$ at three epochs and is in the same range as vanilla EP's bounded trajectory on real labels ($\sim\!5\times 10^3$). At convergence, the random-target EP run reaches headline accuracy $0.081$, headline $\Gamma{=}{-}0.0003$, and headline $\rho{=}{-}0.006$, all consistent with chance-level performance and a non-degenerate measurement regime. The random-target assay therefore separates the audited fixed-feedback methods (DFA/SB/CB) from EP cleanly: fixed-feedback objectives without an explicit scale-control term exhibit data-agnostic activation growth on this architecture, while EP's energy-based local objective does not.
+The cleanest negative control for the random-target assay is Equilibrium Propagation, which trains the same backbone with a contrastive nudged-vs-free local energy objective rather than a fixed feedback projection. We re-ran EP on the same ResMLP-d256 with i.i.d.\ random class targets, seed 42, identical hyperparameters: EP's $\|h_L\|$ stays at about $557$ at five epochs of training and converges to about $2{,}151$ over the full $100$-epoch trajectory (median over $n{=}2048$ test inputs, model in eval mode; see \texttt{results/ep\_random\_h\_L\_summary.json}), which is roughly $26\times$ smaller than DFA's $14{,}510$ at three epochs and is in the same range as vanilla EP's bounded trajectory on real labels ($\sim\!5\times 10^3$). At convergence, the random-target EP run reaches headline accuracy $0.081$, headline $\Gamma{=}{-}0.0003$, and headline $\rho{=}{-}0.006$, all consistent with chance-level performance and a non-degenerate measurement regime. The random-target assay therefore separates the audited fixed-feedback methods (DFA/SB/CB) from EP cleanly: fixed-feedback objectives without an explicit scale-control term exhibit data-agnostic activation growth on this architecture, while EP's energy-based local objective does not.
 
 \section{State Bridge and Credit Bridge Penalty Rescue: 3-Seed Cross-Method Test}
 \label{app:sb_penalty}
diff --git a/results/ep_random_h_L_summary.json b/results/ep_random_h_L_summary.json
new file mode 100644
index 0000000..96449e0
--- /dev/null
+++ b/results/ep_random_h_L_summary.json
@@ -0,0 +1,10 @@
+{
+  "EP_random_5ep": {
+    "h_L_median": 556.95361328125,
+    "h_L_mean": 554.086181640625
+  },
+  "EP_random_100ep": {
+    "h_L_median": 2151.283935546875,
+    "h_L_mean": 2126.281982421875
+  }
+}
\ No newline at end of file
-- 
cgit v1.2.3