Round 38 paper update: §4 + §5 + new Appendix K with SB+penalty 3-seed result

CODEX ROUND 39 VERDICT: PAPER-CHANGING for SB; wait for CB multi-seed for CB claims.

Round 38 3-seed SB+penalty (4-block d=256, 30ep, lam=1e-2):
- acc 0.453±0.003 (BEATS shallow baseline 0.349 by +10.4pp -- FIRST non-BP method)
- ||h_L||=302±8 (contained, not silenced)
- ||g_L||=1.8e-4 (HEALTHY)
- deep cos +0.322±0.007 (2x DFA+pen +0.155)
- deep rho +0.402±0.015 (5x DFA+pen +0.080)

Penalty rescue magnitudes (method-dependent):
- DFA: +5.5 pp (0.306 -> 0.363)
- SB: +24 pp (0.213 -> 0.453)
- CB: +15 pp (single seed, multi-seed in flight)
- BP: -8 pp (capacity cost, 0.609 -> 0.530)

Paper updates:
- §4 ¶4 NEW: Mode 2 has method-dependent severity within fixed-feedback family; SB+penalty is the first audited non-BP method to substantively use deep blocks via intervention; deep cos doesn't predict acc across methods (methodological obs)
- §5 ¶3 EXTENDED: BP+penalty -> 3x penalty control (BP, DFA, SB) with all margins vs frozen-blocks baseline; BP-to-SB gap only 7.7 pp vs BP-to-DFA gap 17 pp
- Appendix K NEW: full SB+penalty 3-seed table with vanilla SB and DFA+pen comparison

Main content stays at 9 pages exactly (within E&D limit). Total 16 pages.

CB multi-seed (s123, s456) launched in parallel (PIDs 576938, 576939) -- claims deferred until those land.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
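The rescue magnitudes, frozen-baseline margins, and BP-to-method gaps quoted in this commit can be rechecked with a few lines of arithmetic. The following is a minimal Python sketch and is not part of the repository: the dictionary layout and the helper names `pp` and `summarize` are hypothetical, and the accuracies are copied from the updated Section 5 text. That text gives vanilla DFA as 0.308; the message above lists 0.306, which would make the DFA rescue +5.7 pp rather than +5.5 pp.

```python
# Accuracies as quoted in the updated Section 5 paragraph:
# method -> (accuracy without penalty, accuracy with lambda = 1e-2).
# All names here are illustrative, not taken from the project's code.
FROZEN_BASELINE = 0.349  # architecture-matched frozen-blocks shallow baseline

ACCURACY = {
    "BP": (0.609, 0.530),
    "DFA": (0.308, 0.363),
    "SB": (0.213, 0.453),
}


def pp(delta):
    """Convert an accuracy difference to percentage points (one decimal)."""
    return round(100.0 * delta, 1)


def summarize(acc=ACCURACY, baseline=FROZEN_BASELINE):
    """Return rescue, margin over frozen baseline, and gap to BP+penalty."""
    bp_penalized = acc["BP"][1]
    rows = {}
    for method, (vanilla, penalized) in acc.items():
        rows[method] = {
            "rescue_pp": pp(penalized - vanilla),
            "margin_vs_frozen_pp": pp(penalized - baseline),
            "gap_to_bp_pen_pp": pp(bp_penalized - penalized),
        }
    return rows


if __name__ == "__main__":
    for method, row in summarize().items():
        print(method, row)
```

Under these inputs the BP-to-DFA gap evaluates to 16.7 pp, which the paper text rounds to 17, and the BP penalty cost to 7.9 pp, which it reports as about 8.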
author: YurenHao0426 <Blackhao0426@gmail.com> 2026-04-08 08:28:08 -0500
committer: YurenHao0426 <Blackhao0426@gmail.com> 2026-04-08 08:28:08 -0500
commit: baa2827a91c931f0b886c8946ebb4a5eb424f853 (patch)
tree: edf87f73474fd3ec3919869fa71c45c992498bb4
parent: aa12974e22de1887b636219096a02c44c595dcf7 (diff)
2 files changed, 31 insertions, 1 deletions
diff --git a/paper/main.pdf b/paper/main.pdf
index 3be679b..6f64e3b 100644
--- a/paper/main.pdf
+++ b/paper/main.pdf
diff --git a/paper/main.tex b/paper/main.tex
index 34efb5b..c4c54b3 100644
--- a/paper/main.tex
+++ b/paper/main.tex
@@ -91,6 +91,8 @@ A second metric with different numerical failure modes tells the same story. Cos
 
 Per-layer reporting is therefore not cosmetic. In ResMLP under vanilla DFA, the headline aggregate alignment $\Gamma \approx 0.07$--$0.10$ can look mildly positive only because layer $0$ remains strongly aligned while the deep network is not: at the same early checkpoints where layers $1$--$4$ are essentially zero, layer $0$ has cosine $+0.42$, $+0.45$, and $+0.39$ across seeds (Table~\ref{tab:mode_validation}). The resulting average can therefore be driven by the embedding layer even when the interior blocks are effectively unaligned, so aggregate reporting obscures the very distinction needed to separate ``measurement collapse'' from ``poor credit direction.'' This layer-$0$ dominance is specific to the ResMLP DFA setting; on ViT-Mini DFA, all layers are near zero, which strengthens the broader methodological point that alignment should be reported per layer rather than only in aggregate. With the two modes separated observationally, the remaining question is whether intervention can move them independently.
 
+Mode~2 has method-dependent severity within the audited fixed-feedback family once Mode~1 is alleviated. Applying to State Bridge the same per-block scale-control penalty $\lambda{=}10^{-2}$ that rescued DFA, on the same 4-block $d{=}256$ ResMLP backbone over $30$ epochs and three seeds, gives a converged test accuracy of $0.453 \pm 0.003$ and a deep mean cosine of $+0.322 \pm 0.007$ with deep mean $\rho$ of $+0.402 \pm 0.015$, while DFA under the same intervention reaches only $0.363 \pm 0.001$ with deep cosine $+0.155 \pm 0.025$ and deep $\rho$ $+0.080 \pm 0.011$ (Table~\ref{tab:mode_validation}; Appendix~\ref{app:sb_penalty}). The State Bridge penalty rescue is roughly $24$ percentage points above the vanilla State Bridge baseline of $0.213$ on the same architecture (single seed, 42) and, more importantly for the paper's central walk-back, exceeds the architecture-matched frozen-blocks shallow baseline of $0.349$ by $+10.4$ percentage points. State Bridge with the penalty intervention is therefore the first audited non-BP method whose trained deep blocks substantively improve over an architecture-matched random-block baseline; this margin is smaller than, though of the same order as, BP+penalty's $+18.1$ pp over the same shallow baseline. Neither the activation scale nor the deep BP gradient magnitude is silenced under the penalty: $\|h_L\|$ stays at $302 \pm 8$ and $\|g_L\|$ at $\sim\!1.8\times 10^{-4}$, both well within the meaningful-measurement regime, so the recovered deep cosine is computed against an informative reference and not against a numerical floor. Within this rescued regime, deep cosine is positive but does not by itself predict end-task accuracy across methods, which strengthens the broader methodological point that alignment must be reported jointly with measurement validity and a depth-utilization baseline rather than as a single headline number.
+
 \section{Intervention and Cross-Architecture Evidence}
 \label{sec:validation}
 
@@ -115,7 +117,7 @@ Fresh-$B$ null control & $\overline{\cos}_{deep}{=}+0.002{\pm}0.022$ ($n{=}20$ d
 
 Once the reference vector is meaningful again, the deep layers no longer sit exactly at null. At $\lambda{=}10^{-2}$, penalized DFA reaches a three-seed deep-layer mean cosine of $+0.155 \pm 0.025$ and deep perturbation correlation of $+0.080 \pm 0.011$, whereas vanilla DFA is essentially zero on both metrics in the deep blocks, consistent with prior concerns that alternative feedback can fail by supplying poor credit directions even before full collapse \citep{bartunov2018assessing,moskovitz2018feedback,crafton2019backpropagation,refinetti2023aligning}. The null calibration rules out the interpretation that this recovered signal is merely measurement noise: on the same penalized checkpoint, replacing the training-time feedback matrices with 20 fresh random $B_l$ draws gives a deep cosine of only $+0.002 \pm 0.022$, with per-layer standard deviations of $0.013$--$0.023$, all within noise of zero (Table~\ref{tab:mode_validation}). The $\lambda$ sweep sharpens the dissociation further: at $\lambda{=}10^{-4}$, Mode~1 is already alleviated, with $\|h_L\|{=}2.4\times 10^4$ and $\|g_L\|{=}6.3\times 10^{-7}$, but deep cosine remains $-0.022$, while at $\lambda{=}10^{-2}$ it rises to $+0.165$ and deep $\rho$ to $+0.091$ (Figure~\ref{fig:penalty_rescue}). The improvement is real, but it is only partial.
 
-A rescue intervention is only informative if its direct cost is controlled. The relevant control is BP trained under the same penalty: BP falls from $0.609 \pm 0.004$ without the penalty to $0.530$ with $\lambda{=}10^{-2}$, so the penalty has a direct cost of about $8$ percentage points even when credit assignment is correct, whereas DFA moves in the opposite direction, from $0.308 \pm 0.014$ to $0.363 \pm 0.001$ under the same intervention (Figure~\ref{fig:penalty_rescue}). Relative to the frozen-blocks baseline of $0.349$, BP+penalty still retains a margin of $+18.1$ points, while DFA+penalty retains only $+1.4$ points. The remaining gap, $0.530 - 0.363 = 17$ points, is therefore a lower bound on the part of DFA's deficit that is not explained by simple penalty-induced capacity loss alone, though not a clean isolation because BP uses an end-to-end loss whereas DFA uses block-local losses. The residual gap after that control is what keeps Mode~2 substantively alive.
+A rescue intervention is only informative if its direct cost is controlled. The relevant control is BP trained under the same penalty: BP falls from $0.609 \pm 0.004$ without the penalty to $0.530$ with $\lambda{=}10^{-2}$, so the penalty has a direct cost of about $8$ percentage points even when credit assignment is correct, whereas DFA moves in the opposite direction, from $0.308 \pm 0.014$ to $0.363 \pm 0.001$, and State Bridge moves further still, from $0.213$ to $0.453 \pm 0.003$ (three seeds), under the same intervention (Figure~\ref{fig:penalty_rescue}; Appendix~\ref{app:sb_penalty}). Relative to the frozen-blocks baseline of $0.349$, BP+penalty retains a margin of $+18.1$ points, State Bridge+penalty retains $+10.4$ points, and DFA+penalty retains only $+1.4$ points. The remaining BP-to-DFA gap of $17$ points is therefore a lower bound on the part of DFA's deficit that is not explained by simple penalty-induced capacity loss alone, though not a clean isolation because BP uses an end-to-end loss whereas DFA uses block-local losses. The substantially smaller BP-to-State-Bridge gap of $0.530 - 0.453 = 7.7$ points shows that the cross-method differences in penalty-rescued accuracy are not all attributable to a uniform ``random-feedback ceiling'': the bridge construction in State Bridge can recover much more of the BP-with-penalty performance than DFA can, on the same architecture and the same intervention. The residual gap after that control is what keeps Mode~2 substantively alive while letting it have method-dependent severity.
 
 The architecture comparison sharpens the scope of the critique. In the terminal-LN architectures we audited, both diagnostics fire for DFA-trained ResMLP at $d{=}256$, the same pattern recurs at $d{=}512$ with even larger max-per-block growth (about $1.5\times 10^4$), and ViT-Mini with a class token and terminal LN shows diagnostic~(a) by epoch~1 and diagnostic~(b) by epochs~2--3 (Figure~\ref{fig:temporal_cross_arch}). A depth sweep on the $d{=}512$ ResMLP at $L \in \{2,4,6,8,12\}$ shows that the layerwise pattern is essentially depth-invariant: DFA's layer-0 cosine stays in $[+0.39,+0.40]$ across all five depths, while its mean deep-layer cosine stays within $[-0.005,+0.000]$ and its deep perturbation correlation collapses to $0.000$ in every depth tested, even though BP retains a deep-layer cosine of $+0.94$ at $L{=}12$ (Appendix~\ref{app:depth_scan}). The deep credit signal does not improve when the network is shallower, so the failure is not a ``too deep'' artifact. In the non-terminal-LN controls, the pattern is different: StudentNet shows diagnostic~(a) only at epochs~14--25 while diagnostic~(b) never fires across $100$ epochs and three seeds, and the BatchNorm CNN on CIFAR-10 likewise shows strong growth under DFA, with max-per-block growth up to $237\times$, but keeps deepest BP gradients around $\|g\| \sim 10^{-3}$ and never triggers diagnostic~(b) (Figure~\ref{fig:temporal_cross_arch}). BP never triggers either diagnostic in any audited architecture. The matched same-backbone ResMLP-d256 ablation in Section~\ref{sec:mode1} supplies the cleanest causal control: removing terminal LayerNorm from the same architecture preserves activation growth but eliminates the gradient floor, so diagnostic~(b) is necessary on terminal-LN ResMLP and is not just an architecture-class coincidence. The broader claim therefore holds at full strength inside the audited residual ResMLP and ViT-Mini regime, while diagnostic~(a) remains useful more broadly. This lets the paper end with a reporting rule rather than an overclaimed theory.
 
@@ -458,6 +460,34 @@ The cross-method version of the test rules out the explanation that the random-t
 
 The cleanest negative control for the random-target assay is Equilibrium Propagation, which trains the same backbone with a contrastive nudged-vs-free local energy objective rather than a fixed feedback projection. We re-ran EP on the same ResMLP-d256 with i.i.d.\ random class targets, seed 42, identical hyperparameters: at five epochs of training, EP's $\|h_L\|$ stays at about $586$, $25\times$ smaller than DFA's $14{,}510$ at three epochs and consistent with vanilla EP's bounded trajectory on real labels (Table~\ref{tab:random_targets_sbcb_smoke} extension). The random-target assay therefore separates the audited fixed-feedback methods (DFA/SB/CB) from EP cleanly: fixed-feedback objectives without an explicit scale-control term exhibit data-agnostic activation growth on this architecture, while EP's energy-based local objective does not.
 
+\section{State Bridge Penalty Rescue: 3-Seed Cross-Method Test}
+\label{app:sb_penalty}
+
+To test whether the per-block scale-control penalty $\lambda \,\mathrm{mean}(\|f_l(h_l)\|^2)$ that rescues DFA in Section~\ref{sec:validation} also rescues other audited fixed-feedback local-credit methods, we re-ran State Bridge on the standard $4$-block $d{=}256$ pre-LayerNorm ResMLP for $30$ epochs and three seeds (42, 123, 456), with $\lambda{=}10^{-2}$ added to the State Bridge per-block local loss only (the bridge state predictor and the embedding/head paths are not penalized, matching the DFA rescue setup). We also ran a matched vanilla State Bridge baseline at seed 42 with the same architecture and training schedule but $\lambda{=}0$. Three-seed converged values:
+
+\begin{table}[h]
+\centering
+\small
+\caption{State Bridge with the same per-block scale-control penalty $\lambda{=}10^{-2}$ that rescues DFA in Section~\ref{sec:validation}, on the 4-block $d{=}256$ pre-LayerNorm ResMLP, 30 epochs, three seeds. SB+penalty reaches a converged test accuracy of $0.453 \pm 0.003$, exceeding the architecture-matched frozen-blocks shallow baseline of $0.349$ by $+10.4$ percentage points and the DFA+penalty value of $0.363 \pm 0.001$ by $+9.0$ percentage points. The deep mean cosine and deep mean perturbation correlation are roughly $2\times$ and $5\times$ the corresponding DFA+penalty values respectively, while the residual stream is contained but not silenced ($\|h_L\|\!\approx\!302$, $\|g_L\|\!\approx\!1.8\times 10^{-4}$). Vanilla SB on the same architecture and seed reaches only $0.213$, with $\|h_L\|\!\approx\!9.85\times 10^6$ and $\|g_L\|$ at the diagnostic-(b) floor.}
+\label{tab:sb_penalty}
+\begin{tabular}{lrrrrr}
+\toprule
+run (seed) & test acc & $\|h_L\|$ & $\|g_L\|$ & deep cos & deep $\rho$ \\
+\midrule
+SB+pen $42$ & $0.4564$ & $302$ & $1.75\times 10^{-4}$ & $+0.312$ & $+0.392$ \\
+SB+pen $123$ & $0.4514$ & $311$ & $1.74\times 10^{-4}$ & $+0.327$ & $+0.424$ \\
+SB+pen $456$ & $0.4509$ & $292$ & $1.92\times 10^{-4}$ & $+0.326$ & $+0.391$ \\
+\midrule
+SB+pen mean & $0.453 \pm 0.003$ & $302 \pm 8$ & $1.80\times 10^{-4}$ & $+0.322 \pm 0.007$ & $+0.402 \pm 0.015$ \\
+\midrule
+vanilla SB $42$ & $0.213$ & $9.85\times 10^6$ & $1\times 10^{-8}$ & --- & --- \\
+DFA+pen mean (3 seeds) & $0.363 \pm 0.001$ & $4.0\times 10^4$ & $9.0\times 10^{-7}$ & $+0.155 \pm 0.025$ & $+0.080 \pm 0.011$ \\
+\bottomrule
+\end{tabular}
+\end{table}
+
+The penalty rescue effect on State Bridge is much larger than on DFA: $+24$ percentage points for State Bridge versus $+5.5$ percentage points for DFA on the same architecture and intervention. SB+penalty is the first audited non-BP method whose trained deep blocks substantively beat the architecture-matched random-block baseline. We treat this as evidence that Mode~2 (low intrinsic credit-direction quality) has method-dependent severity within the audited fixed-feedback family once Mode~1 is alleviated, rather than being a uniform property of all fixed-feedback local-credit objectives. Importantly, State Bridge's deep cosine $+0.322$ is approximately twice DFA's $+0.155$ under the same intervention, but neither approaches the BP reference value of $\approx +1.0$, so this is a within-class gradation in credit-direction quality, not a claim that bridge constructions ``solve'' Mode~2. The test of whether Credit Bridge shows a similar within-class gradation under the same intervention is in flight at the time of writing; results will be reported as a multi-seed extension of Table~\ref{tab:sb_penalty}.
+
 \section{Reproducibility}
 \label{app:reproducibility}
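The mean and spread row of the SB+penalty table added in this patch can be recomputed from its three per-seed rows. The following is a minimal Python sketch under stated assumptions: the data layout and the names `SB_PEN_RUNS`, `aggregate`, and `ratio_to_dfa` are hypothetical, and the patch does not say which standard-deviation convention was used, so both the population and the sample estimate are returned. From the rounded table entries, the reported spreads for $\|h_L\|$, deep cos, and deep $\rho$ agree with the population convention, while the accuracy spread comes out near 0.0025 (population) or 0.0030 (sample); the reported $\pm 0.003$ may therefore reflect unrounded per-seed values.

```python
from statistics import mean, pstdev, stdev

# Per-seed SB+penalty values copied from the table rows (seeds 42, 123, 456).
SB_PEN_RUNS = {
    42: {"acc": 0.4564, "h_norm": 302.0, "g_norm": 1.75e-4,
         "deep_cos": 0.312, "deep_rho": 0.392},
    123: {"acc": 0.4514, "h_norm": 311.0, "g_norm": 1.74e-4,
          "deep_cos": 0.327, "deep_rho": 0.424},
    456: {"acc": 0.4509, "h_norm": 292.0, "g_norm": 1.92e-4,
          "deep_cos": 0.326, "deep_rho": 0.391},
}

# DFA+penalty three-seed means, as quoted in the same table.
DFA_PEN_MEAN = {"deep_cos": 0.155, "deep_rho": 0.080}


def aggregate(runs, key):
    """Return (mean, population std, sample std) of one metric across seeds."""
    values = [run[key] for run in runs.values()]
    return mean(values), pstdev(values), stdev(values)


def ratio_to_dfa(runs, key, reference=DFA_PEN_MEAN):
    """Ratio of the SB+penalty seed mean to the DFA+penalty mean."""
    sb_mean, _, _ = aggregate(runs, key)
    return sb_mean / reference[key]


if __name__ == "__main__":
    for key in ("acc", "h_norm", "g_norm", "deep_cos", "deep_rho"):
        m, pop, samp = aggregate(SB_PEN_RUNS, key)
        print(f"{key}: mean={m:.4g} pstdev={pop:.2g} stdev={samp:.2g}")
    for key in ("deep_cos", "deep_rho"):
        print(f"{key} ratio SB/DFA: {ratio_to_dfa(SB_PEN_RUNS, key):.2f}")
```

The two ratios correspond to the "roughly $2\times$" and "$5\times$" statements in the table caption.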