Round 29: fill in §4 Failure Mode 2 prose (3 paragraphs) via codex

author: YurenHao0426 <Blackhao0426@gmail.com> 2026-04-08 04:57:50 -0500
committer: YurenHao0426 <Blackhao0426@gmail.com> 2026-04-08 04:57:50 -0500
commit: 1685abecc234d63daede821bc41d90a98c576528 (patch)
tree: 4c823c6a59083b6385af5fff4299377d46f111f9 /paper/main.tex
parent: 8f52b493dbf1a21a762dd2c9e924bbbbebeb911d (diff)
1 files changed, 3 insertions, 5 deletions
diff --git a/paper/main.tex b/paper/main.tex
index 19324ec..dc1ba7c 100644
--- a/paper/main.tex
+++ b/paper/main.tex
@@ -87,13 +87,11 @@ The collapse is not a late-epoch curiosity. For vanilla DFA on the ResMLP tempor
 \section{Failure Mode 2: Low Intrinsic Credit-Direction Quality}
 \label{sec:mode2}
 
-The second failure mode is low intrinsic credit-direction quality on the deep blocks even when the BP reference gradient is still in a meaningful regime. % TODO: evidence sentence % TODO: closing sentence
+The second failure mode appears even in the meaningful-measurement regime. At the earliest vanilla DFA checkpoints on ResMLP, the hidden backpropagated gradient at the first deep block remains above the numerical floor: at epoch 1, $\|g_2\|$ is $6.7\times 10^{-7}$, $6.5\times 10^{-7}$, and $3.9\times 10^{-7}$ across the three seeds, all above the $10^{-7}$ threshold used to distinguish measurable from collapsed gradients. Yet the corresponding deep-layer cosine values are already essentially null: across layers $1$--$4$, all seed-level measurements at epoch 1 lie in $[-0.04,+0.02]$, with a three-seed mean of $-0.008 \pm 0.013$, and by epoch 2 the deep mean is still only $-0.018 \pm 0.018$ (Table~\ref{tab:mode_validation}). This is the observational pattern predicted by low credit-direction quality rather than mere disappearance of signal: the gradient is still present enough to measure, but the directions delivered to the deep network carry little agreement with backpropagation, consistent with prior concerns that alternative feedback rules can fail by supplying poor credit assignments even before full collapse \citep{bartunov2018assessing,moskovitz2018feedback,crafton2019backpropagation,refinetti2023align}. A measurable gradient paired with a null cosine rules out the simplest objection, namely that the deep-layer null result is merely a byproduct of collapse.
 
-This mode appears most clearly in early-epoch or partially rescued settings, where the deepest-layer BP gradient remains measurable yet the random-feedback credit signal is still close to null or unstable, implying that the method is failing as a direction estimator rather than merely being scored with a broken ruler \citep{bartunov2018assessing,moskovitz2018feedback,crafton2019backpropagation}. % TODO: evidence sentence % TODO: closing sentence
+A second metric with different numerical failure modes tells the same story. Cosine measures directional agreement with the BP gradient, whereas perturbation correlation $\rho$ measures whether the proposed update predicts the sign and relative magnitude of the loss change under actual perturbations, so the two metrics are not exposed to the same normalization and small-denominator effects. In our controls, $\rho$ behaves as expected, with a Taylor-ceiling positive control near $+0.997$ and a random-vector negative control near $+0.006$ (Figure~\ref{fig:penalty_rescue}, Table~\ref{tab:mode_validation}). On vanilla DFA, deep $\rho$ is likewise null: for the early checkpoints where the gradients remain measurable, the deep average is $-0.003 \pm 0.005$ across seeds and epochs, and in a floor-level checkpoint it is $+0.002$, again indistinguishable from noise. The agreement between cosine and $\rho$ therefore rules out the interpretation that the null deep result is an artifact of cosine's $\varepsilon$-clamp or vector normalization. The deep blocks are not just hard to measure; the directions they receive carry little usable credit signal.
 
-The conceptual payoff of the paper is that these are mechanistically distinct failures that the status-quo pair collapses into one ambiguous story about undertraining. % TODO: evidence sentence % TODO: closing sentence
-
-Separating the modes matters because the interventions differ: numerical rescue can restore measurability without producing strong deep credit directions, while better direction quality would need to improve alignment even before any measurement-floor pathology is present. % TODO: evidence sentence % TODO: closing sentence
+Per-layer reporting is therefore not cosmetic. In ResMLP under vanilla DFA, the headline aggregate alignment $\Gamma \approx 0.07$--$0.10$ can look mildly positive only because layer $0$ remains strongly aligned while the deep blocks do not: at the same early checkpoints where layers $1$--$4$ are essentially zero, layer $0$ has cosine $+0.42$, $+0.45$, and $+0.39$ across seeds (Table~\ref{tab:mode_validation}). The resulting average can therefore be driven by the embedding layer even when the interior blocks are effectively unaligned, so aggregate reporting obscures the very distinction needed to separate ``measurement collapse'' from ``poor credit direction.'' This layer-$0$ dominance is specific to the ResMLP DFA setting; on ViT-Mini DFA, all layers are near zero, which strengthens the broader methodological point that alignment should be reported per layer rather than only in aggregate. With the two modes separated observationally, the remaining question is whether intervention can move them independently.
 
 \section{Intervention and Cross-Architecture Evidence}
 \label{sec:validation}
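Reviewer note on the two metrics contrasted in the added second paragraph: the sketch below is a minimal, stdlib-only illustration of how an $\varepsilon$-clamped cosine and a perturbation correlation $\rho$ could be computed, and why they fail differently. It is not the paper's implementation. The function names, the clamp value `eps`, the Gaussian perturbation scale `sigma`, the sample count, and the toy quadratic loss are all assumptions introduced here for illustration; the paper's exact definition of $\rho$ may differ.

```python
import math
import random


def clamped_cosine(a, b, eps=1e-12):
    """Cosine similarity with the denominator clamped at eps (assumed form)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # At floor-scale norms the clamp dominates and pulls the value toward 0,
    # even when the two vectors point in exactly the same direction.
    return dot / max(norm_a * norm_b, eps)


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    denom = math.sqrt(var_x * var_y)
    return cov / denom if denom > 0.0 else 0.0


def perturbation_correlation(loss_fn, theta, direction,
                             n_samples=400, sigma=1e-3, seed=0):
    """Correlate first-order predicted loss change with the actual change.

    Hypothetical definition: for random perturbations delta, compare
    <direction, delta> against loss_fn(theta + delta) - loss_fn(theta).
    """
    rng = random.Random(seed)
    base = loss_fn(theta)
    predicted, actual = [], []
    for _ in range(n_samples):
        delta = [rng.gauss(0.0, sigma) for _ in theta]
        predicted.append(sum(d * g for d, g in zip(delta, direction)))
        perturbed = [t + d for t, d in zip(theta, delta)]
        actual.append(loss_fn(perturbed) - base)
    return pearson(predicted, actual)


# Toy setting (assumption): quadratic loss whose exact gradient is theta.
def quadratic_loss(theta):
    return 0.5 * sum(t * t for t in theta)


_rng = random.Random(1)
theta0 = [_rng.uniform(-1.0, 1.0) for _ in range(200)]
true_grad = list(theta0)                                # positive control
random_dir = [_rng.gauss(0.0, 1.0) for _ in range(200)]  # negative control

rho_true = perturbation_correlation(quadratic_loss, theta0, true_grad)
rho_rand = perturbation_correlation(quadratic_loss, theta0, random_dir)

# Perfectly aligned but floor-scale vectors: the clamp hides the alignment.
tiny = [1e-9] * 4
cos_floor = clamped_cosine(tiny, tiny, eps=1e-12)
```

In this toy setting the true gradient plays the role of the Taylor-ceiling positive control and the random vector the role of the negative control. The last two lines show the failure mode that $\rho$ does not share: a clamped cosine can read near zero for perfectly aligned vectors once their norms drop to floor scale, which is why agreement between the two metrics is informative.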