| author | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-08 21:21:22 -0500 |
|---|---|---|
| committer | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-08 21:21:22 -0500 |
| commit | 4bbe7f0e7b9985f790b528f639bde39717a8f379 (patch) | |
| tree | e1053f360b912ef08591735fa7ba77f398e6c73e | |
| parent | 5995929511404ba3e0b8b4f1bfef69dbf291c7a9 (diff) | |
paper v2.37.1: abstract mentions nudging + training-loss confirmation
Earlier (during the page-budget-constrained polish loop) I tried to add
the nudging-test mention to the abstract but had to revert because it
pushed §7 onto p10. With page budget relaxed, re-attempting the update.
Old abstract sentence about Mode 2 dissociation:
"...while Credit Bridge attains much higher deep BP cosine than DFA
at the same final accuracy, a dissociation that motivates reporting
layerwise credit quality jointly with a depth-utilization baseline."
New abstract sentence:
"...while Credit Bridge attains roughly 4× DFA's deep BP cosine yet
matches DFA's accuracy—a dissociation that single-step nudging and
integrated training-loss decrease both confirm against the reverse
cosine ordering, and that motivates reporting layerwise credit quality
jointly with a depth-utilization baseline."
This now references the v2.33 functional triangulation in the abstract,
matching the §4 main-text framing. A reader of just the abstract now
sees the strongest form of the cos-vs-acc dissociation: it's not just
"CB has higher cos but same acc" (which could be a noisy single
measurement) but "three independent functional metrics rank the
methods opposite to deep cosine".
Page count: 20 (unchanged).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| -rw-r--r-- | paper/main.pdf | bin | 537138 -> 537258 bytes |
| -rw-r--r-- | paper/main.tex | 2 |
2 files changed, 1 insertions, 1 deletions
diff --git a/paper/main.pdf b/paper/main.pdf
Binary files differ
index 30f1465..631684a 100644
--- a/paper/main.pdf
+++ b/paper/main.pdf
diff --git a/paper/main.tex b/paper/main.tex
index a63d0b5..039ce2a 100644
--- a/paper/main.tex
+++ b/paper/main.tex
@@ -27,7 +27,7 @@
 \maketitle
 \begin{abstract}
-Modern feedback-alignment evaluation on deep residual networks is still summarized by a deceptively simple pair: headline accuracy and headline cosine alignment $\Gamma$ to the backpropagation gradient. We show that this pair can silently fail in two distinct ways on standard CIFAR-10 pre-LayerNorm ResMLP and ViT-Mini settings: first, \emph{measurement degeneracy}, where residual-stream growth drives hidden-layer BP gradients to the numerical floor and makes $\Gamma$ uninterpretable; and second, \emph{low intrinsic credit-direction quality}, where random-feedback credit remains essentially unaligned with BP on the deep blocks even when the reference gradient is still meaningful. The headline result is that the field-standard reporting pair walks back none of the methods we audit, whereas a four-diagnostic protocol walks back the three degenerate methods and passes the two trustworthy controls. Intervention with a per-block scale-control penalty further reveals method-dependent severity within the audited fixed-feedback family: State Bridge then exceeds the architecture-matched frozen-blocks baseline by about $10$ percentage points, while Credit Bridge attains much higher deep BP cosine than DFA at the same final accuracy, a dissociation that motivates reporting layerwise credit quality jointly with a depth-utilization baseline. Our contribution is an evaluation methodology paper for the NeurIPS 2026 Evaluations \& Datasets track: we provide the protocol, the calibration logic for its thresholds, a reference implementation, a five-method audit, and validation through temporal replay, cross-architecture checks, intervention-based disambiguation, and a documented catalog of pipeline pitfalls, in the spirit of critical evaluation analyses such as \citet{jordan2020evaluating,obray2022evaluation,paleka2026pitfalls}.
+Modern feedback-alignment evaluation on deep residual networks is still summarized by a deceptively simple pair: headline accuracy and headline cosine alignment $\Gamma$ to the backpropagation gradient. We show that this pair can silently fail in two distinct ways on standard CIFAR-10 pre-LayerNorm ResMLP and ViT-Mini settings: first, \emph{measurement degeneracy}, where residual-stream growth drives hidden-layer BP gradients to the numerical floor and makes $\Gamma$ uninterpretable; and second, \emph{low intrinsic credit-direction quality}, where random-feedback credit remains essentially unaligned with BP on the deep blocks even when the reference gradient is still meaningful. The headline result is that the field-standard reporting pair walks back none of the methods we audit, whereas a four-diagnostic protocol walks back the three degenerate methods and passes the two trustworthy controls. Intervention with a per-block scale-control penalty further reveals method-dependent severity within the audited fixed-feedback family: State Bridge then exceeds the architecture-matched frozen-blocks baseline by about $10$ percentage points, while Credit Bridge attains roughly $4\times$ DFA's deep BP cosine yet matches DFA's accuracy---a dissociation that single-step nudging and integrated training-loss decrease both confirm against the reverse cosine ordering, and that motivates reporting layerwise credit quality jointly with a depth-utilization baseline. Our contribution is an evaluation methodology paper for the NeurIPS 2026 Evaluations \& Datasets track: we provide the protocol, the calibration logic for its thresholds, a reference implementation, a five-method audit, and validation through temporal replay, cross-architecture checks, intervention-based disambiguation, and a documented catalog of pipeline pitfalls, in the spirit of critical evaluation analyses such as \citet{jordan2020evaluating,obray2022evaluation,paleka2026pitfalls}.
 \end{abstract}
 \section{Introduction}
