author: YurenHao0426 <Blackhao0426@gmail.com> 2026-04-08 04:50:23 -0500
committer: YurenHao0426 <Blackhao0426@gmail.com> 2026-04-08 04:50:23 -0500
commit: 1eb0c06b341b90fc5ebbe689154aab6c8b6830c0
tree: 8ae51270add6dd1ff2084582e2dbfe579311e1d1 /paper
parent: 07b10f06478514bbe9d9c77461a90f9d3254218b
Round 26: fill in §1 Introduction prose (3 paragraphs) via codex

Codex round 26 produced 3 substantive paragraphs for §1, replacing the 3 thin placeholder sentences. Each paragraph follows round 23's prescription:

P1: claim sentence + numerical evidence (DFA 0.306 < frozen 0.349; layer-0 +0.42 vs deep ~0; ||g_L|| ~ 5e-10 < eps clamp 1e-8) + closing 'measurement regime must be valid'
P2: 5-method audit shows the two modes; intervention dissociation (lambda=1e-4 alleviates Mode 1 not Mode 2; vanilla ep 1 has meaningful ||g|| but deep cos still ~0) + closing
P3: methodological contribution framing + cite Paleka, O'Bray, Jordan + closing roadmap

Compiles cleanly. PDF still has §2-§7 with topic sentences only (TODO next via per-section codex rounds).

Diffstat (limited to 'paper')
-rw-r--r-- paper/main.pdf bin 415093 -> 424493 bytes
-rw-r--r-- paper/main.tex 6
2 files changed, 3 insertions, 3 deletions
diff --git a/paper/main.pdf b/paper/main.pdf
index c737bb3..7ced00e 100644
--- a/paper/main.pdf
+++ b/paper/main.pdf
Binary files differ
diff --git a/paper/main.tex b/paper/main.tex
index b0a7787..21dabd9 100644
--- a/paper/main.tex
+++ b/paper/main.tex
@@ -30,11 +30,11 @@ Modern feedback-alignment evaluation on deep residual networks is still summariz
\section{Introduction}
\label{sec:intro}
-Modern feedback-alignment evaluation on residual networks still rests on a field-standard pair: headline accuracy and headline $\Gamma$ \citep{lillicrap2016random,nokland2016direct,akrout2019deep,launay2020direct}. % TODO: evidence sentence % TODO: closing sentence
+Feedback-alignment papers are usually judged by two numbers: task accuracy and an aggregate similarity between the method's local credit signal and the backpropagation gradient \citep{lillicrap2016random,nokland2016direct,akrout2019deep,launay2020direct}. On the audited 4-block $d{=}256$ ResMLP, however, Table~\ref{tab:main_audit} already shows that this pair is not a validity check: DFA reaches only $0.306 \pm 0.006$ test accuracy, below the architecture-matched frozen-blocks baseline of $0.349 \pm 0.002$, while still looking superficially comparable to other non-BP methods. Figure~\ref{fig:audit_hero} further shows that the apparent cosine evidence is concentrated at the shallowest block, with DFA at seed 42 reaching about $+0.42$ at layer 0 but approximately $-0.03$ to $0$ on layers 1--4, so the aggregate obscures where credit direction is and is not present. At the same time, the deepest BP reference norm is only about $5 \times 10^{-10}$ for DFA, State Bridge, and Credit Bridge, below the $10^{-8}$ clamp used by \texttt{F.cosine\_similarity}, whereas BP remains around $4 \times 10^{-4}$, so the reported deep cosine is partly computed against a numerical-floor reference rather than an informative gradient direction (Figure~\ref{fig:audit_hero}; Table~\ref{tab:main_audit}). Those numbers can be useful, but only if the measurement regime itself is valid.
-Both numbers can silently mislead on the same trained network. % TODO: evidence sentence % TODO: closing sentence
+Our audit shows that modern residual vision models can make these two quantities look informative while failing to answer the question they are taken to answer. Figure~\ref{fig:audit_hero} shows the first failure mode, which we call \emph{Mode 1: measurement degeneracy}, where residual-stream growth drives the deepest hidden state to about $\|h_L\| \sim 10^8$ under DFA/SB/CB while the corresponding BP reference collapses to $\|g_L\| \sim 5 \times 10^{-10}$, so the deep-layer cosine is measured against a clamp-dominated floor rather than a meaningful target direction. The same figure also shows the second failure mode, \emph{Mode 2: low intrinsic credit-direction quality}, because even after comparing against the stronger frozen-blocks baseline ($0.349 \pm 0.002$) and looking layer-by-layer, DFA's deep blocks remain essentially null while only layer 0 is visibly positive. To test whether this is only a measurement problem, the intervention results show a dissociation: with a residual penalty $\lambda \|f_l(h_l)\|^2$, the deepest state scale falls toward $4 \times 10^4$, the reference gradient rises toward $10^{-6}$, and deep cosine can improve to about $+0.16$, yet at $\lambda{=}10^{-4}$ Mode 1 is alleviated while deep cosine still stays near zero, and at vanilla DFA epoch 1 the reference is already meaningful at about $6 \times 10^{-7}$ but the deep cosine is still $-0.008 \pm 0.013$ across three seeds. The failure is not unitary: one mode breaks the measurement, and the other survives even when the measurement is still meaningful.
-This paper argues that standard FA evaluation conflates two distinct failure modes and that the right scientific object for this track is the evaluation protocol itself rather than a new benchmark or dataset \citep{jordan2020evaluating,obray2022evaluation,paleka2026pitfalls}. % TODO: evidence sentence % TODO: closing sentence
+Accordingly, this paper does not introduce a new FA variant or a new benchmark. Instead, Table~\ref{tab:main_audit} and Figure~\ref{fig:audit_hero} use a standard five-method CIFAR-10 audit to show that status-quo reporting would treat BP, EP, DFA, State Bridge, and Credit Bridge as the same kind of evidence-bearing object even though only BP and EP remain trustworthy under matched diagnostic checks. This makes the contribution methodological in the sense of \citet{jordan2020evaluating}, \citet{obray2022evaluation}, and \citet{paleka2026pitfalls}: the central question is not whether one more FA variant can post a headline number, but whether the reporting pipeline distinguishes meaningful credit-direction evidence from numerical-floor artifacts and from shallow-only learning. The protocol therefore starts from per-layer diagnostics and a frozen-blocks baseline before reading any aggregate cosine or final accuracy as evidence about deep credit assignment. We first show the walk-back on a standard audit, then isolate the two failure modes, and finally state the reporting protocol that future FA papers should satisfy.
\section{Audit: Standard Reporting Walks Back Nothing}
\label{sec:audit}
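
Reviewer note on the clamp argument in the new first paragraph of §1: the claim that a reference norm of about 5e-10 sits below the 1e-8 clamp can be illustrated with a small stand-alone sketch. This is not the paper's measurement code. It is a pure-Python stand-in that assumes each norm is clamped to `eps` before dividing, which mirrors the formula given in the PyTorch documentation for `F.cosine_similarity`. The function name `clamped_cosine` and the three-element vectors are hypothetical; only the norms 4e-4 and 5e-10 and the clamp 1e-8 come from the text.

```python
import math

def clamped_cosine(x, y, eps=1e-8):
    """Cosine similarity with each norm clamped to eps (assumed formula)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (max(norm_x, eps) * max(norm_y, eps))

# Hypothetical credit signal; both references below point in exactly this direction.
signal = [1.0, 2.0, 3.0]
unit = [v / math.sqrt(14.0) for v in signal]

healthy_ref = [4e-4 * v for v in unit]   # norm ~4e-4, the BP-like regime in the text
floor_ref = [5e-10 * v for v in unit]    # norm ~5e-10, below the 1e-8 clamp

cos_healthy = clamped_cosine(signal, healthy_ref)
cos_floor = clamped_cosine(signal, floor_ref)

print(cos_healthy)
print(cos_floor)
```

Under this assumed formula, a perfectly aligned reference whose norm is below the clamp yields a reported cosine scaled by norm/eps (here 5e-10 / 1e-8), so a near-zero deep-layer cosine in that regime cannot by itself separate misalignment from a floor-dominated reference.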