| author | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-08 20:51:04 -0500 |
|---|---|---|
| committer | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-08 20:51:04 -0500 |
| commit | 5995929511404ba3e0b8b4f1bfef69dbf291c7a9 (patch) | |
| tree | 78ed5677f26799533511ff69eb0262dc52ca579d /paper | |
| parent | 29c2396ee6480e94d4543cb603587a4cc7b640cd (diff) | |
paper v2.37: §7 add 'Open questions and concrete next experiments'

§7 currently has only the Scope/limits/recommendation paragraph.
Adding a second paragraph that explicitly flags the Mode 2 → Mode 1
hypothesis status as an open question and proposes two concrete
falsification tests, plus a wider-scope replication path.

The new paragraph:

1. Acknowledges the Mode 2 → Mode 1 causal reading is a hypothesis,
   not a theorem, and that the parallel-failure reading is also
   formally consistent with the data.
2. Proposes a *direct* test: measure per-block forward-state-change
   content along the training trajectory and check whether per-block
   loss decrease tracks per-block credit usefulness more tightly than
   per-block cosine.
3. Proposes a *falsification* test for the downstream-of-Mode-2 reading:
   substitute the random B_l with a high-quality credit signal (sparse,
   learned, or weight-transport-restored à la Akrout 2019) at fixed
   ‖f_l‖ and check whether Mode 1 activation growth still appears. If
   yes, Mode 1 is NOT downstream of Mode 2.
4. Notes the wider-scope replication path: CIFAR-100, Tiny-ImageNet,
   architectures outside ResMLP/ViT-Mini, with a pointer to Appendix A
   as the structured configuration entry point.

This explicitly answers the reviewer question "what would falsify
your hypothesis?" without overclaiming. It positions the paper as
honest about open questions and points at concrete next steps.

Page count: 20 (unchanged); the paragraph fit within the existing
slack.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
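
The comparison in item 2 of the commit message (does per-block loss decrease track credit usefulness more tightly than per-block cosine?) could be scored roughly as below. This is a minimal sketch, not the paper's protocol code: the function names, the use of absolute Pearson correlation as the "tightness" measure, and the numbers are all assumptions made for illustration.

```python
import numpy as np


def tracking_strength(loss_decrease, predictor):
    """Absolute Pearson correlation between per-block loss decrease
    and one per-block predictor (an assumed tightness measure)."""
    x = np.asarray(loss_decrease, dtype=float)
    y = np.asarray(predictor, dtype=float)
    return float(abs(np.corrcoef(x, y)[0, 1]))


def compare_predictors(loss_decrease, credit_usefulness, cosine):
    """Report which predictor tracks per-block loss decrease more tightly."""
    r_credit = tracking_strength(loss_decrease, credit_usefulness)
    r_cosine = tracking_strength(loss_decrease, cosine)
    tighter = "credit_usefulness" if r_credit > r_cosine else "cosine"
    return {"credit_usefulness": r_credit, "cosine": r_cosine, "tighter": tighter}


# Synthetic placeholder values, one entry per block (not measured data).
loss_decrease = [0.10, 0.20, 0.30, 0.40]
credit_usefulness = [1.0, 2.0, 3.0, 4.0]  # e.g. nudging-test loss change
cosine = [0.5, 0.1, 0.4, 0.2]             # angular agreement with the BP gradient

result = compare_predictors(loss_decrease, credit_usefulness, cosine)
print(result)
```

In a real run the three lists would come from per-block measurements along the training trajectory; a rank correlation may be a better choice if the relationship is monotone but not linear.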
Diffstat (limited to 'paper')
| -rw-r--r-- | paper/main.pdf | bin | 535423 -> 537138 bytes | |||
| -rw-r--r-- | paper/main.tex | 2 |
2 files changed, 2 insertions, 0 deletions
diff --git a/paper/main.pdf b/paper/main.pdf
index b65701f..30f1465 100644
Binary files differ
diff --git a/paper/main.tex b/paper/main.tex
index 4650898..a63d0b5 100644
--- a/paper/main.tex
+++ b/paper/main.tex
@@ -197,6 +197,8 @@ Diag. & Measurement & Default threshold & Role \\
 
 \paragraph{Scope, limits, and reporting recommendation.} \looseness=-2 Our claim is about evidence, not impossibility: we show that current FA evaluation practice can misread what happened, not that FA cannot work in deep networks. DFA, SB, and CB all pass status-quo reporting (Table~\ref{tab:main_audit}) but fail the protocol's deep checks, and the Figure~\ref{fig:penalty_rescue} penalty partially rescues credit signal rather than validating headlines. Our strongest claim is scoped to $d{=}256/512$ pre-LayerNorm ResMLPs and ViT-Mini, where both Mode~1 diagnostics fire; the no-terminal-LN ResMLP ablation establishes terminal LayerNorm as causally necessary for diagnostic~(b) on residual ResMLP and (with the BatchNorm CNN) shows that activation growth can persist without gradient-floor collapse; the dataset is CIFAR-10; and the BP-plus-penalty comparison is a lower bound, not a full decomposition. In the evaluation-methodology line of \citet{jordan2020evaluating,obray2022evaluation,paleka2026pitfalls}, FA papers should report BP-reference validity, layerwise credit quality, and a frozen-blocks depth-utilization baseline as separate axes, not a single headline.
 
+\paragraph{Open questions and concrete next experiments.} The mechanism story in Section~\ref{sec:mode2} treats Mode~1 as a plausible downstream symptom of Mode~2 rather than a parallel, independently destructive failure, but the audit data is also formally consistent with a fully parallel reading. A direct test would measure per-block forward-state-change content along the training trajectory and check whether per-block decrease in test loss tracks per-block credit usefulness (e.g.\ nudging-test loss change) more tightly than it tracks per-block angular agreement with the BP gradient; a complementary test would substitute the random feedback $B_l$ with a high-quality credit signal (sparse, learned to predict the BP gradient, or weight-transport-restored \`a la \citet{akrout2019deep}) at fixed $\|f_l\|$ and check whether activation growth still appears, which would falsify the Mode~2~$\to$~Mode~1 reading by exhibiting Mode~1 in the absence of Mode~2. Beyond the mechanism question, a wider-scope replication would extend the same audit to additional datasets (CIFAR-100, Tiny-ImageNet) and architectures outside the residual ResMLP / ViT-Mini family, which would calibrate how broadly the protocol's binary detectors generalize past the audited regime; the protocol code in Appendix~\ref{app:reference_impl} is structured to make these extensions a configuration change rather than a new experimental design.
+
 \begin{thebibliography}{10}
 
 \bibitem[Paleka et~al.(2026)Paleka, Goel, Geiping, and Tramèr]{paleka2026pitfalls}
