| author | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-08 15:13:22 -0500 |
|---|---|---|
| committer | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-08 15:13:22 -0500 |
| commit | da988679e29d99e4bc6b2788b2bd873d9cd5cde0 (patch) | |
| tree | 5724b21efa111e6075247c2b154b67dde117a3ac /results/confirmatory/clean_sparsity/synth_bp_s456_a0.5_L8.json | |
| parent | 0abfa8e96afffd75cdb6a985603dceb55a284427 (diff) | |
SB/CB probe reframe + compression + Figure 5 to appendix (user-approved)
User pushed back on SB/CB being treated as 'audited FA methods' because
they're our own constructions. Reframe them as diagnostic probes built on
two prior-literature assumptions (state=credit and credit=performance).
§1 intro: add 1 sentence clarifying BP/EP/DFA are established baselines and
SB/CB are probes constructed in this paper.
§2 ¶2 new opening paragraph (before 'By the field's usual criteria'):
- SB/CB are probes, not prior FA variants
- Each directly learns a target from a prior-literature view
- SB, target-propagation view (Bengio 2014, Lee 2015): auxiliary G_ψ(h_l,t_l,s)
  predicts h_L via MSE; a_l^SB = ∇_{h_l} CE(W_out LN(G_ψ(h_l,t_l,s)), y)
- CB, synthetic-gradient view (Jaderberg 2017): auxiliary V_φ(h_l,t_l,s)
  trained via bridge residual; a_l^CB = ∇_{h_l} V_φ(h_l,t_l,s)
- Both auxiliaries trained on detached hidden states
- Role: populate different points in the (angular alignment, functional
  usefulness) plane, making the §4 cos-vs-acc dissociation visible
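A minimal sketch of how one point in that plane could be computed. It assumes "angular alignment" means the cosine between a probe's layer signal a_l and the BP gradient at the same layer, and that "functional usefulness" is an accuracy the caller measured separately. Neither definition is spelled out in this commit, and the names cosine_alignment and plane_point are hypothetical.

```python
import math


def cosine_alignment(probe_signal, bp_gradient):
    """Cosine of the angle between two flat vectors of equal length.

    Assumption: this stands in for the 'angular alignment' axis.
    Returns 0.0 when either vector is all zeros (angle undefined).
    """
    if len(probe_signal) != len(bp_gradient):
        raise ValueError("vectors must have the same length")
    dot = sum(p * g for p, g in zip(probe_signal, bp_gradient))
    norm_p = math.sqrt(sum(p * p for p in probe_signal))
    norm_g = math.sqrt(sum(g * g for g in bp_gradient))
    if norm_p == 0.0 or norm_g == 0.0:
        return 0.0
    return dot / (norm_p * norm_g)


def plane_point(probe_signal, bp_gradient, accuracy):
    """Pair the alignment with an accuracy supplied by the caller.

    The accuracy is not computed here; it is whatever the training
    run reported for the method that produced probe_signal.
    """
    return (cosine_alignment(probe_signal, bp_gradient), accuracy)
```

Keeping the two coordinates separate is the point of the sketch: a signal can score high on one axis and low on the other.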
Bibliography: added Bengio 2014 (arXiv 1407.7906), Lee et al. 2015 (ECML
PKDD), Jaderberg et al. 2017 (ICML) — all verified via WebSearch.
Page budget: the ~180-word §2 addition pushed §7 onto p10. Recovered
space by:
  (a) compressing §2 ¶1 opening
  (b) compressing §3 ¶2 falsification chain (tighter number formatting)
  (c) compressing §6 ¶3 asymmetry paragraph
  (d) merging §7 into a single paragraph (was 3)
  (e) moving Figure 5 (decision_utility) from §6 main text to a floated
      appendix figure in Appendix D (the 'all seven validations' appendix,
      which is conceptually related). The decision-utility ablation's
      headline ('accuracy+Γ walks back 0/5, full protocol walks back 3/5')
      is already in §6 prose, so the figure functions as supporting backup.
Result: main content is strictly 9 pages (§1-§7 on p1-p9); references and
appendices start on p10. Total: 18 pages, 0 overfull hboxes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Diffstat (limited to 'results/confirmatory/clean_sparsity/synth_bp_s456_a0.5_L8.json')
0 files changed, 0 insertions, 0 deletions
