| field | value | date |
|---|---|---|
| author | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-08 10:22:20 -0500 |
| committer | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-08 10:22:20 -0500 |
| commit | 15845f0226fe5e1f64ec2ab6bd0253d59ae813ce | |
| tree | f4a95d5a1a9f51889a17c6d274a629b32b97ba2d /experiments/periodic_refit.py | |
| parent | 921d3dec7daa67f16194d7eca7712c4903ce6f1d | |
§3 fix: correctly distinguish DFA/SB/CB local credit vectors
Previously, §3 ¶1 wrote the local loss as -<f_l, B_l^T e_T> as though it
applied to all three of DFA, SB, and CB. That form holds only for DFA. SB
and CB derive their credit vectors from learned bridge networks:
- DFA: a_l = B_l^T e_T (fixed random projection)
- State Bridge (SB): a_l = gradient of CE(head(LN(G_psi(h_l, t_l, s))), y),
  where G_psi is a learned state predictor of h_L
- Credit Bridge (CB): a_l = gradient of learned value net V(h_l, t_l, s)
The fix writes the shared local loss as -<f_l, a_l> and defines a_l
inline for each method. This is also the first definition of SB and CB
in the paper; previously they were named in Table 1 without being
defined.
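For illustration only, here is a minimal pure-Python sketch of the shared form -<f_l, a_l> with two ways of producing a_l. It is not taken from the paper or the repository. The helper names (`dot`, `local_loss`, `dfa_credit`, `gradient_credit`, `toy_value`) are hypothetical, the gradient is assumed to be taken with respect to h_l (the message does not say), the sign convention of a_l is assumed, and the toy value function ignores t_l and s and stands in for a learned V.

```python
import random

def dot(u, v):
    """Inner product <u, v> of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def local_loss(f_l, a_l):
    """Shared local loss -<f_l, a_l>; only the credit vector a_l differs by method."""
    return -dot(f_l, a_l)

def dfa_credit(B_l, e_T):
    """DFA: a_l = B_l^T e_T, with B_l stored as len(e_T) rows of width d_l."""
    width = len(B_l[0])
    return [sum(B_l[k][j] * e_T[k] for k in range(len(e_T))) for j in range(width)]

def gradient_credit(scalar_fn, h_l, eps=1e-6):
    """Central-difference gradient of scalar_fn at h_l (a stand-in for autodiff).

    Assumption: the credit vector is the gradient with respect to h_l.
    """
    grad = []
    for j in range(len(h_l)):
        plus = list(h_l)
        minus = list(h_l)
        plus[j] += eps
        minus[j] -= eps
        grad.append((scalar_fn(plus) - scalar_fn(minus)) / (2 * eps))
    return grad

rng = random.Random(0)
e_T = [0.5, -0.5]                                   # toy output error
B_l = [[rng.gauss(0.0, 1.0) for _ in range(3)] for _ in e_T]  # fixed random matrix
f_l = [1.0, 0.0, -1.0]                              # toy layer output
h_l = [1.0, 2.0, 3.0]                               # toy hidden state

# DFA-style credit: fixed random projection of the output error.
a_dfa = dfa_credit(B_l, e_T)
loss_dfa = local_loss(f_l, a_dfa)

def toy_value(h):
    """Hypothetical stand-in for a learned value net; its gradient is h itself."""
    return sum(0.5 * x * x for x in h)

# Credit-Bridge-style credit: gradient of a scalar value function at h_l.
a_cb = gradient_credit(toy_value, h_l)
loss_cb = local_loss(f_l, a_cb)
```

A State-Bridge-style credit would reuse `gradient_credit` with a scalar function that composes a state predictor, a normalization, a head, and a cross-entropy against y; those components are omitted here because the message does not specify them.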
Main content still ends at p9 (just slightly before the bottom margin
now); references span p9-p10 but are not counted against the 9-page
content budget. Total 17 pages.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Diffstat (limited to 'experiments/periodic_refit.py')
0 files changed, 0 insertions, 0 deletions
