<feed xmlns='http://www.w3.org/2005/Atom'>
<title>faeval.git/report_explore, branch master</title>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/'/>
<entry>
<title>Add Phase 10A.6: gain requires trainable depth-aware aux, not semantic credit</title>
<updated>2026-03-27T03:07:35+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-03-27T03:07:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=b4e3cbeae6cb4cf4a4b69b84a475afcd7d7e9dbe'/>
<id>b4e3cbeae6cb4cf4a4b69b84a475afcd7d7e9dbe</id>
<content type='text'>
9-branch dissection results:
- zero_target crashes (-9.1%): aux must output non-zero
- constant_input neutral (+0.0%): needs at least depth info
- time_only works (+1.0%): h_l not needed, just depth index
- shuffled/fresh_random work (+1.3-1.4%): no semantic content needed
- prefit60_trainable ≈ random_trainable: prefit adds nothing
- All frozen branches crash: trainability is essential

Mechanism: depth-aware trainable auxiliary perturbation that diversifies
block-local updates. Not semantic credit, not pure trainability.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
9-branch dissection results:
- zero_target crashes (-9.1%): aux must output non-zero
- constant_input neutral (+0.0%): needs at least depth info
- time_only works (+1.0%): h_l not needed, just depth index
- shuffled/fresh_random work (+1.3-1.4%): no semantic content needed
- prefit60_trainable ≈ random_trainable: prefit adds nothing
- All frozen branches crash: trainability is essential

Mechanism: depth-aware trainable auxiliary perturbation that diversifies
block-local updates. Not semantic credit, not pure trainability.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Add Phase 10A.5: blend gain is implicit regularization, not learned credit</title>
<updated>2026-03-26T21:27:53+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-03-26T21:27:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=610e1169e19378cccd2d9b92a588c24dca7f3df7'/>
<id>610e1169e19378cccd2d9b92a588c24dca7f3df7</id>
<content type='text'>
Dissection of 6 branches from same DFA checkpoint:
- blend_random_frozen: 12.6% (CATASTROPHIC — frozen noise destroys training)
- blend_random_trainable: 32.2% (+1.2% — trainable network helps)
- blend_shuffled_trainable: 32.5% (+1.4% — even wrong targets work!)
- blend_gaussian_noise: 30.8% (neutral)
- scaled_DFA_norm_match: 31.0% (neutral)

The gain comes from implicit regularization through a co-optimized auxiliary
network, NOT from learned credit quality. Phase 9A's +1.5% was an optimization
dynamics effect, not evidence of useful credit assignment.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Dissection of 6 branches from same DFA checkpoint:
- blend_random_frozen: 12.6% (CATASTROPHIC — frozen noise destroys training)
- blend_random_trainable: 32.2% (+1.2% — trainable network helps)
- blend_shuffled_trainable: 32.5% (+1.4% — even wrong targets work!)
- blend_gaussian_noise: 30.8% (neutral)
- scaled_DFA_norm_match: 31.0% (neutral)

The gain comes from implicit regularization through a co-optimized auxiliary
network, NOT from learned credit quality. Phase 9A's +1.5% was an optimization
dynamics effect, not evidence of useful credit assignment.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Add Phase 10A: no prefit threshold — even random Vec blend beats DFA by +1.3%</title>
<updated>2026-03-26T13:37:39+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-03-26T13:37:39+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=ef4aed70130e2212b4ed1cb7212e2ea6c7c7adb2'/>
<id>ef4aed70130e2212b4ed1cb7212e2ea6c7c7adb2</id>
<content type='text'>
E_prefit=0 (random Vec) + blend(0.75): 32.4% vs DFA 31.1% (+1.3%)
E_prefit=15: 32.3% (+1.2%)
E_prefit=60: 32.5% (+1.4%)

Frozen Gamma/rho near zero at all prefit levels. The Phase 9A success was NOT
from Vec learning useful credit — it was from the blend mechanism itself providing
regularization/diversification over pure DFA.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
E_prefit=0 (random Vec) + blend(0.75): 32.4% vs DFA 31.1% (+1.3%)
E_prefit=15: 32.3% (+1.2%)
E_prefit=60: 32.5% (+1.4%)

Frozen Gamma/rho near zero at all prefit levels. The Phase 9A success was NOT
from Vec learning useful credit — it was from the blend mechanism itself providing
regularization/diversification over pure DFA.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Add Phase 9A: checkpointed handoff — blend(Vec+DFA) outperforms pure DFA</title>
<updated>2026-03-25T21:20:53+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-03-25T21:20:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=5a3b20d627eca65612f598c1ba5807d5d2df029a'/>
<id>5a3b20d627eca65612f598c1ba5807d5d2df029a</id>
<content type='text'>
First positive online result: 50% blend of offline-fitted Vec + DFA gives 31.7%
vs 31.1% for pure DFA (+0.55%). This is Case B: pure Vec handoff fails (-1.1%)
but blend works because DFA stabilizes trajectory while Vec adds directional credit.

Offline-fitted Vec at DFA epoch-5 checkpoint: Gamma=0.229, rho=0.262.
Cold-start confirmed as main bottleneck — Vec IS useful on DFA trajectory features.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
First positive online result: 50% blend of offline-fitted Vec + DFA gives 31.7%
vs 31.1% for pure DFA (+0.55%). This is Case B: pure Vec handoff fails (-1.1%)
but blend works because DFA stabilizes trajectory while Vec adds directional credit.

Offline-fitted Vec at DFA epoch-5 checkpoint: Gamma=0.229, rho=0.262.
Cold-start confirmed as main bottleneck — Vec IS useful on DFA trajectory features.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Add Phase 8: schedule timing test — online co-learning is the remaining bottleneck</title>
<updated>2026-03-25T19:23:13+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-03-25T19:23:13+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=3ec9a5cd63b4578999d89b49f5223024a1acb723'/>
<id>3ec9a5cd63b4578999d89b49f5223024a1acb723</id>
<content type='text'>
Vec_only_from_0: 15.4% (cold-start failure, can't learn credit on random features)
DFA_only: 31.2% (remains best non-BP method)
DFA_then_Vec_T20: 12.9% (switching to Vec destroys DFA-built features)
Vec_T5_then_DFA: 26.6% (partial recovery but still worse than pure DFA)

Phase 7A's early-window finding doesn't transfer: it required offline-trained Vec
on frozen features. Online Vec estimator faces cold-start paradox — needs structured
features to learn credit, but structured features need good credit to form.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Vec_only_from_0: 15.4% (cold-start failure, can't learn credit on random features)
DFA_only: 31.2% (remains best non-BP method)
DFA_then_Vec_T20: 12.9% (switching to Vec destroys DFA-built features)
Vec_T5_then_DFA: 26.6% (partial recovery but still worse than pure DFA)

Phase 7A's early-window finding doesn't transfer: it required offline-trained Vec
on frozen features. Online Vec estimator faces cold-start paradox — needs structured
features to learn credit, but structured features need good credit to form.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Add Phase 7A: snapshot time sweep shows early snapshots have positive held-out transfer</title>
<updated>2026-03-25T15:23:19+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-03-25T15:23:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=ef5bd494087a46ee80d8bc17796074efdae81ff4'/>
<id>ef5bd494087a46ee80d8bc17796074efdae81ff4</id>
<content type='text'>
At epoch 5 (acc=49%), Vec_M4 5-step: dL_held=-0.005 (PUR=0.70)
  Oracle BP 5-step: dL_held=-0.009 (PUR=1.05)
  DFA 5-step: dL_held=+0.003 (always hurts held-out)

By epoch 20, generalization window closes. Held-out failure is late-snapshot artifact.
Better credit → lower update variance (Vec=0.8 vs DFA=40), not higher.

Key implication: DFA warmup delays credit bridge past its useful window.
Credit should be used from epoch 0, not after 20% warmup.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
At epoch 5 (acc=49%), Vec_M4 5-step: dL_held=-0.005 (PUR=0.70)
  Oracle BP 5-step: dL_held=-0.009 (PUR=1.05)
  DFA 5-step: dL_held=+0.003 (always hurts held-out)

By epoch 20, generalization window closes. Held-out failure is late-snapshot artifact.
Better credit → lower update variance (Vec=0.8 vs DFA=40), not higher.

Key implication: DFA warmup delays credit bridge past its useful window.
Credit should be used from epoch 0, not after 20% warmup.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Add Phase 6.5A: same-batch linesearch REVISES Phase 6A conclusion</title>
<updated>2026-03-25T13:22:04+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-03-25T13:22:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=7e01fbc0ce871857c1e1879ed0d3559e8bfae7c7'/>
<id>7e01fbc0ce871857c1e1879ed0d3559e8bfae7c7</id>
<content type='text'>
Phase 6A's "better credit → worse loss" was a protocol artifact caused by:
1. Credit normalization (inflated DFA, suppressed Vec magnitude ordering)
2. Held-out evaluation (measured generalization failure, not exploitability)
3. Gradient clamping

With strict same-batch evaluation:
- Oracle BP: dL_same = -0.406 (strongest descent)
- Vec_M4: dL_same = -0.135
- ScalarCB: dL_same = -0.025
- DFA: dL_same = -0.003
Same-batch loss decrease is MONOTONIC with credit quality.

But held-out loss INCREASES for all non-DFA methods (Case D: overfitting).
The bottleneck is batch-level generalization, not surrogate exploitability.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Phase 6A's "better credit → worse loss" was a protocol artifact caused by:
1. Credit normalization (inflated DFA, suppressed Vec magnitude ordering)
2. Held-out evaluation (measured generalization failure, not exploitability)
3. Gradient clamping

With strict same-batch evaluation:
- Oracle BP: dL_same = -0.406 (strongest descent)
- Vec_M4: dL_same = -0.135
- ScalarCB: dL_same = -0.025
- DFA: dL_same = -0.003
Same-batch loss decrease is MONOTONIC with credit quality.

But held-out loss INCREASES for all non-DFA methods (Case D: overfitting).
The bottleneck is batch-level generalization, not surrogate exploitability.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Add Phase 6: snapshot exploitability reveals local update rule is the bottleneck</title>
<updated>2026-03-25T01:07:03+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-03-25T01:07:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=825d973428450cb24d8cccc8c2604235ef974b7c'/>
<id>825d973428450cb24d8cccc8c2604235ef974b7c</id>
<content type='text'>
Phase 6A: Better credit is ANTI-CORRELATED with loss decrease on fixed snapshot.
  DFA (Gamma=0.01) → dL=-0.0001 (only method that decreases loss)
  Vec_M4 (Gamma=0.38) → dL=+0.057 (increases loss most)
  Oracle BP (Gamma=1.0) → dL=+0.011 (still increases loss)

Phase 6C: Target-shift rule reduces damage but cannot make non-DFA credits productive.
  The inner-product surrogate &lt;F_l(h), a_l&gt; is fundamentally mismatched with directional credit.

Conclusion: Case B — the primary bottleneck is the local update paradigm itself,
not the credit estimator quality or tracking/co-adaptation.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Phase 6A: Better credit is ANTI-CORRELATED with loss decrease on fixed snapshot.
  DFA (Gamma=0.01) → dL=-0.0001 (only method that decreases loss)
  Vec_M4 (Gamma=0.38) → dL=+0.057 (increases loss most)
  Oracle BP (Gamma=1.0) → dL=+0.011 (still increases loss)

Phase 6C: Target-shift rule reduces damage but cannot make non-DFA credits productive.
  The inner-product surrogate &lt;F_l(h), a_l&gt; is fundamentally mismatched with directional credit.

Conclusion: Case B — the primary bottleneck is the local update paradigm itself,
not the credit estimator quality or tracking/co-adaptation.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Add Phase 5: vector field audit, frozen CIFAR transfer, online pilot</title>
<updated>2026-03-24T23:03:55+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-03-24T23:03:55+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=5550e2cac45758e579810ae36bf716a0b819cebc'/>
<id>5550e2cac45758e579810ae36bf716a0b819cebc</id>
<content type='text'>
Phase 5A: Audit passes — shuffle control collapses, gains are real
Phase 5B: Transfer SUCCESS — vec_M4 beats scalar CB by +0.25 Gamma, +0.31 rho on frozen CIFAR
Phase 5C: Online FAILURE — vec does worse than scalar CB online despite better frozen credit
Core finding: bottleneck is in local surrogate / co-adaptation, not estimator quality

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Phase 5A: Audit passes — shuffle control collapses, gains are real
Phase 5B: Transfer SUCCESS — vec_M4 beats scalar CB by +0.25 Gamma, +0.31 rho on frozen CIFAR
Phase 5C: Online FAILURE — vec does worse than scalar CB online despite better frozen credit
Core finding: bottleneck is in local surrogate / co-adaptation, not estimator quality

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>Add Phase 4 diagnostic dissection: frozen credit recovery, online shallow scan, vector field pilot</title>
<updated>2026-03-24T17:47:19+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-03-24T17:47:19+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=3d17cbad98f320905c52509c7f18691eab8bf2a0'/>
<id>3d17cbad98f320905c52509c7f18691eab8bf2a0</id>
<content type='text'>
Key findings:
- Frozen CIFAR: estimators CAN recover credit (SB best, CB 20x &gt; DFA)
- Online shallow: cb_eT wr=0.2 tgw=1.0 achieves S1&gt;0, S2 marginal
- Vector credit field: 0.91-0.96 Gamma/rho on synthetic (vs 0.34 scalar CB)
- Direct vector field avoids scalar V curvature problem entirely

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Key findings:
- Frozen CIFAR: estimators CAN recover credit (SB best, CB 20x &gt; DFA)
- Online shallow: cb_eT wr=0.2 tgw=1.0 achieves S1&gt;0, S2 marginal
- Vector credit field: 0.91-0.96 Gamma/rho on synthetic (vs 0.34 scalar CB)
- Direct vector field avoids scalar V curvature problem entirely

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
