| author | YurenHao0426 <Blackhao0426@gmail.com> | 2026-03-26 08:37:39 -0500 |
|---|---|---|
| committer | YurenHao0426 <Blackhao0426@gmail.com> | 2026-03-26 08:37:39 -0500 |
| commit | ef4aed70130e2212b4ed1cb7212e2ea6c7c7adb2 (patch) | |
| tree | ad9f128753350ec4f430f77baa018189e4a9d4be /report_explore | |
| parent | 05ccd23154d1e9d090178b9d4d5f2c821711e784 (diff) | |
Add Phase 10A: no prefit threshold — even random Vec blend beats DFA by +1.3%
E_prefit=0 (random Vec) + blend(0.75): 32.4% vs DFA 31.1% (+1.3%)
E_prefit=15: 32.3% (+1.2%)
E_prefit=60: 32.5% (+1.4%)
Frozen Gamma/rho near zero at all prefit levels. The Phase 9A success was NOT
from Vec learning useful credit — it was from the blend mechanism itself providing
regularization/diversification over pure DFA.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Diffstat (limited to 'report_explore')
| -rw-r--r-- | report_explore/MEMO_10A_prefit_threshold.md | 30 |
1 files changed, 30 insertions, 0 deletions
diff --git a/report_explore/MEMO_10A_prefit_threshold.md b/report_explore/MEMO_10A_prefit_threshold.md
new file mode 100644
index 0000000..a2da91d
--- /dev/null
+++ b/report_explore/MEMO_10A_prefit_threshold.md
@@ -0,0 +1,30 @@
+# Phase 10A Memo: Prefit Threshold Curve
+
+**Date**: 2026-03-26
+**Config**: CIFAR-10, L=4, d=256, t0=5, blend_075, seed=42
+
+## Question
+How much offline prefit does Vec need before blend handoff helps?
+
+## Answer: NONE. Even random Vec with blend(0.75) outperforms DFA by +1.3%.
+
+| E_prefit | Gamma_frozen | rho_frozen | final acc | diff vs DFA |
+|----------|-------------|-----------|-----------|-------------|
+| 0 (random) | -0.005 | 0.014 | 0.324 | **+1.3%** |
+| 15 | 0.002 | 0.011 | 0.323 | **+1.2%** |
+| 60 | -0.001 | -0.009 | 0.325 | **+1.4%** |
+
+**This is Case C**: very weak (or zero) prefit suffices for blend to beat DFA.
+
+## Critical Reinterpretation
+
+The Phase 9A success was NOT due to Vec learning useful credit. The frozen Gamma/rho are near zero at all prefit levels. The benefit comes from **blending DFA with any additional signal** — even random noise through a VectorCreditNet provides a regularization/diversification effect that improves over pure DFA.
+
+This means:
+1. The "cold-start paradox" narrative was partially wrong — Vec doesn't need to be good to help
+2. The blend mechanism itself is the active ingredient, not Vec's credit quality
+3. Phase 9A's +1.5% was not evidence that "Vec credit is useful online" — it was evidence that "blended updates regularize better than pure DFA"
+
+## Implication
+
+The next question is: is this just a regularization artifact (any noise helps), or does Vec's structure matter? This should be tested by comparing blend with random Vec vs blend with random noise of same magnitude.
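
The control proposed under "Implication" (blend with random Vec vs blend with random noise of the same magnitude) could be set up roughly as below. This is a minimal sketch, not the repository's code. It assumes that blend(0.75) means a convex combination with weight 0.75 on the DFA update, which the memo does not state. The helper names (`blend`, `magnitude_matched_noise`, `l2_norm`) and the stand-in vectors are hypothetical.

```python
import math
import random


def l2_norm(v):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in v))


def blend(dfa_update, other_update, alpha=0.75):
    """Convex blend of two updates.

    ASSUMPTION: alpha weights the DFA update and (1 - alpha) the other
    signal; the memo only says blend(0.75).
    """
    return [alpha * d + (1.0 - alpha) * o
            for d, o in zip(dfa_update, other_update)]


def magnitude_matched_noise(reference, rng):
    """Gaussian noise rescaled to have the same L2 norm as `reference`."""
    noise = [rng.gauss(0.0, 1.0) for _ in reference]
    scale = l2_norm(reference) / l2_norm(noise)
    return [scale * n for n in noise]


rng = random.Random(42)

# Hypothetical stand-ins for one layer's updates (d=256 as in the memo config).
dfa_update = [rng.gauss(0.0, 1.0) for _ in range(256)]
vec_update = [rng.gauss(0.0, 0.1) for _ in range(256)]  # random-Vec output

# Control signal: unstructured noise with the same norm as the Vec output.
noise_update = magnitude_matched_noise(vec_update, rng)

blend_vec = blend(dfa_update, vec_update)      # arm A: DFA + random Vec
blend_noise = blend(dfa_update, noise_update)  # arm B: DFA + matched noise
```

Matching the norm keeps the two arms comparable in update magnitude, so any remaining accuracy gap would point to the structure of the Vec output rather than its scale.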
