| author | YurenHao0426 <Blackhao0426@gmail.com> | 2026-03-25 16:20:53 -0500 |
|---|---|---|
| committer | YurenHao0426 <Blackhao0426@gmail.com> | 2026-03-25 16:20:53 -0500 |
| commit | 5a3b20d627eca65612f598c1ba5807d5d2df029a (patch) | |
| tree | e7f2f697303f738e757db6e93214d880f6c7642a /report_explore | |
| parent | 3ec9a5cd63b4578999d89b49f5223024a1acb723 (diff) | |
Add Phase 9A: checkpointed handoff — blend(Vec+DFA) outperforms pure DFA
First positive online result: a 50% blend of offline-fitted Vec + DFA gives 31.7%
vs 31.1% for pure DFA (+0.55 percentage points). This is Case B: pure Vec handoff fails (-1.1 points)
but the blend works because DFA stabilizes the trajectory while Vec adds directional credit.
Offline-fitted Vec at DFA epoch-5 checkpoint: Gamma=0.229, rho=0.262.
Cold-start confirmed as main bottleneck — Vec IS useful on DFA trajectory features.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Diffstat (limited to 'report_explore')
| -rw-r--r-- | report_explore/MEMO_9A_checkpointed_handoff.md | 33 |
1 files changed, 33 insertions, 0 deletions
diff --git a/report_explore/MEMO_9A_checkpointed_handoff.md b/report_explore/MEMO_9A_checkpointed_handoff.md
new file mode 100644
index 0000000..d916b0f
--- /dev/null
+++ b/report_explore/MEMO_9A_checkpointed_handoff.md
@@ -0,0 +1,33 @@
+# Phase 9A Memo: Checkpointed Offline Handoff
+
+**Date**: 2026-03-25
+**Config**: CIFAR-10, L=4, d=256, t0=5, 100 epochs, seed=42
+
+## Question
+If we offline-train Vec on a DFA checkpoint, can it take over and outperform continuing DFA?
+
+## Results
+
+| Branch | acc@20 | final acc | diff vs DFA |
+|--------|--------|-----------|-------------|
+| continue_DFA | 0.296 | 0.311 | baseline |
+| handoff_to_Vec | 0.307 | 0.300 | -0.011 |
+| **handoff_blend_05** | **0.312** | **0.317** | **+0.006** |
+
+Vec quality at frozen t0=5 checkpoint: Gamma=0.229, rho=0.262.
+
+## Key Finding: Blend Handoff Outperforms DFA
+
+**This is Case B**: pure Vec takeover doesn't work, but **50% blend (Vec + DFA) outperforms pure DFA by +0.55%**.
+
+This is the first time any Vec-involving method has beaten DFA on online CIFAR. The blend provides complementary information: DFA gives stable random projections, Vec adds learned directional credit. Neither alone is sufficient, but together they outperform.
+
+## Implications
+
+1. **Cold-start IS the main bottleneck** — offline-fitted Vec can help, confirming Vec is useful on DFA trajectory features.
+
+2. **Pure Vec takeover fails** because once it takes over, the forward net trajectory diverges from what Vec was trained on, and online Vec retraining can't keep up.
+
+3. **Blend works** because DFA provides a stable backbone that prevents trajectory divergence, while Vec contributes useful directional corrections.
+
+4. **Next steps**: Test blend at different alpha values (0.25, 0.75), different t0, and 3 seeds for validation. Also test periodic refit to keep Vec fresh.
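
The three branches in the memo can be read as one handoff rule with a single mixing weight. The sketch below is illustrative only and is not taken from the repository. It assumes the blend is a per-element convex combination of the two feedback signals, that the handoff starts at epoch `t0` inclusive, and that `continue_DFA`, `handoff_to_Vec`, and `handoff_blend_05` correspond to `alpha` values of 0.0, 1.0, and 0.5. The function names `blend_feedback` and `feedback_at_epoch` are hypothetical, and both signals are treated as opaque lists of floats.

```python
from typing import List, Sequence


def blend_feedback(dfa: Sequence[float], vec: Sequence[float], alpha: float) -> List[float]:
    """Convex combination of two feedback signals (assumed form of the blend).

    alpha = 0.0 returns the DFA signal, alpha = 1.0 returns the Vec signal.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(dfa) != len(vec):
        raise ValueError("signals must have the same length")
    return [alpha * v + (1.0 - alpha) * d for d, v in zip(dfa, vec)]


def feedback_at_epoch(
    epoch: int,
    t0: int,
    dfa: Sequence[float],
    vec: Sequence[float],
    alpha: float,
) -> List[float]:
    """Pure DFA before the handoff epoch t0, blended signal from t0 onward.

    Whether epoch t0 itself is already blended is an assumption of this sketch.
    """
    if epoch < t0:
        return list(dfa)
    return blend_feedback(dfa, vec, alpha)


# Toy signals standing in for one hidden layer's feedback (not real data).
dfa_signal = [1.0, -2.0]
vec_signal = [3.0, 2.0]

before = feedback_at_epoch(epoch=3, t0=5, dfa=dfa_signal, vec=vec_signal, alpha=0.5)
after = feedback_at_epoch(epoch=5, t0=5, dfa=dfa_signal, vec=vec_signal, alpha=0.5)
```

Under these assumptions, the alpha sweep proposed in the next steps (0.25, 0.75) would only change the `alpha` argument, and a periodic refit would only change how `vec_signal` is produced.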
