Diffstat (limited to 'report_explore')
| -rw-r--r-- | report_explore/MEMO_9A_checkpointed_handoff.md | 33 |
1 files changed, 33 insertions, 0 deletions
diff --git a/report_explore/MEMO_9A_checkpointed_handoff.md b/report_explore/MEMO_9A_checkpointed_handoff.md
new file mode 100644
index 0000000..d916b0f
--- /dev/null
+++ b/report_explore/MEMO_9A_checkpointed_handoff.md
@@ -0,0 +1,33 @@
+# Phase 9A Memo: Checkpointed Offline Handoff
+
+**Date**: 2026-03-25
+**Config**: CIFAR-10, L=4, d=256, t0=5, 100 epochs, seed=42
+
+## Question
+If we train Vec offline on a DFA checkpoint, can it take over and outperform continued DFA training?
+
+## Results
+
+| Branch | acc@20 | final acc | final acc diff vs DFA |
+|--------|--------|-----------|-----------------------|
+| continue_DFA | 0.296 | 0.311 | baseline |
+| handoff_to_Vec | 0.307 | 0.300 | -0.011 |
+| **handoff_blend_05** | **0.312** | **0.317** | **+0.006** |
+
+Vec quality at frozen t0=5 checkpoint: Gamma=0.229, rho=0.262.
+
+## Key Finding: Blend Handoff Outperforms DFA
+
+**This is Case B**: pure Vec takeover ends below continued DFA (0.300 vs 0.311 final accuracy), but the **50% blend (Vec + DFA) beats pure DFA by about 0.6 percentage points (0.317 vs 0.311)**.
+
+This is the first time any Vec-involving method has beaten DFA on online CIFAR, in a single-seed run (seed=42). The blend provides complementary information: DFA gives stable random projections, and Vec adds learned directional credit. Vec alone falls short of DFA, but the blend outperforms both pure branches.
+
+## Implications
+
+1. **Cold-start IS the main bottleneck**: an offline-fitted Vec can help, which supports the view that Vec is useful on DFA trajectory features.
+
+2. **Pure Vec takeover fails** because once Vec takes over, the forward net's trajectory diverges from the one Vec was trained on, and online Vec retraining can't keep up.
+
+3. **Blend works** because DFA provides a stable backbone that prevents trajectory divergence, while Vec contributes useful directional corrections.
+
+4. **Next steps**: Test the blend at other alpha values (0.25, 0.75), at different t0, and on 3 seeds for validation. Also test periodic refits to keep Vec fresh.
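The memo does not spell out how the `handoff_blend_05` branch combines the two feedback signals. The sketch below assumes the simplest reading, a convex combination `alpha * Vec + (1 - alpha) * DFA` applied to the hidden-layer credit signal. The names `dfa_signal` and `blend_signals`, the toy dimensions, and the random stand-in for the Vec output are all hypothetical and are not taken from the repository.

```python
import random


def dfa_signal(feedback_matrix, output_error):
    """DFA-style hidden-layer signal: a fixed random matrix times the output error."""
    return [sum(b * e for b, e in zip(row, output_error)) for row in feedback_matrix]


def blend_signals(vec_signal, dfa_sig, alpha=0.5):
    """Assumed blend rule: alpha * Vec + (1 - alpha) * DFA, element by element."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(vec_signal) != len(dfa_sig):
        raise ValueError("signals must have the same length")
    return [alpha * v + (1.0 - alpha) * d for v, d in zip(vec_signal, dfa_sig)]


rng = random.Random(42)
hidden_dim, num_classes = 4, 3  # toy sizes, not the memo's d=256

# Fixed random feedback matrix, as in direct feedback alignment.
B = [[rng.gauss(0.0, 1.0) for _ in range(num_classes)] for _ in range(hidden_dim)]
err = [0.2, -0.5, 0.3]  # toy output error
dfa = dfa_signal(B, err)

# Stand-in for the learned Vec output; the real Vec model is not shown in the memo.
vec = [rng.gauss(0.0, 1.0) for _ in range(hidden_dim)]

# alpha = 0 is pure DFA, alpha = 1 is pure Vec; 0.25 and 0.75 are the memo's proposed next tests.
for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
    blended = blend_signals(vec, dfa, alpha)
    print(alpha, [round(x, 3) for x in blended])
```

Under this assumed rule, `alpha=0.0` corresponds to `continue_DFA` and `alpha=1.0` to `handoff_to_Vec`, so the alpha sweep proposed in the next steps would interpolate between the two pure branches in the results table.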
