| author | YurenHao0426 <Blackhao0426@gmail.com> | 2026-03-25 10:23:19 -0500 |
|---|---|---|
| committer | YurenHao0426 <Blackhao0426@gmail.com> | 2026-03-25 10:23:19 -0500 |
| commit | ef5bd494087a46ee80d8bc17796074efdae81ff4 (patch) | |
| tree | 3104d9b8c0a07a38961aee54057125e45941db88 /NOTE.md | |
| parent | 7e01fbc0ce871857c1e1879ed0d3559e8bfae7c7 (diff) | |
Add Phase 7A: snapshot time sweep shows early snapshots have positive held-out transfer

At epoch 5 (acc=49%), Vec_M4 5-step: dL_held=-0.005 (PUR=0.70)
Oracle BP 5-step: dL_held=-0.009 (PUR=1.05)
DFA 5-step: dL_held=+0.003 (always hurts held-out)

By epoch 20, generalization window closes. Held-out failure is late-snapshot artifact.
Better credit → lower update variance (Vec=0.8 vs DFA=40), not higher.

Key implication: DFA warmup delays credit bridge past its useful window.
Credit should be used from epoch 0, not after 20% warmup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Diffstat (limited to 'NOTE.md')
| -rw-r--r-- | NOTE.md | 33 |
1 files changed, 32 insertions, 1 deletions
@@ -5,7 +5,7 @@
 - **pilot**: Controlled iteration (commits 0b9ebb2, 7baf7ae)
 - **frozen**: Code at commit 0b9ebb2 for all reported results
 
-## Status: PHASE 6.5 PROTOCOL AUDIT — PHASE 6A CONCLUSION REVISED
+## Status: PHASE 7A SNAPSHOT TIME SWEEP — EARLY SNAPSHOTS SHOW POSITIVE TRANSFER
 
 ---
 
@@ -418,3 +418,34 @@ gradient noise) could make better credit usable.
 ### Experiment IDs (Phase 6.5)
 - `exploit_linesearch/`: Phase 6.5A smoke test (Oracle + Vec, last1, raw)
 - `exploit_linesearch_full/`: Phase 6.5A full sweep (all methods, ranges, norm modes)
+
+---
+
+## Phase 7A: Snapshot Time Sweep
+
+**Setup**: BP snapshots at epoch {5, 20, 100} (acc 0.49/0.57/0.62).
+Train Vec_M4 on each frozen snapshot. Test 1-step and 5-step with raw credit, last-block-only.
+
+**KEY FINDING: Held-out failure is primarily a LATE-SNAPSHOT artifact.**
+
+5-step DeltaLoss held-out:
+
+| Epoch | DFA dL_held | Vec dL_held | Oracle dL_held | Vec PUR |
+|-------|-------------|-------------|----------------|---------|
+| **5** | +0.003 | **-0.005** | **-0.009** | **0.70** |
+| 20 | +0.001 | +0.002 | +0.000 | -3.87 |
+| 100 | +0.000 | +0.001 | -0.001 | -1.01 |
+
+At epoch 5: Vec decreases held-out loss (PUR=0.70), Oracle too (PUR=1.05).
+DFA INCREASES held-out at all snapshots.
+
+By epoch 20 the generalization window closes.
+
+**Better credit produces MORE consistent updates** (Vec variance=0.8 vs DFA variance=40).
+The problem is not batch-specificity but snapshot timing: credit is useful early, useless late.
+
+**Implication**: The DFA warmup (which delays credit bridge to epoch ~20) is counterproductive.
+Credit bridge should be used from epoch 0.
+
+### Experiment IDs (Phase 7)
+- `snapshot_time/`: Phase 7A snapshot time sweep with BP checkpoints
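
For readers unfamiliar with the `dL_held` column, the following is a minimal sketch of how a k-step held-out DeltaLoss could be measured. It is not the repository's code. The toy model `y_hat = w * x`, the names `mse`, `grad_step` and `held_out_delta_loss`, and the learning rate are all assumptions made for illustration. The one convention taken from the note is the sign: a negative value means the updates lowered the held-out loss.

```python
from typing import Callable, Sequence, Tuple

Batch = Sequence[Tuple[float, float]]  # (x, y) pairs; hypothetical toy data format


def mse(w: float, batch: Batch) -> float:
    """Mean squared error of the toy model y_hat = w * x."""
    return sum((w * x - y) ** 2 for x, y in batch) / len(batch)


def grad_step(w: float, batch: Batch, lr: float = 0.05) -> float:
    """One exact-gradient update on the batch (stand-in for any credit signal)."""
    grad = sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)
    return w - lr * grad


def held_out_delta_loss(
    w0: float,
    step_fn: Callable[[float, Batch], float],
    train_batch: Batch,
    held_batch: Batch,
    n_steps: int,
) -> float:
    """Held-out loss after n_steps updates minus held-out loss at the snapshot w0.

    The updates only ever see train_batch; held_batch is used for evaluation.
    """
    w = w0
    for _ in range(n_steps):
        w = step_fn(w, train_batch)
    return mse(w, held_batch) - mse(w0, held_batch)


# Assumed toy data: both batches come from roughly y = 2x.
train = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9)]
held = [(1.5, 3.0), (2.5, 5.1)]

dl_held = held_out_delta_loss(0.0, grad_step, train, held, n_steps=5)
print(f"5-step dL_held = {dl_held:+.3f}")
```

Passing a different `step_fn` (for example one driven by an approximate credit signal) with the same snapshot and batches would give directly comparable `dL_held` values, which is the kind of comparison the Phase 7A table reports.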
