author YurenHao0426 <Blackhao0426@gmail.com> 2026-03-25 10:23:19 -0500
committer YurenHao0426 <Blackhao0426@gmail.com> 2026-03-25 10:23:19 -0500
commit ef5bd494087a46ee80d8bc17796074efdae81ff4
tree 3104d9b8c0a07a38961aee54057125e45941db88 /report_explore
parent 7e01fbc0ce871857c1e1879ed0d3559e8bfae7c7
Add Phase 7A: snapshot time sweep shows early snapshots have positive held-out transfer

At epoch 5 (acc=49%):
  Vec_M4 5-step: dL_held=-0.005 (PUR=0.70)
  Oracle BP 5-step: dL_held=-0.009 (PUR=1.05)
  DFA 5-step: dL_held=+0.003 (always hurts held-out)

By epoch 20, generalization window closes.
Held-out failure is late-snapshot artifact.
Better credit → lower update variance (Vec=0.8 vs DFA=40), not higher.

Key implication: DFA warmup delays credit bridge past its useful window.
Credit should be used from epoch 0, not after 20% warmup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Diffstat (limited to 'report_explore')
-rw-r--r-- report_explore/MEMO_7A_snapshot_time_sweep.md | 37
1 files changed, 37 insertions, 0 deletions
diff --git a/report_explore/MEMO_7A_snapshot_time_sweep.md b/report_explore/MEMO_7A_snapshot_time_sweep.md
new file mode 100644
index 0000000..31e4cb2
--- /dev/null
+++ b/report_explore/MEMO_7A_snapshot_time_sweep.md
@@ -0,0 +1,37 @@
+# Phase 7A Memo: Snapshot Time Sweep
+
+**Date**: 2026-03-25
+
+## Question
+Is "same-batch descent + held-out ascent" a late-snapshot artifact, or does it persist across training?
+
+## Answer: Primarily a late-snapshot artifact. Early snapshots show positive held-out transfer.
+
+### 5-step DeltaLoss results (raw credit, last-block-only):
+
+| Epoch | Acc | DFA dL_held | Vec dL_held | Oracle dL_held | Vec PUR_5 |
+|-------|-----|-------------|-------------|----------------|-----------|
+| **5** | 0.49 | +0.003 | **-0.005** | **-0.009** | **0.70** |
+| 20 | 0.57 | +0.001 | +0.002 | +0.000 | -3.87 |
+| 100 | 0.62 | +0.000 | +0.001 | -0.001 | -1.01 |
+
+### Key findings:
+
+1. **At epoch 5, Vec and Oracle both decrease held-out loss**, while DFA increases it. Vec PUR=0.70 means 70% of same-batch improvement transfers to held-out. Oracle PUR=1.05 (>100% transfer).
+
+2. **By epoch 20, the generalization window closes.** All methods show near-zero or positive held-out change.
+
+3. **Better credit → lower update variance.** Vec/Oracle update variance is at least 50x lower than DFA (0.4-0.8 vs 40-60). Better credit produces MORE consistent cross-batch updates, not less.
+
+4. **DFA never improves held-out at any snapshot.** Its updates are random enough that they sometimes decrease same-batch loss, but they never systematically improve held-out loss.
+
+## Implications
+
+The "better credit is useless" narrative from Phase 6A/6.5A was wrong on two counts:
+1. Same-batch exploitability works (Phase 6.5A)
+2. Early-snapshot held-out transfer works too (this experiment)
+
+Online training fails because, by the time the warmup phase ends and the credit bridge takes over (epoch ~20), the network is already past the "generalization window" in which local credit updates are useful. The fix should be: **use credit bridge from the start (no DFA warmup), or switch earlier.**
+
+## Next step recommendation
+Phase 7B (multi-batch averaging) may not be needed given that the held-out failure is a snapshot-timing issue, not a batch-variance issue. Instead, the priority should be testing online training WITH vector credit from epoch 0 (no warmup or very short warmup).
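
The PUR values reported in the memo can be read as a transfer ratio. The sketch below assumes PUR is the held-out loss change divided by the same-batch loss change. The memo does not spell out the formula, but this assumption matches its reading of PUR=0.70 as "70% of same-batch improvement transfers to held-out". The function name `transfer_ratio` and the sample numbers are hypothetical and are not measured values from the experiment.

```python
def transfer_ratio(dl_same: float, dl_held: float) -> float:
    """Assumed PUR: held-out loss change divided by same-batch loss change.

    Negative deltas mean the loss went down. A positive ratio means both
    losses moved in the same direction; a negative ratio means same-batch
    descent came with held-out ascent (or the reverse).
    """
    if dl_same == 0.0:
        raise ValueError("same-batch loss change is zero; ratio is undefined")
    return dl_held / dl_same


# Illustrative numbers only (hypothetical, not taken from the memo's runs).
both_improve = transfer_ratio(dl_same=-0.010, dl_held=-0.007)
held_out_hurt = transfer_ratio(dl_same=-0.010, dl_held=0.002)

print(f"both improve:  {both_improve:.2f}")
print(f"held-out hurt: {held_out_hurt:.2f}")
```

Under this reading, a ratio can become large in magnitude simply because the same-batch change is tiny, so PUR is best interpreted alongside the raw dL values rather than on its own.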