From 3ec9a5cd63b4578999d89b49f5223024a1acb723 Mon Sep 17 00:00:00 2001
From: YurenHao0426
Date: Wed, 25 Mar 2026 14:23:13 -0500
Subject: =?UTF-8?q?Add=20Phase=208:=20schedule=20timing=20test=20=E2=80=94?=
 =?UTF-8?q?=20online=20co-learning=20is=20the=20remaining=20bottleneck?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Vec_only_from_0: 15.4% (cold-start failure, can't learn credit on random features)
DFA_only: 31.2% (remains best non-BP method)
DFA_then_Vec_T20: 12.9% (switching to Vec destroys DFA-built features)
Vec_T5_then_DFA: 26.6% (partial recovery but still worse than pure DFA)

Phase 7A's early-window finding doesn't transfer: it required offline-trained
Vec on frozen features. Online Vec estimator faces cold-start paradox — needs
structured features to learn credit, but structured features need good credit
to form.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 NOTE.md | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)

(limited to 'NOTE.md')

diff --git a/NOTE.md b/NOTE.md
index 892cf1e..74b75f3 100644
--- a/NOTE.md
+++ b/NOTE.md
@@ -5,7 +5,7 @@
 - **pilot**: Controlled iteration (commits 0b9ebb2, 7baf7ae)
 - **frozen**: Code at commit 0b9ebb2 for all reported results
 
-## Status: PHASE 7A SNAPSHOT TIME SWEEP — EARLY SNAPSHOTS SHOW POSITIVE TRANSFER
+## Status: PHASE 8 SCHEDULE TIMING — ONLINE CO-LEARNING IS THE REMAINING BOTTLENECK
 
 ---
 
@@ -449,3 +449,38 @@ Credit bridge should be used from epoch 0.
 
 ### Experiment IDs (Phase 7)
 - `snapshot_time/`: Phase 7A snapshot time sweep with BP checkpoints
+
+---
+
+## Phase 8: Schedule Timing Hypothesis Test
+
+**Setup**: CIFAR-10, L=4, d=256, 100 epochs, seed=42
+
+| Schedule | acc@5 | acc@20 | final |
+|----------|-------|--------|-------|
+| DFA_only | **0.297** | **0.308** | **0.312** |
+| Vec_only_from_0 | 0.135 | 0.151 | 0.154 |
+| Vec_T5_then_DFA | 0.135 | 0.213 | 0.266 |
+| DFA_T20_then_Vec | 0.297 | 0.308 | 0.129 |
+
+**Phase 7A's timing hypothesis does NOT transfer to online training.**
+
+Vec from epoch 0 gets stuck at 15% (near chance). The online Vec estimator
+starts from random initialization and cannot learn useful credit fast enough
+when the forward net is also random (cold-start paradox).
+
+DFA alone remains the best non-BP method (31.2%).
+
+### The cold-start paradox:
+Vec credit is most useful on early features, but Vec can only learn useful credit
+from features with structure. DFA provides structure slowly, but by the time Vec
+is ready, the early window is closed.
+
+### Project conclusion at this point:
+- Vec estimator WORKS (synthetic + frozen CIFAR)
+- Local surrogate CAN exploit it (same-batch, Phase 6.5A)
+- Early snapshots show generalization (Phase 7A, offline-trained Vec)
+- But online co-learning of estimator + forward net is unsolved (cold-start)
+
+### Experiment IDs (Phase 8)
+- `schedule_timing/`: Phase 8 schedule comparison
-- 
cgit v1.2.3
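
The four schedules in the Phase 8 table can be read as a per-epoch choice between the DFA and Vec credit signals. The sketch below is only an illustration of that reading, not the repository's code: the function `credit_source`, the `SCHEDULES` table, 0-indexed epochs, and the assumption that the switch happens exactly at epoch T (5 or 20) are all hypothetical and inferred from the schedule names alone.

```python
from typing import Dict, Optional, Tuple

# Hypothetical encoding of the Phase 8 schedule names:
# (first credit source, switch epoch or None, second credit source or None).
# The switch semantics (epochs < T use the first source) are an assumption.
SCHEDULES: Dict[str, Tuple[str, Optional[int], Optional[str]]] = {
    "DFA_only": ("dfa", None, None),
    "Vec_only_from_0": ("vec", None, None),
    "Vec_T5_then_DFA": ("vec", 5, "dfa"),
    "DFA_T20_then_Vec": ("dfa", 20, "vec"),
}


def credit_source(schedule: str, epoch: int) -> str:
    """Return 'dfa' or 'vec' for a schedule name at a 0-indexed epoch."""
    if schedule not in SCHEDULES:
        raise ValueError(f"unknown schedule: {schedule!r}")
    first, switch_epoch, second = SCHEDULES[schedule]
    # Schedules without a switch use one credit source for the whole run.
    if switch_epoch is None or second is None:
        return first
    return first if epoch < switch_epoch else second


if __name__ == "__main__":
    # Print the credit source just before and just after each nominal switch.
    for name in SCHEDULES:
        print(name, [credit_source(name, e) for e in (0, 4, 5, 19, 20, 99)])
```

Under this reading, `Vec_T5_then_DFA` and `Vec_only_from_0` use the same credit source for the first five epochs, and `DFA_T20_then_Vec` matches `DFA_only` for the first twenty, which agrees with the identical acc@5 and acc@20 entries in the table.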