From 825d973428450cb24d8cccc8c2604235ef974b7c Mon Sep 17 00:00:00 2001
From: YurenHao0426
Date: Tue, 24 Mar 2026 20:07:03 -0500
Subject: Add Phase 6: snapshot exploitability reveals local update rule is the bottleneck
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Phase 6A: Better credit is ANTI-CORRELATED with loss decrease on fixed snapshot.
  DFA (Gamma=0.01) → dL=-0.0001 (only method that decreases loss)
  Vec_M4 (Gamma=0.38) → dL=+0.057 (increases loss most)
  Oracle BP (Gamma=1.0) → dL=+0.011 (still increases loss)

Phase 6C: Target-shift rule reduces damage but cannot make non-DFA credits productive.
  The inner-product surrogate is fundamentally mismatched with directional credit.

Conclusion: Case B: the primary bottleneck is the local update paradigm itself,
not the credit estimator quality or tracking/co-adaptation.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 NOTE.md | 46 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 45 insertions(+), 1 deletion(-)

diff --git a/NOTE.md b/NOTE.md
index 36e0be9..6242a41 100644
--- a/NOTE.md
+++ b/NOTE.md
@@ -5,7 +5,7 @@
 - **pilot**: Controlled iteration (commits 0b9ebb2, 7baf7ae)
 - **frozen**: Code at commit 0b9ebb2 for all reported results
 
-## Status: PHASE 5 VECTOR FIELD AUDIT + TRANSFER COMPLETE
+## Status: PHASE 6 EXPLOITABILITY DISSECTION COMPLETE
 
 ---
 
@@ -329,3 +329,47 @@ the signal.
 - `vector_audit_full/`: Phase 5A full 3-seed audit
 - `frozen_cifar_vec/`: Phase 5B frozen CIFAR vector transfer
 - `online_vec_pilot/`: Phase 5C online CIFAR vector pilot
+
+---
+
+## Phase 6: Exploitability Dissection
+
+### Phase 6A: Snapshot Exploitability
+
+**Setup**: BP-trained CIFAR snapshot (L=4, d=256, 61.9% acc).
+Offline-trained estimators. k-step local updates with real loss measurement.
+
+**CRITICAL FINDING: Better credit → worse loss decrease.**
+
+| Credit | Gamma | rho | dL_5step (inner_product) |
+|--------|-------|-----|--------------------------|
+| DFA | 0.009 | -0.023 | **-0.0001** (only negative!) |
+| ScalarCB | 0.122 | 0.090 | +0.042 |
+| Vec_M4 | 0.378 | 0.411 | +0.057 |
+| Oracle BP | 1.000 | 0.998 | +0.011 |
+
+Credit quality is ANTI-CORRELATED with loss decrease.
+DFA (worst credit) is the only method that does not increase loss.
+
+### Phase 6C: Local Update Rule Swap
+
+Tested target-shift (`h_target = h_{l+1} - eta * a_norm`) at eta in {0.01, 0.1, 0.3, 1.0}.
+
+Target-shift reduces damage (Vec dL: +0.057 → +0.002 at eta=0.1) but never achieves
+negative DeltaLoss for any non-DFA credit. Cosine rule produces near-zero effects.
+
+### Root Cause
+
+The inner-product surrogate is not a valid proxy for global loss minimization.
+The gradient of this surrogate w.r.t. block parameters ≠ gradient of global loss w.r.t. the same parameters.
+A BP-trained snapshot is at a minimum reachable only by full BP; local updates systematically push uphill.
+
+DFA avoids increasing loss only because its credits are weak enough to produce near-zero updates, effectively doing nothing.
+
+### This is Case B from the diagnostic logic tree:
+Better credit does NOT lead to better snapshot loss decrease.
+**The primary bottleneck is the local update rule itself, not the estimator or tracking.**
+
+### Experiment IDs (Phase 6)
+- `snapshot_exploit/`: Phase 6A snapshot exploitability
+- `update_swap/`: Phase 6C local update rule comparison
-- 
cgit v1.2.3
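
The target-shift rule from Phase 6C (`h_target = h_{l+1} - eta * a_norm`) can be sketched as below. This is a minimal illustration, not the code behind the reported numbers. It assumes a single linear block, takes `a_norm` to mean the credit vector scaled to unit L2 norm per sample, and uses a squared-error local objective. The names `target_shift`, `local_loss`, and `local_step` are hypothetical, the credit is random noise, and `d = 8` is chosen for speed rather than the note's d=256.

```python
import numpy as np


def target_shift(h_next, credit, eta):
    """Return h_target = h_next - eta * a_norm.

    Assumption: a_norm is the credit scaled to unit L2 norm per sample.
    """
    a_norm = credit / (np.linalg.norm(credit, axis=-1, keepdims=True) + 1e-12)
    return h_next - eta * a_norm


def local_loss(W, h_in, h_target):
    """Local objective: 0.5 * mean over the batch of ||h_in @ W - h_target||^2."""
    residual = h_in @ W - h_target
    return 0.5 * np.mean(np.sum(residual ** 2, axis=-1))


def local_step(W, h_in, h_target, lr):
    """One gradient step on local_loss for a linear block (hypothetical block form)."""
    residual = h_in @ W - h_target
    grad = h_in.T @ residual / h_in.shape[0]
    return W - lr * grad


rng = np.random.default_rng(0)
d = 8                                    # small for illustration only
h_in = rng.normal(size=(16, d))          # block input h_l
W = rng.normal(size=(d, d)) / np.sqrt(d)
h_next = h_in @ W                        # block output h_{l+1}
credit = rng.normal(size=(16, d))        # stand-in for an estimated credit

for eta in (0.01, 0.1, 0.3, 1.0):        # the eta grid listed in Phase 6C
    h_target = target_shift(h_next, credit, eta)
    before = local_loss(W, h_in, h_target)
    after = local_loss(local_step(W, h_in, h_target, lr=0.1), h_in, h_target)
    print(f"eta={eta}: local loss {before:.6f} -> {after:.6f}")
```

Under these assumptions each target sits exactly `eta` away from the current block output, so `eta` bounds how far one update can try to move it. The sketch only tracks the local objective. Whether the global loss also falls is a separate measurement, and it is the one Phase 6A and 6C report.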