From 7e01fbc0ce871857c1e1879ed0d3559e8bfae7c7 Mon Sep 17 00:00:00 2001
From: YurenHao0426
Date: Wed, 25 Mar 2026 08:22:04 -0500
Subject: Add Phase 6.5A: same-batch linesearch REVISES Phase 6A conclusion
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Phase 6A's "better credit → worse loss" was a protocol artifact caused by:
1. Credit normalization (inflated DFA, suppressed Vec magnitude ordering)
2. Held-out evaluation (measured generalization failure, not exploitability)
3. Gradient clamping

With strict same-batch evaluation:
- Oracle BP: dL_same = -0.406 (strongest descent)
- Vec_M4: dL_same = -0.135
- ScalarCB: dL_same = -0.025
- DFA: dL_same = -0.003

Same-batch loss decrease is MONOTONIC with credit quality.
But held-out loss INCREASES for all non-DFA methods (Case D: overfitting).
The bottleneck is batch-level generalization, not surrogate exploitability.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 NOTE.md | 47 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 46 insertions(+), 1 deletion(-)

(limited to 'NOTE.md')

diff --git a/NOTE.md b/NOTE.md
index 6242a41..a57fd30 100644
--- a/NOTE.md
+++ b/NOTE.md
@@ -5,7 +5,7 @@
 - **pilot**: Controlled iteration (commits 0b9ebb2, 7baf7ae)
 - **frozen**: Code at commit 0b9ebb2 for all reported results
 
-## Status: PHASE 6 EXPLOITABILITY DISSECTION COMPLETE
+## Status: PHASE 6.5 PROTOCOL AUDIT — PHASE 6A CONCLUSION REVISED
 
 ---
 
@@ -373,3 +373,48 @@ Better credit does NOT lead to better snapshot loss decrease.
 ### Experiment IDs (Phase 6)
 - `snapshot_exploit/`: Phase 6A snapshot exploitability
 - `update_swap/`: Phase 6C local update rule comparison
+
+---
+
+## Phase 6.5: Protocol Audit (REVISES Phase 6A conclusion)
+
+### Phase 6.5A: Same-Batch Linesearch
+
+**CRITICAL REVISION**: Phase 6A's "better credit → worse loss" was a protocol artifact.
+
+Phase 6A used: normalized credit + held-out evaluation + gradient clamping.
+Phase 6.5A uses: raw + norm credit, same-batch + held-out eval, no clamping, eta sweep.
+
+**With same-batch evaluation, better credit DOES produce more loss decrease:**
+
+| Method | Gamma | dL_same (norm, all, best eta) | dL_held |
+|--------|-------|-------------------------------|---------|
+| DFA | 0.01 | -0.003 | +0.004 |
+| ScalarCB | 0.12 | -0.025 | +0.027 |
+| Vec_M4 | 0.38 | **-0.135** | +0.045 |
+| Oracle BP | 1.00 | **-0.406** | +0.094 |
+
+Same-batch loss decrease is MONOTONIC with credit quality.
+But held-out loss INCREASES for all non-DFA methods.
+
+**This is Case D: the local surrogate exploits credit correctly on training data,
+but the update overfits to the batch. Better credit = more effective overfitting.**
+
+### Key confounds identified in Phase 6A:
+1. **Normalization** inflated DFA's weak credits to same magnitude as Vec's
+2. **Held-out evaluation** showed generalization failure, not exploitability failure
+3. **Gradient clamping** distorted the natural credit quality ordering
+
+### Raw vs Norm:
+- Raw credit: tiny updates (BP grad RMS ≈ 0.00004). Vec raw best dL_same=-0.005
+- Norm credit: amplifies to useful magnitude but also amplifies overfitting
+
+### Revised diagnosis:
+The bottleneck is NOT "surrogate can't exploit credit" (Phase 6A was wrong).
+It IS "local surrogate with good credit overfits to mini-batch."
+This suggests: regularization of local updates (larger batches, weight decay,
+gradient noise) could make better credit usable.
+
+### Experiment IDs (Phase 6.5)
+- `exploit_linesearch/`: Phase 6.5A smoke test (Oracle + Vec, last1, raw)
+- `exploit_linesearch_full/`: Phase 6.5A full sweep (all methods, ranges, norm modes)
--
cgit v1.2.3
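
Reader's note on the protocol described in this patch: the sketch below shows one way a same-batch linesearch with an eta sweep could be set up. It is not the code under `exploit_linesearch/`. The toy least-squares model, the function names (`mse`, `same_batch_linesearch`), the eta grid, and the RMS normalization are all assumptions made for illustration. The exact same-batch gradient stands in for a high-quality credit signal.

```python
import numpy as np


def mse(w, X, y):
    """Mean squared error of a linear model (toy stand-in for the real loss)."""
    r = X @ w - y
    return float(np.mean(r ** 2))


def same_batch_linesearch(w, direction, same, held, etas, normalize=False):
    """Sweep eta along one update direction; report loss changes.

    Returns (best_eta, dL_same, dL_held), where "best" means the most
    negative loss change on the SAME batch that produced the direction.
    dL_held is recorded for that eta but never used to select it.
    """
    d = np.asarray(direction, dtype=float)
    if normalize:
        # Hypothetical "norm" mode: rescale the direction to unit RMS.
        d = d / (np.sqrt(np.mean(d ** 2)) + 1e-12)
    base_same = mse(w, *same)
    base_held = mse(w, *held)
    rows = []
    for eta in etas:
        w_new = w - eta * d  # no clamping of the step
        rows.append((eta,
                     mse(w_new, *same) - base_same,
                     mse(w_new, *held) - base_held))
    return min(rows, key=lambda row: row[1])


rng = np.random.default_rng(0)
w_true = rng.normal(size=8)


def make_batch(n):
    X = rng.normal(size=(n, 8))
    return X, X @ w_true + 0.5 * rng.normal(size=n)


same, held = make_batch(16), make_batch(16)
w = np.zeros(8)

# Exact gradient of the same-batch loss at w.
X, y = same
grad = 2.0 * X.T @ (X @ w - y) / len(y)

etas = [10.0 ** k for k in range(-4, 1)]
best_eta, dL_same, dL_held = same_batch_linesearch(w, grad, same, held, etas)
print(f"best eta={best_eta:g}  dL_same={dL_same:+.4f}  dL_held={dL_held:+.4f}")
```

Selecting eta on the same batch and only recording the held-out change keeps the two questions separate: whether the update can exploit the credit signal at all, and whether that gain transfers beyond the batch.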