report_explore/MEMO_frozen_cifar_recovery.md



# Phase A Memo: Frozen CIFAR Credit Recovery

**Date**: 2026-03-24
**Config**: CIFAR-10, L=4, d=256, BP reference (61.7% test acc), 100 epochs of estimator training, seed=42

## Question

On frozen CIFAR representations (BP-trained), can the current credit estimators recover meaningful local credit?

## Results

| Method | mean Gamma | mean rho | mean nudge (eta=0.003) |
|--------|-----------|---------|------------------------|
| DFA (random B) | 0.006 | 0.005 | -0.000022 |
| Scalar CB (s=eT) | 0.115 | 0.125 | -0.000370 |
| Scalar CB (s=deltaL) | 0.070 | 0.062 | -0.000160 |
| **State Bridge (s=eT)** | **0.287** | **0.246** | **-0.000957** |

Per-layer rho shows consistent signal across all layers for SB and CB_eT, strongest at layer 3 (closest to terminal).
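
To make the gaps in the table easier to compare, the mean Gamma and mean rho of each method can be expressed as a multiple of the DFA baseline. The sketch below is only arithmetic on the table values; the dictionary layout and the helper name `ratio_over_dfa` are illustrative and not part of the experiment code.

```python
# Mean metrics copied from the results table above.
results = {
    "DFA (random B)": {"gamma": 0.006, "rho": 0.005},
    "Scalar CB (s=eT)": {"gamma": 0.115, "rho": 0.125},
    "Scalar CB (s=deltaL)": {"gamma": 0.070, "rho": 0.062},
    "State Bridge (s=eT)": {"gamma": 0.287, "rho": 0.246},
}


def ratio_over_dfa(results, baseline="DFA (random B)"):
    """Return each method's metrics divided by the baseline's metrics."""
    base = results[baseline]
    return {
        name: {key: vals[key] / base[key] for key in vals}
        for name, vals in results.items()
        if name != baseline
    }


ratios = ratio_over_dfa(results)
for name, r in ratios.items():
    print(f"{name}: Gamma x{r['gamma']:.1f}, rho x{r['rho']:.1f}")
```

These ratios are point estimates from a single seed (seed=42), so they say nothing about run-to-run variance.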

## Key Finding: State Bridge Dominates on Frozen BP Features

This is the **opposite** of the synthetic alpha=1.0 result, where CB beat SB.

**Why?** BP-trained features are approximately linear in their local geometry. The Jacobians of BP-trained residual blocks are near-identity (small residual branches). State bridge exploits this quasi-linear structure: its Jacobian-based credit is a decent approximation when the true Jacobian is close to identity.

On the synthetic task at alpha=1.0 (full tanh), the Jacobians are far from identity -- that's where state bridge fails and CB wins. But CIFAR with BP training produces well-conditioned, slowly-varying features where linearity holds locally.
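
The near-identity argument can be illustrated with a toy residual block. This is a minimal sketch under assumptions that are not taken from the experiment: the block form `f(h) = h + branch_scale * tanh(W @ h)`, the random `W`, and the name `branch_scale` are all hypothetical, and `branch_scale` is not claimed to be the same quantity as the synthetic task's alpha. It only shows that a small residual branch keeps the Jacobian close to identity, while a full-strength tanh branch moves it away.

```python
import numpy as np


def residual_block_jacobian(h, W, branch_scale):
    """Jacobian of f(h) = h + branch_scale * tanh(W @ h) with respect to h."""
    pre = W @ h
    # d/dh tanh(W h) = diag(1 - tanh(pre)^2) @ W
    branch_jac = (1.0 - np.tanh(pre) ** 2)[:, None] * W
    return np.eye(h.size) + branch_scale * branch_jac


def identity_gap(J):
    """Relative Frobenius distance between J and the identity matrix."""
    eye = np.eye(J.shape[0])
    return np.linalg.norm(J - eye) / np.linalg.norm(eye)


rng = np.random.default_rng(42)
d = 256  # matches the memo's hidden width; W itself is random, not trained
W = rng.normal(scale=1.0 / np.sqrt(d), size=(d, d))
h = rng.normal(size=d)

gap_small = identity_gap(residual_block_jacobian(h, W, branch_scale=0.1))
gap_full = identity_gap(residual_block_jacobian(h, W, branch_scale=1.0))
print(f"small branch gap: {gap_small:.3f}, full branch gap: {gap_full:.3f}")
```

In this toy setting the gap scales linearly with `branch_scale`. Checking whether the actual BP-trained blocks sit in the small-gap regime would require computing the same quantity on their real Jacobians.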

## Implications

1. **The scalar CB parameterization works** -- with s=eT it recovers credit roughly 19x better than DFA on Gamma (0.115 vs 0.006) and 25x better on rho (0.125 vs 0.005), and produces meaningful negative nudging. The estimator is NOT fundamentally broken.

2. **But it underperforms state bridge** on frozen BP features. This means the "curvature-vs-value disconnect" identified in the scalar V analysis is real: even with terminal gradient matching, the bridge consistency loss doesn't constrain grad_h V as well as direct state prediction + Jacobian.

3. **CB_eT > CB_deltaL** on frozen CIFAR. The 256-dim deltaL conditioning seems to cause value net overfitting or optimization difficulty. This reverses the synthetic finding (where deltaL won on Gamma). The d/C ratio matters: d=256, C=10 gives 25.6x, while synthetic d=128, C=10 gives 12.8x.

4. **The online training failure is NOT purely an estimator problem.** Both CB variants beat DFA by a wide margin on frozen features (rho: 0.062-0.125 vs 0.005). Yet in online CIFAR training, CB barely beats DFA. The bottleneck must be partly in **co-adaptation**: when the forward net parameters change each epoch, the value net's credit becomes stale, and the local surrogate update may not effectively exploit even correct credit.

## Decision

This result is **POSITIVE** per the Phase A judgment criteria:
- CB mean rho (0.125 for eT) > DFA mean rho (0.005) + 0.02 threshold
- CB mean Gamma (0.115 for eT) > DFA mean Gamma (0.006)
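
The two criteria above can be written as a small check. This is a sketch of the decision rule as stated in this memo, not the project's actual evaluation code; the function name `phase_a_positive` and the dictionary layout are illustrative.

```python
RHO_MARGIN = 0.02  # threshold quoted in the criteria above


def phase_a_positive(cb, dfa, rho_margin=RHO_MARGIN):
    """Return True if CB beats DFA on rho by the margin and on Gamma at all."""
    rho_ok = cb["rho"] > dfa["rho"] + rho_margin
    gamma_ok = cb["gamma"] > dfa["gamma"]
    return rho_ok and gamma_ok


# Mean values from the results table.
dfa = {"gamma": 0.006, "rho": 0.005}
cb_eT = {"gamma": 0.115, "rho": 0.125}
cb_deltaL = {"gamma": 0.070, "rho": 0.062}

print("CB (s=eT) positive:", phase_a_positive(cb_eT, dfa))
print("CB (s=deltaL) positive:", phase_a_positive(cb_deltaL, dfa))
```

Applying the same rule to the deltaL variant shows that it also clears both thresholds, although the decision above is stated for s=eT.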

**Proceed to Phase B** (online shallow CIFAR) to investigate whether the frozen-feature signal translates to online training.

**Additional question raised**: Since state bridge is the best estimator on frozen features, should Phase B also include state bridge as a method? The answer is yes -- state bridge's online failure (18.5% at L=12) might be partially a depth problem. At L=4 with BP-like features, it might actually work.