| author | YurenHao0426 <Blackhao0426@gmail.com> | 2026-03-23 18:23:29 -0500 |
|---|---|---|
| committer | YurenHao0426 <Blackhao0426@gmail.com> | 2026-03-23 18:23:29 -0500 |
| commit | bbb1a36d67f2f0c83106c1e771ea2c2fcb7fd83a (patch) | |
| tree | 522ef465095e93f13d5c74b2fea6414f8b342b33 /NOTE.md | |
| parent | 245e1174695c819642030461e3f544dffb7062fd (diff) | |
Add experiment notes and .gitignore
Track experiment phases (debug/pilot/frozen), key findings, and design decisions.
Diffstat (limited to 'NOTE.md')
| -rw-r--r-- | NOTE.md | 71 |
1 files changed, 57 insertions, 14 deletions
@@ -1,22 +1,65 @@
 # Experiment Notes

-## 2026-03-23: Initial Implementation and Experiments
+## Experiment Phases
+- **debug**: Initial implementation, rapid iteration. Code may change between runs.
+- **pilot**: Controlled iteration. Each change requires commit + rationale.
+- **frozen**: Code frozen at specific commit hash. Only formal multi-seed runs.
+
+## Current Phase: PILOT
+- Commit for toy frozen runs: `0b9ebb2` (state bridge synced to normalized MSE)
+- CIFAR runs started from commit `ce24e36` (CIFAR code unchanged by sync commit)
+
+---
+
+## 2026-03-23: Implementation and Experiments

 ### Setup
-- GPU: NVIDIA RTX A6000 x4 (using GPU 1)
+- GPU: NVIDIA RTX A6000 x4 (GPU 0 occupied, using GPUs 1-3)
 - PyTorch 2.10.0+cu128
-- All code written from scratch following CLAUDE.md specifications

-### Phase A: Toy LQ Sanity Check
-- Status: Running...
-- Config: d=64, m=10, L=12, sigma=0.03, 5000 steps, batch=256
-- Methods: DFA, State Bridge, Credit Bridge
+### Key Findings
+
+#### 1. Credit Bridge requires terminal gradient matching
+- **Without** terminal gradient matching: credit bridge costate cosine collapses to ~0.03 (no signal)
+- **With** terminal gradient matching: credit bridge achieves ~0.94 cosine (matches state bridge)
+- Terminal gradient uses only output-layer local info (not hidden BP) → allowed
+- This is the most important finding so far
+
+#### 2. Toy LQ Results (3 seeds, 8000 steps, commit 0b9ebb2)
+| Method | Costate Cosine | Perturbation ρ | Nudging |
+|--------|---------------|----------------|---------|
+| DFA | 0.003±0.001 | 0.010±0.012 | -0.001±0.000 |
+| State Bridge | 0.941±0.003 | 0.927±0.004 | -0.335±0.015 |
+| Credit Bridge | 0.942±0.002 | 0.929±0.003 | -0.334±0.015 |
+
+- Both State Bridge and Credit Bridge match closely on the linear system
+- DFA provides essentially no directional credit (random level)
+- Bridge residual decreases steadily during training
+- FM auxiliary provides marginal improvement (0.946 vs 0.940 cosine)
+
+#### 3. CIFAR-10 (in progress, 3 seeds on GPUs 1-3)
+- BP baseline: ~59% test accuracy (expected for flat MLP on CIFAR-10)
+- DFA: ~28% test accuracy at epoch 30 (struggling on deep network)
+- State Bridge: running
+- Credit Bridge: running with warmup (20% DFA warmup + linear blend)
+
+### Design Decisions
+1. **Terminal gradient matching** (term_grad_weight=1.0): Essential for credit bridge. The bridge consistency loss alone constrains V values but not gradients. Terminal gradient matching provides curvature info from output-layer-local computation.
+2. **DFA warmup for credit bridge**: Without warmup, the credit bridge collapses because value net can't learn useful credits while forward net is being updated with random signals.
+3. **Normalized MSE for state bridge**: `((pred - target) / max(||target||, 1.0))^2` for numerical stability on CIFAR where hidden states can have large norms.
+4. **Credit normalization**: All methods use `a_norm = a / (RMS(a) + 1e-6)` in local surrogate to control credit magnitude.

 ### Changes Log
-- Created full project structure: models/, methods/, experiments/, metrics/, configs/
-- models/residual_mlp.py: ResidualMLP with pre-LayerNorm residual blocks
-- models/value_net.py: ValueNet V_phi with sinusoidal time embedding
-- models/state_bridge.py: StateBridgeNet G_psi
-- experiments/toy_lq.py: Linear-quadratic sanity check
-- experiments/cifar_resmlp.py: CIFAR-10 main experiment
-- metrics/credit_metrics.py: All diagnostic metrics
+- `ce24e36`: Initial implementation with all models, methods, toy and CIFAR experiments
+- `0b9ebb2`: Sync state bridge to use normalized MSE in both toy and CIFAR (consistency fix)
+
+### Experiment IDs
+- `toy_lq_v1`: Original toy, no terminal gradient matching (for ablation)
+- `toy_lq_v2`: Toy with terminal gradient matching (primary)
+- `toy_lq_frozen`: Re-run of v2 with synced state bridge (for final report)
+- `cifar10_seed42/123/456`: Main CIFAR-10 experiments
+
+### Known Issues
+- DFA accuracy on CIFAR-10 is low (~28% at epoch 30). Expected for DFA on deep MLPs.
+- State bridge had astronomical prediction errors before normalization fix.
+- Credit bridge needs DFA warmup phase to bootstrap stable training.
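Design decisions 3 and 4 in the notes give two normalization formulas inline. The sketch below spells them out in dependency-free Python for a single vector, as a reading aid only. The function names are hypothetical, and two details are assumptions the notes do not state: that `||target||` is the L2 norm of the whole target vector, and that the squared terms are averaged over elements. The repository code presumably operates on batched PyTorch tensors and may differ.

```python
import math


def normalized_mse(pred, target, floor=1.0):
    """Mean of ((pred - target) / max(||target||, floor))^2 for one vector.

    Assumptions (not stated in the notes): ||.|| is the L2 norm of the
    target vector, and the squared terms are averaged over elements.
    """
    if len(pred) != len(target) or len(target) == 0:
        raise ValueError("pred and target must be non-empty and equal length")
    # Flooring the scale at 1.0 keeps small targets from inflating the loss.
    scale = max(math.sqrt(sum(t * t for t in target)), floor)
    return sum(((p - t) / scale) ** 2 for p, t in zip(pred, target)) / len(target)


def rms_normalize(a, eps=1e-6):
    """Return a / (RMS(a) + eps), the credit normalization from the notes."""
    if len(a) == 0:
        raise ValueError("a must be non-empty")
    rms = math.sqrt(sum(x * x for x in a) / len(a))
    return [x / (rms + eps) for x in a]
```

Under these assumptions the loss for a large-norm target is unchanged when prediction and target are rescaled together, which is consistent with the stated motivation of numerical stability when hidden states have large norms.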

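The notes describe the credit bridge CIFAR run as using "20% DFA warmup + linear blend" without giving the schedule. One possible reading is sketched below: pure DFA credit for the first 20% of steps, then a linear ramp toward the bridge credit. The function names and the `blend_frac` parameter (length of the ramp) are hypothetical, and the ramp length in particular is an assumption, since the notes specify only the 20% warmup.

```python
def bridge_weight(step, total_steps, warmup_frac=0.2, blend_frac=0.2):
    """Weight on the credit-bridge signal: 0.0 is pure DFA, 1.0 is pure bridge.

    warmup_frac=0.2 follows the notes; blend_frac is an assumed ramp length.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    warmup_end = int(round(warmup_frac * total_steps))
    blend_len = int(round(blend_frac * total_steps))
    if step < warmup_end:
        return 0.0  # warmup phase: DFA credit only
    if blend_len <= 0:
        return 1.0  # no ramp configured: switch to the bridge immediately
    # Linear ramp from 0 to 1 over blend_len steps, then hold at 1.
    return min(1.0, (step - warmup_end) / blend_len)


def blended_credit(dfa_credit, bridge_credit, weight):
    """Convex combination of the two credit vectors, element by element."""
    return [(1.0 - weight) * d + weight * b
            for d, b in zip(dfa_credit, bridge_credit)]
```

With `total_steps=100`, this schedule gives weight 0.0 up to step 19, 0.5 at step 30, and 1.0 from step 40 onward.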