author: YurenHao0426 <Blackhao0426@gmail.com> 2026-03-23 19:46:08 -0500
committer: YurenHao0426 <Blackhao0426@gmail.com> 2026-03-23 19:46:08 -0500
commit: 32123cb36ae9521f60c9b6f67458b931b6540ef2
tree: 4731e1dc513f5b613f80c4d20fc4114044c266d3 /README_experiments.md
parent: bbb1a36d67f2f0c83106c1e771ea2c2fcb7fd83a
Add final report, plots, experiment guide, and complete NOTE.md

All experiments complete:
- Toy LQ: credit bridge matches state bridge (~0.94 costate cosine)
- CIFAR-10: credit bridge (29.6%) comparable to DFA (30.0%), both beat state bridge (18.5%)
- State bridge confirms core hypothesis: perfect state prediction != useful credit
- Terminal gradient matching is essential for credit bridge
Diffstat (limited to 'README_experiments.md')
-rw-r--r-- README_experiments.md | 89
1 files changed, 89 insertions, 0 deletions
diff --git a/README_experiments.md b/README_experiments.md
new file mode 100644
index 0000000..0fa60cd
--- /dev/null
+++ b/README_experiments.md
@@ -0,0 +1,89 @@
+# Experiment Guide
+
+## Requirements
+- Python 3.10+
+- PyTorch 2.x with CUDA
+- torchvision, numpy, scipy, matplotlib
+
+## Project Structure
+```
+models/
+ residual_mlp.py - Deep residual MLP (pre-LayerNorm + GELU blocks)
+ value_net.py - Scalar value network V_phi for credit bridge
+ state_bridge.py - State predictor G_psi for state bridge
+
+experiments/
+ toy_lq_v2.py - Phase A: Linear-quadratic sanity check
+ cifar_resmlp.py - Phase B: CIFAR-10 main experiment
+ plot_toy_final.py - Generate toy plots
+ plot_cifar_final.py - Generate CIFAR plots
+
+metrics/
+ credit_metrics.py - Diagnostic metrics (cosine, rho, nudging, etc.)
+
+configs/ - YAML configs
+report/ - Plots and final report
+results/ - Experiment outputs
+```
+
+## Running Experiments
+
+### Phase A: Toy LQ Sanity Check
+```bash
+# Single seed
+CUDA_VISIBLE_DEVICES=0 python experiments/toy_lq_v2.py \
+ --gpu 0 --seed 42 --num_steps 8000 \
+ --sigma_bridge 0.1 --lam 0.1 \
+ --term_grad_weight 1.0 --fm_weight 0.0 \
+ --output_dir results/toy_lq_frozen
+
+# All 3 seeds
+for seed in 42 123 456; do
+ CUDA_VISIBLE_DEVICES=0 python experiments/toy_lq_v2.py \
+ --gpu 0 --seed "$seed" --num_steps 8000 \
+ --sigma_bridge 0.1 --lam 0.1 \
+ --term_grad_weight 1.0 --fm_weight 0.0 \
+ --output_dir results/toy_lq_frozen
+done
+```
+
+### Phase B: CIFAR-10 Main Experiment
+```bash
+# Single seed (runs BP, DFA, State Bridge, Credit Bridge sequentially)
+CUDA_VISIBLE_DEVICES=0 python experiments/cifar_resmlp.py \
+ --dataset cifar10 --d_hidden 512 --num_blocks 12 \
+ --epochs 100 --seeds 42 --gpu 0 \
+ --output_dir results/cifar10
+
+# Parallel across GPUs
+CUDA_VISIBLE_DEVICES=0 python experiments/cifar_resmlp.py --seeds 42 --output_dir results/cifar10 --gpu 0 &
+CUDA_VISIBLE_DEVICES=1 python experiments/cifar_resmlp.py --seeds 123 --output_dir results/cifar10_seed123 --gpu 0 &
+CUDA_VISIBLE_DEVICES=2 python experiments/cifar_resmlp.py --seeds 456 --output_dir results/cifar10_seed456 --gpu 0 &
+wait
+```
+
+### Generate Plots
+```bash
+python experiments/plot_toy_final.py
+python experiments/plot_cifar_final.py
+```
+
+## Key Parameters
+| Parameter | Toy LQ | CIFAR-10 | Description |
+|-----------|--------|----------|-------------|
+| d_hidden | 64 | 512 | Hidden dimension |
+| num_layers/blocks | 12 | 12 | Depth |
+| sigma_bridge | 0.1 | 0.05 | Bridge noise std |
+| lam | 0.1 | 0.1 | Temperature |
+| K | 8 | 4 | MC samples for bridge target |
+| term_grad_weight | 1.0 | 1.0 | Terminal gradient matching weight |
+| ema_momentum | 0.995 | 0.995 | EMA for target network |
+| lr_fb | 1e-3 | 1e-3 | Feedback net learning rate |
+
+## Implementation Notes
+- **No hidden BP anchor**: Non-BP methods never use exact backprop through hidden layers.
+- **Detached hidden copies**: All feedback/value net inputs use `detach().requires_grad_(True)`.
+- **Block-local updates**: Each block's parameters are updated only from its local forward pass and its credit signal.
+- **Output head**: Uses exact CE gradient with detached h_L.
+- **Terminal gradient matching**: Matches grad_h V at terminal layer to grad_{h_L} CE. This is output-layer-local information, not hidden BP.
+- **Credit bridge warmup**: The first 20% of epochs use DFA credits, then the credits are linearly blended toward credit bridge credits.
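
The credit bridge warmup described in the last implementation note can be pictured as a simple blend schedule. The sketch below is not taken from `experiments/cifar_resmlp.py`; the function names `credit_blend_weight` and `blend_credits` are hypothetical, and the length of the linear blend (`blend_frac`) is an assumption, since the guide only states that the first 20% of epochs use DFA credits.

```python
def credit_blend_weight(epoch, total_epochs, warmup_frac=0.2, blend_frac=0.2):
    """Return alpha in [0, 1]: 0 means pure DFA credit, 1 means pure credit bridge.

    warmup_frac=0.2 follows the guide (first 20% of epochs use DFA credits).
    blend_frac is an ASSUMPTION: the guide does not say how long the linear
    blend lasts.
    """
    if total_epochs <= 0:
        raise ValueError("total_epochs must be positive")
    # Work in whole epochs to avoid floating-point edge cases.
    warmup_end = round(warmup_frac * total_epochs)
    blend_len = round(blend_frac * total_epochs)
    if epoch < warmup_end:
        return 0.0  # warmup phase: DFA credits only
    if blend_len <= 0:
        return 1.0  # no blend window: switch immediately
    # Linear ramp from 0 to 1 over the blend window, then hold at 1.
    return min(1.0, (epoch - warmup_end) / blend_len)


def blend_credits(dfa_credit, bridge_credit, alpha):
    """Convex combination of the two credit signals.

    Works for floats and for any array or tensor type that supports
    scalar multiplication and addition.
    """
    return (1.0 - alpha) * dfa_credit + alpha * bridge_credit
```

With `total_epochs=100` and the defaults above, epochs 0 to 19 would use DFA credits only, the weight would ramp linearly over epochs 20 to 40, and later epochs would use credit bridge credits only.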