author YurenHao0426 <blackhao0426@gmail.com> 2026-06-13 12:35:36 -0500
committer YurenHao0426 <blackhao0426@gmail.com> 2026-06-13 12:35:36 -0500
commit 66e0d8b9fd4d0f7a2231d689c055e26fdf1cf04a
tree c29cba61124018755a19b02c9d33e3ad5f2e05cc /research/flossing/srm_design_codex.md
rrm workspace: TRM/HRM/SRM code, Maze dataset, dynamical-analysis pipeline (HEAD, main)
Curated export for clone-and-run Maze training (2x A6000) + diagnostics. trm/hrm pretrain.py carry trajectory-augmentation code (backward-compatible). Heavy artifacts (checkpoints/wandb/npz) gitignored; see PROVENANCE.md. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Diffstat (limited to 'research/flossing/srm_design_codex.md')
-rw-r--r-- research/flossing/srm_design_codex.md 69
1 files changed, 69 insertions, 0 deletions
diff --git a/research/flossing/srm_design_codex.md b/research/flossing/srm_design_codex.md
new file mode 100644
index 0000000..6d5a167
--- /dev/null
+++ b/research/flossing/srm_design_codex.md
@@ -0,0 +1,69 @@
+# SRM (Stable Recursion Model) — Codex Design Synthesis
+
+Returned 2026-05-22 by codex-rescue.
+
+## Core insight: target mild contraction, not aggressive
+
+Empirical λ_1(success) ≈ -0.15 → effective gain ≈ exp(-0.15) ≈ **0.86**.
+Empirical λ_1(failure) ≈ +0.04 → gain ≈ **1.04**.
+
+**Target κ ∈ (0.85, 0.95)**, NOT 0.3-0.5. Over-contraction kills constraint propagation in Sudoku.
+
+## Architectural sketch
+
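+State: `z = (h, ℓ) ∈ R^{d_H + d_L}`, weighted norm `‖z‖_P² = η·‖h‖² + ‖ℓ‖²`.
+
+Joint feature map `ψ_θ(z, x)` built from **Sandwich Layers** (Wang & Manchester 2023), constrained so that `Lip_P(ψ) ≤ 1`.
+
+Block gain operator:
+```
+A = [[a_HH·I,    a_HL·U_HL],
+     [a_LH·U_LH, a_LL·I   ]]
+```
+where `U_HL, U_LH` are orthogonal (which assumes `d_H = d_L`), the gains are nonnegative, and the block row sums under the weighted metric satisfy:
+- `a_HH + √η · a_HL ≤ κ`
+- `a_LL + η^{-1/2} · a_LH ≤ κ`
+
+Row sums alone bound only the ∞-norm of the 2×2 weighted gain matrix. To get `‖A‖_P ≤ κ` in the ℓ2-type norm above, also bound the column sums (`a_HH + η^{-1/2} · a_LH ≤ κ` and `a_LL + √η · a_HL ≤ κ`), or constrain the spectral norm of that 2×2 matrix directly.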
+
+with `κ ∈ (0.85, 0.95)`.
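+
+Why a bound on a 2×2 gain matrix controls `‖A‖_P` can be sketched as follows. The weights `p_H, p_L > 0` are hypothetical names for a generic weighted norm `‖z‖_P² = p_H·‖h‖² + p_L·‖ℓ‖²`, and the sketch assumes nonnegative gains and `‖U_HL‖, ‖U_LH‖ ≤ 1`.
+
+```latex
+\begin{aligned}
+u &= \bigl(\sqrt{p_H}\,\|h\|,\ \sqrt{p_L}\,\|\ell\|\bigr), \qquad \|z\|_P = \|u\|_2, \\
+\sqrt{p_H}\,\|(Az)_H\| &\le a_{HH}\,u_H + \sqrt{p_H/p_L}\;a_{HL}\,u_L, \\
+\sqrt{p_L}\,\|(Az)_L\| &\le \sqrt{p_L/p_H}\;a_{LH}\,u_H + a_{LL}\,u_L, \\
+\|Az\|_P &\le \|N u\|_2 \le \|N\|_2\,\|z\|_P, \qquad
+N = \begin{pmatrix} a_{HH} & \sqrt{p_H/p_L}\,a_{HL} \\ \sqrt{p_L/p_H}\,a_{LH} & a_{LL} \end{pmatrix}, \\
+\|N\|_2 &\le \sqrt{\|N\|_1\,\|N\|_\infty}.
+\end{aligned}
+```
+
+Here `‖N‖_∞` is the largest row sum of `N` and `‖N‖_1` the largest column sum, so keeping both at or below κ is sufficient for `‖A‖_P ≤ κ`.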
+
+Update rule:
+```
+z_{t+1} = (1-α) z_t + α · A · ψ_θ(z_t, x) + b(x)
+```
+
+⇒ `Lip_P(T) ≤ (1-α) + α·κ < 1` by construction.
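+
+A short derivation of this bound, sketched under the assumptions that `α ∈ (0, 1]`, `‖A‖_P ≤ κ`, and `ψ_θ(·, x)` is 1-Lipschitz in `‖·‖_P` for each fixed `x`. Here `T` denotes the update map `z_t ↦ z_{t+1}`:
+
+```latex
+\begin{aligned}
+\|T(z) - T(z')\|_P
+  &= \bigl\|(1-\alpha)(z - z') + \alpha A\bigl(\psi_\theta(z,x) - \psi_\theta(z',x)\bigr)\bigr\|_P \\
+  &\le (1-\alpha)\,\|z - z'\|_P + \alpha\,\|A\|_P\,\|\psi_\theta(z,x) - \psi_\theta(z',x)\|_P \\
+  &\le \bigl((1-\alpha) + \alpha\kappa\bigr)\,\|z - z'\|_P .
+\end{aligned}
+```
+
+The injection `b(x)` cancels in the difference, and the final factor is below 1 whenever `κ < 1` and `α > 0`.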
+
+With `α=1, κ=0.86`: λ_1 ≤ log(0.86) ≈ -0.15, so the certified bound coincides with the empirical success regime (λ_1 ≈ -0.15).
+
+## Key methodological corrections vs my initial sketch
+
+1. **Constrain the JOINT operator, not individual blocks**. HRM got this wrong: a stable H and a stable L do not imply a stable joint system, because of the cross-coupling terms J_HL, J_LH. The block-row-sum bound under the weighted metric is the right translation of CF's empirical signal.
+
+2. **Tie weights across time, but use a single joint operator**: TRM's weight-tying across iterations is good (it turns the model into an iterative solver). But fold H and L into one joint operator (unlike HRM's separate modules) to enforce a shared contraction metric.
+
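+3. **Damping alone isn't sufficient**: the damped update `(1-α)·z + α·f(z)` is certified to contract only if `f` already has Lipschitz constant below 1, and the plain residual form `z + β·f(z)` has Lipschitz bound `1 + β·Lip(f) ≥ 1`, which certifies nothing. Damping is for margin, not the main guarantee.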
+
+## Failure modes to watch
+
+1. **Over-contraction**: κ too low → constraint propagation collapses → underperforms TRM
+2. **Fake certification**: approximate spectral norm leaves hidden expansion directions; use exact Sandwich parameterization
+3. **Cross-coupling starvation**: `a_HL, a_LH → 0` → decoupled two-state system loses reasoning capacity (need lower bounds on coupling gains too?)
+
+## Literature to anchor
+
+Primary:
+- **Sandwich Layers** (Wang & Manchester 2023) — exact Lipschitz parameterization
+- **Deep Equilibrium Models** (Bai et al.) — for the fixed-point formulation
+
+Secondary (conceptual):
+- Lipschitz RNN (Erichson 2021)
+- AntisymmetricRNN (Chang 2019)
+- (CoRNN less relevant; oscillatory not the right inductive bias for Sudoku)
+
+## Implementation path
+
+1. Replace HRM/TRM's L_level/H_level with a **single tied joint operator** on (z_H, z_L)
+2. Implement Sandwich layer ψ with `Lip ≤ 1`
+3. Parameterize block gain matrix A with constraint `a_HH + √η·a_HL ≤ κ`, `a_LL + η^{-1/2}·a_LH ≤ κ`
+4. Make α a learnable sigmoid output (for margin); treat κ as a hyperparameter, or learn it with an upper bound below 1
+5. Sweep κ over {0.85, 0.90, 0.95} to find expressivity sweet spot
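
The implementation path above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the committed implementation. All function names are hypothetical, and `p_h`, `p_l` are generic weights standing in for the note's weighted metric. A single spectrally normalized matrix followed by `tanh` stands in for a Sandwich layer. The gains are rescaled so that the spectral norm of the 2×2 weighted gain matrix is at most κ, which is a stricter condition than the row sums alone. One vector `u` plays the role of the input `x`, both inside ψ and as `b(x)`. The sketch also assumes `d_H = d_L = d`.

```python
import numpy as np


def weighted_norm(z, d, p_h, p_l):
    """P-norm sqrt(p_h*|h|^2 + p_l*|l|^2) of a joint state z = (h, l)."""
    h, l = z[:d], z[d:]
    return np.sqrt(p_h * (h @ h) + p_l * (l @ l))


def gain_norm_matrix(a, p_h, p_l):
    """2x2 matrix N whose spectral norm upper-bounds the P-norm of A."""
    r = np.sqrt(p_h / p_l)
    return np.array([[a[0, 0], r * a[0, 1]],
                     [a[1, 0] / r, a[1, 1]]])


def project_gains(a_raw, kappa, p_h, p_l):
    """Make gains nonnegative and rescale them so that ||N||_2 <= kappa."""
    a = np.abs(a_raw)
    s = np.linalg.norm(gain_norm_matrix(a, p_h, p_l), 2)
    return a * min(1.0, kappa / max(s, 1e-12))


def make_step(d, kappa, alpha_logit, p_h=1.0, p_l=0.5, seed=0):
    """Build the tied joint update z -> (1-alpha) z + alpha A psi(z, x) + b(x)."""
    rng = np.random.default_rng(seed)
    # Stand-in for a Sandwich layer: exact spectral normalization of one matrix.
    W = rng.normal(size=(2 * d, 2 * d))
    W /= max(1.0, np.linalg.norm(W, 2))
    # Orthogonal coupling blocks from QR factorizations (assumes d_H == d_L == d).
    U_hl, _ = np.linalg.qr(rng.normal(size=(d, d)))
    U_lh, _ = np.linalg.qr(rng.normal(size=(d, d)))
    a = project_gains(rng.uniform(0.2, 1.0, size=(2, 2)), kappa, p_h, p_l)
    I = np.eye(d)
    A = np.block([[a[0, 0] * I, a[0, 1] * U_hl],
                  [a[1, 0] * U_lh, a[1, 1] * I]])
    alpha = 1.0 / (1.0 + np.exp(-alpha_logit))  # sigmoid keeps alpha in (0, 1)
    scale = np.concatenate([np.full(d, np.sqrt(p_h)), np.full(d, np.sqrt(p_l))])

    def step(z, u):
        # psi is 1-Lipschitz in the P-norm: map to weighted coordinates and back.
        psi = np.tanh(W @ (scale * z) + u) / scale
        return (1.0 - alpha) * z + alpha * (A @ psi) + u

    return step, a, alpha


def max_lipschitz_ratio(step, d, p_h, p_l, trials=200, seed=1):
    """Largest observed ||T(z)-T(z')||_P / ||z-z'||_P over random pairs."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=2 * d)
    worst = 0.0
    for _ in range(trials):
        z, z2 = rng.normal(size=2 * d), rng.normal(size=2 * d)
        num = weighted_norm(step(z, u) - step(z2, u), d, p_h, p_l)
        worst = max(worst, num / weighted_norm(z - z2, d, p_h, p_l))
    return worst


if __name__ == "__main__":
    d, p_h, p_l = 8, 1.0, 0.5
    for kappa in (0.85, 0.90, 0.95):  # the sweep from step 5
        step, a, alpha = make_step(d, kappa, alpha_logit=2.0, p_h=p_h, p_l=p_l)
        bound = (1.0 - alpha) + alpha * kappa
        ratio = max_lipschitz_ratio(step, d, p_h, p_l)
        print(f"kappa={kappa:.2f}  certified rate={bound:.4f}  observed max={ratio:.4f}")
```

The observed ratio should never exceed the certified rate `(1-α) + α·κ`. A large gap between the two suggests the certificate is conservative for that draw of weights, which is one way to look for the expressivity sweet spot across the κ sweep.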