| author | YurenHao0426 <blackhao0426@gmail.com> | 2026-06-13 12:35:36 -0500 |
|---|---|---|
| committer | YurenHao0426 <blackhao0426@gmail.com> | 2026-06-13 12:35:36 -0500 |
| commit | 66e0d8b9fd4d0f7a2231d689c055e26fdf1cf04a (patch) | |
| tree | c29cba61124018755a19b02c9d33e3ad5f2e05cc /research/flossing/paper/outline.md | |
Curated export for clone-and-run Maze training (2x A6000) + diagnostics.
trm/hrm pretrain.py carry trajectory-augmentation code (backward-compatible).
Heavy artifacts (checkpoints/wandb/npz) gitignored; see PROVENANCE.md.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Diffstat (limited to 'research/flossing/paper/outline.md')
| -rw-r--r-- | research/flossing/paper/outline.md | 79 |
1 files changed, 79 insertions, 0 deletions
diff --git a/research/flossing/paper/outline.md b/research/flossing/paper/outline.md
new file mode 100644
index 0000000..0dde354
--- /dev/null
+++ b/research/flossing/paper/outline.md
@@ -0,0 +1,79 @@
+# Outline — "Recursive Reasoning Models Fail by Wandering, Not by Settling" (title FIXED 2026-06-12)
+
+Status: intro.md ✅ (v2, audited) · setup_results.md ✅ (Secs 2–3) · style_contract.md ✅ ·
+remaining: Sec 4 (relation to prior accounts), Sec 5 (implications), Sec 6 (limitations),
+abstract, tables T1–T3 + figures F3/F4 composition.
+
+Target: ~8 pages main. Every section header below lists [claims served] and [assets].
+
+## 1 Introduction [C1, spine]
+- Para 1: recursive reasoners (HRM/TRM) solve hard puzzles by iterating a latent state; when they
+ fail, what is dynamically different? Existing mechanistic accounts infer dynamics from loss
+ curves and 2-D projections; we measure the dynamics directly, per example.
+- Para 2: the answer, with numbers (settling × correctness decomposition; B≈0; AUC 0.99;
+ concurrent-not-antecedent).
+- Para 3: contributions (4 items, one line each): (i) per-example outcome-conditioned FTLE/settling
+ measurement at n≤8192 across two architectures; (ii) failure-mode decomposition correcting two
+ published labels; (iii) independence controls (drift-matched, difficulty-binned); (iv) the
+ early-window null + sign reversal.
+- NO general AI-reasoning throat-clearing. First sentence is about the object of study.
+
+## 2 Setup [assets: estimator details from diagnose_trm_joint.py; OBSERVATIONS.md provenance table]
+- 2.1 Models & task: HRM 27M @26040 (acc .526), TRM-MLP official recipe @58590 (acc .876),
+ Sudoku-Extreme-1k-aug; fixed 16-step unroll, ACT recorded not applied.
+- 2.2 Measurements: joint (z_H,z_L) tangent dynamics, JVP+QR, k=8, per-sub-update normalization;
+ per-ACT-step state displacement (drift); q_halt; exact/token accuracy. Estimator-scale caveat.
+- 2.3 The 2×2 design: settled band defined by bimodal late-drift split (Otsu primary, full
+ percentile sweep + threshold-free statement in appendix); cells A/B/C/D.
+
+## 3 Results
+- 3.1 Decomposition [C1, C2, C3; assets: cells tables, fig_*_scatter, fig_*_lyap_by_cell,
+ strict-B table + fig_hrm_strictB_profiles]
+ Lead: "Across 2048–8192 held-out puzzles, no TRM failure and 0.55% of HRM failures end in the
+ settled band." Then per-cell λ₁; then the 21 selector-blind examples (their three lowest
+ token-acc are all 17-givens puzzles).
+- 3.2 What the signal is not [C4; assets: decile table, givens table]
+ Drift-matched AUC 0.88–0.90; givens-binned AUC unchanged. One paragraph each, tables carry
+ the numbers.
+- 3.3 When the signal exists [C5; assets: early_pairing_{trm,hrm}.md tables]
+ The early-window null; the HRM sign reversal (drift@4 +direction AUC 0.688); q_halt@4 0.734
+ vs TRM 0.521 (factual note: TRM removed the continue head). Frame as the temporal anatomy of
+ the signature.
+- 3.4 Training evolution [C7; assets: evolution_{trm,hrm}.png/csv; multi4 quick-compare]
+ Gap widens via λ₁(D); multi4 shrinks D-cell mass at matched steps (preliminary, objective
+ caveat); multi4 collapse = λ₁(A) sign flip.
+
+## 4 Relation to prior accounts [C6a, C6b; assets: papers/notes/*]
+- Para 1: network-level Lyapunov–performance work (Vogt 2022; AeLLE 2024; Engelken flossing
+ App. D.3 trains-vs-fails at network level, opposite sign) → none condition per example on outcome.
+- Para 2: the 2026 mechanistic trio. Efstathiou & Balwani: credit loss/boundedness/intervention;
+ quote and correct the settledness reading (C6a). Ren & Liu: confirm + quantify their taxonomy
+ (C6b). Es'kin & Smorkalov (CMM): their endpoint-stability losses + engineered early repeller
+ are consistent, at the design level, with where our measurements localize the signal — cite,
+ don't claim confirmation.
+- Para 3: stability-by-construction line (monDEQ, Jacobian-reg DEQ, REN/Sandwich; TRM's own
+ TorchDEQ negative result; Solve-the-Loop) — what "enforce settling" buys and where it failed;
+ our measurements say which kind of settling is the operative one.
+
+## 5 Implications (restrained, half page)
+- Intervention design space bifurcates: widen/deepen the settled tube at training time
+ (perturbation training, equilibrium losses) vs restart-and-select at inference
+ (q_halt tracks correctness at trajectory end; selector-blind ceiling ≈0.5%).
+- Early pruning/reallocation unsupported at 4-step granularity; on HRM the gradient of usable
+ early signal lives in the learned head, not the generic dynamical quantities.
+
+## 6 Limitations & future
+Sudoku-Extreme only; two models; #givens is a weak difficulty proxy (solver backtracks next);
+single early horizon (sweep queued); end-of-window criterion blind to mid-trajectory lingering;
+no mechanism offered for why settling fails — measurement paper.
+
+## Figures plan (all exist or one rerun away)
+F1: drift–λ₁ scatter, both models (have).
+F2: per-cell λ₁ + strict-B profiles inset (have).
+F3: decile-matched AUC + givens-binned AUC (compose from CSVs).
+F4: early-window pairing summary (compose: 3 signals × 2 models, restricted set).
+F5: checkpoint evolution (have).
+
+## Order of writing
+1. Results 3.1–3.3 (numbers already final) → 2. Setup → 3. Sec 4 (notes ready) → 4. Intro →
+5. Implications/Limitations → 6. style pass against claims.md checklist.
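Section 2.3 of the outline defines the settled band by an Otsu split on late drift and then sorts examples into cells A/B/C/D; Section 3 reports AUCs for drift as a failure signal. The following is a minimal, pure-Python sketch of that pipeline, not the code from this commit. The function names, the bin count, and the toy drift values are hypothetical. The cell labelling (A = settled and correct, B = settled and wrong, C = wandering and correct, D = wandering and wrong) is an assumption inferred from the outline's "B≈0" and "λ₁(D)" remarks; the outline itself does not spell it out.

```python
from typing import List, Sequence


def otsu_threshold(values: Sequence[float], bins: int = 64) -> float:
    """Return the histogram bin edge that maximizes between-class variance."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    total_sum = sum(c * m for c, m in zip(counts, centers))
    best_var, best_edge = -1.0, lo
    w0, s0 = 0, 0.0
    for i in range(bins - 1):
        w0 += counts[i]
        s0 += counts[i] * centers[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0, mu1 = s0 / w0, (total_sum - s0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_edge = var_between, lo + (i + 1) * width
    return best_edge


def assign_cells(late_drift: Sequence[float], correct: Sequence[bool],
                 threshold: float) -> List[str]:
    """Assumed labelling: A/B = settled (correct/wrong), C/D = wandering."""
    cells = []
    for d, ok in zip(late_drift, correct):
        if d <= threshold:
            cells.append("A" if ok else "B")
        else:
            cells.append("C" if ok else "D")
    return cells


def auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """P(random positive outscores random negative); ties count one half."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): a bimodal late-drift sample with outcomes.
late_drift = [0.01, 0.02, 0.015, 0.9, 1.1, 0.95, 0.02, 1.0]
correct = [True, True, True, False, False, True, True, False]

threshold = otsu_threshold(late_drift)
cells = assign_cells(late_drift, correct, threshold)
failed = [not ok for ok in correct]
print(cells, auc(late_drift, failed))
```

When the two drift modes are well separated, every edge in the empty gap gives the same between-class variance, so this sketch returns the lowest such edge. That is one reason the outline's percentile sweep and threshold-free statement are useful companions to a single Otsu cut.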
