research/flossing/paper/outline.md


# Outline — "Recursive Reasoning Models Fail by Wandering, Not by Settling" (title FIXED 2026-06-12)

Status: intro.md ✅ (v2, audited) · setup_results.md ✅ (Secs 2–3) · style_contract.md ✅ ·
remaining: Sec 4 (relation to prior accounts), Sec 5 (implications), Sec 6 (limitations),
abstract, tables T1–T3 + figures F3/F4 composition.

Target: ~8 pages main. Every section header below lists [claims served] and [assets].

## 1 Introduction [C1, spine]
- Para 1: recursive reasoners (HRM/TRM) solve hard puzzles by iterating a latent state; when they
  fail, what is dynamically different? Existing mechanistic accounts infer dynamics from loss
  curves and 2-D projections; we measure the dynamics directly, per example.
- Para 2: the answer, with numbers (settling × correctness decomposition; B≈0; AUC 0.99;
  concurrent-not-antecedent).
- Para 3: contributions (4 items, one line each): (i) per-example outcome-conditioned FTLE/settling
  measurement at n≤8192 across two architectures; (ii) failure-mode decomposition correcting two
  published labels; (iii) independence controls (drift-matched, difficulty-binned); (iv) the
  early-window null + sign reversal.
- NO general AI-reasoning throat-clearing. First sentence is about the object of study.

## 2 Setup [assets: estimator details from diagnose_trm_joint.py; OBSERVATIONS.md provenance table]
- 2.1 Models & task: HRM 27M @26040 (acc .526), TRM-MLP official recipe @58590 (acc .876),
  Sudoku-Extreme-1k-aug; fixed 16-step unroll, ACT recorded not applied.
- 2.2 Measurements: joint (z_H,z_L) tangent dynamics, JVP+QR, k=8, per-sub-update normalization;
  per-ACT-step state displacement (drift); q_halt; exact/token accuracy. Estimator-scale caveat.
- 2.3 The 2×2 design: settled band defined by bimodal late-drift split (Otsu primary, full
  percentile sweep + threshold-free statement in appendix); cells A/B/C/D.
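
A minimal sketch of the 2.3 split, for reference while writing Setup. It assumes the settled band is cut with a plain histogram Otsu threshold on a per-example late-drift value, and that the cells are A = settled/correct, B = settled/wrong, C = unsettled/correct, D = unsettled/wrong. That labeling is inferred from "B≈0" and is not confirmed by this outline. The names `otsu_threshold` and `assign_cell` and the bin count are hypothetical and are not taken from diagnose_trm_joint.py.

```python
from collections import Counter


def otsu_threshold(values, bins=64):
    """Return the cut that maximizes between-class variance of a 1-D sample."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))
    best_score, best_cut = -1.0, lo
    w0, sum0 = 0, 0.0
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += centers[i] * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to between-class variance; ties keep the first maximizer.
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_cut = score, lo + (i + 1) * width
    return best_cut


def assign_cell(late_drift, correct, cut):
    """Assumed labeling: A/B settled (correct/wrong), C/D unsettled (correct/wrong)."""
    if late_drift <= cut:
        return "A" if correct else "B"
    return "C" if correct else "D"


if __name__ == "__main__":
    # Toy values only; not measurements from either model.
    drift = [0.01, 0.012, 0.015, 0.02, 0.9, 1.0, 1.1, 1.2]
    correct = [True, True, True, True, True, False, False, False]
    cut = otsu_threshold(drift)
    print(cut, Counter(assign_cell(d, c, cut) for d, c in zip(drift, correct)))
```

If late drift spans several orders of magnitude, applying the threshold to log-drift is the usual choice; whether the paper does so is not stated here. The percentile sweep in the appendix would replace `otsu_threshold` with fixed quantile cuts and leave `assign_cell` unchanged.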

## 3 Results
- 3.1 Decomposition [C1, C2, C3; assets: cells tables, fig_*_scatter, fig_*_lyap_by_cell,
  strict-B table + fig_hrm_strictB_profiles]
  Lead: "Across 2048–8192 held-out puzzles, no TRM failure and 0.55% of HRM failures end in the
  settled band." Then per-cell λ₁; then the 21 selector-blind examples (the three with the lowest
  token accuracy are all 17-givens puzzles).
- 3.2 What the signal is not [C4; assets: decile table, givens table]
  Drift-matched AUC 0.88–0.90; givens-binned AUC unchanged. One paragraph each, tables carry
  the numbers.
- 3.3 When the signal exists [C5; assets: early_pairing_{trm,hrm}.md tables]
  The early-window null; the HRM sign reversal (drift@4 +direction AUC 0.688); q_halt@4 0.734
  vs TRM 0.521 (factual note: TRM removed the continue head). Frame as the temporal anatomy of
  the signature.
- 3.4 Training evolution [C7; assets: evolution_{trm,hrm}.png/csv; multi4 quick-compare]
  Gap widens via λ₁(D); multi4 shrinks D-cell mass at matched steps (preliminary, objective
  caveat); multi4 collapse = λ₁(A) sign flip.
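
A sketch of one way to compute the matched AUC in 3.2. It assumes the signal is a per-example scalar (for example λ₁), that positives are failures, and that matching means comparing failure/success pairs only within quantile bins of the covariate (drift deciles, or #givens bins). Whether the decile table pools pairs across bins or reports one AUC per bin is not specified in this outline; the version below pools. The function names are hypothetical.

```python
def auc(scores_pos, scores_neg):
    """P(random positive outscores random negative); ties count one half."""
    if not scores_pos or not scores_neg:
        return None
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def matched_auc(signal, covariate, failed, n_bins=10):
    """Pooled pairwise AUC restricted to pairs sharing a covariate quantile bin."""
    order = sorted(range(len(signal)), key=lambda i: covariate[i])
    n = len(order)
    wins, pairs = 0.0, 0
    for b in range(n_bins):
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        pos = [signal[i] for i in idx if failed[i]]
        neg = [signal[i] for i in idx if not failed[i]]
        for p in pos:
            for q in neg:
                if p > q:
                    wins += 1.0
                elif p == q:
                    wins += 0.5
        pairs += len(pos) * len(neg)
    # None signals that no bin contained both outcomes, so nothing was comparable.
    return wins / pairs if pairs else None
```

The nested loops are quadratic, which is acceptable at n≤8192 split across ten bins; a rank-based version would be the replacement if the bins grow. A matched AUC near the unmatched one is what "the signal is not drift" would look like under this construction.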

## 4 Relation to prior accounts [C6a, C6b; assets: papers/notes/*]
- Para 1: network-level Lyapunov–performance work (Vogt 2022; AeLLE 2024; Engelken flossing
  App. D.3 trains-vs-fails at network level, opposite sign) → none condition per example on outcome.
- Para 2: the 2026 mechanistic trio. Efstathiou & Balwani: credit loss/boundedness/intervention;
  quote and correct the settledness reading (C6a). Ren & Liu: confirm + quantify their taxonomy
  (C6b). Es'kin & Smorkalov (CMM): their endpoint-stability losses + engineered early repeller
  are consistent, at the design level, with where our measurements localize the signal — cite,
  don't claim confirmation.
- Para 3: stability-by-construction line (monDEQ, Jacobian-reg DEQ, REN/Sandwich; TRM's own
  TorchDEQ negative result; Solve-the-Loop) — what "enforce settling" buys and where it failed;
  our measurements say which kind of settling is the operative one.

## 5 Implications (restrained, half page)
- Intervention design space bifurcates: widen/deepen the settled tube at training time
  (perturbation training, equilibrium losses) vs restart-and-select at inference
  (q_halt tracks correctness at trajectory end; selector-blind ceiling ≈0.5%).
- Early pruning/reallocation is unsupported at 4-step granularity; on HRM the usable early
  signal lives in the learned head (q_halt), not in the generic dynamical quantities.

## 6 Limitations & future
Sudoku-Extreme only; two models; #givens is a weak difficulty proxy (solver backtracks are the
next proxy to try); single early horizon (sweep queued); end-of-window criterion is blind to
mid-trajectory lingering; no mechanism offered for why settling fails, since this is a measurement paper.

## Figures plan (all exist or one rerun away)
F1: drift–λ₁ scatter, both models (have).
F2: per-cell λ₁ + strict-B profiles inset (have).
F3: decile-matched AUC + givens-binned AUC (compose from CSVs).
F4: early-window pairing summary (compose: 3 signals × 2 models, restricted set).
F5: checkpoint evolution (have).

## Order of writing
1. Results 3.1–3.3 (numbers already final) → 2. Setup → 3. Sec 4 (notes ready) → 4. Intro →
5. Implications/Limitations → 6. style pass against claims.md checklist.