author: YurenHao0426 <blackhao0426@gmail.com> 2026-06-13 12:35:36 -0500
committer: YurenHao0426 <blackhao0426@gmail.com> 2026-06-13 12:35:36 -0500
commit: 66e0d8b9fd4d0f7a2231d689c055e26fdf1cf04a (patch)
tree: c29cba61124018755a19b02c9d33e3ad5f2e05cc /research/flossing/paper/sample_intro.md
rrm workspace: TRM/HRM/SRM code, Maze dataset, dynamical-analysis pipeline (HEAD, main)

Curated export for clone-and-run Maze training (2x A6000) + diagnostics. trm/hrm pretrain.py carry trajectory-augmentation code (backward-compatible). Heavy artifacts (checkpoints/wandb/npz) gitignored; see PROVENANCE.md.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Diffstat (limited to 'research/flossing/paper/sample_intro.md')
-rw-r--r-- research/flossing/paper/sample_intro.md | 49
1 files changed, 49 insertions, 0 deletions
diff --git a/research/flossing/paper/sample_intro.md b/research/flossing/paper/sample_intro.md
new file mode 100644
index 0000000..183faa4
--- /dev/null
+++ b/research/flossing/paper/sample_intro.md
@@ -0,0 +1,49 @@
+# Sample section: Introduction (taste-calibration draft)
+
+Recursive reasoning models solve constraint-satisfaction problems that defeat much larger
+language models by iterating a small network on a latent state — up to several hundred state
+updates per puzzle in the Hierarchical Reasoning Model (HRM) and the Tiny Recursive Model
+(TRM). When such a model fails, what is dynamically different about the trajectory it
+produced? Recent mechanistic studies have answered with attractor language: failed runs
+"plateau at stable high-loss attractors" (Efstathiou & Balwani, 2026), or converge to spurious
+fixed points that rival the correct one (Ren & Liu, 2026). These accounts rest on indirect
+evidence — loss plateaus, two-dimensional projections of 512-dimensional trajectories — and
+the two papers do not agree: one describes failure as premature stability, the other partly as
+wandering. Neither measures stability itself.
+
+We measure it directly. For every test puzzle we record two per-example quantities along the
+full 16-segment inference trajectory: the finite-time Lyapunov spectrum of the joint latent
+dynamics, and the per-segment state displacement. Conditioning these on outcome over 2,048 to
+8,192 puzzles per model yields a complete decomposition of failure for HRM (52.6% accuracy)
+and an official-recipe TRM (87.6%), and the decomposition contradicts the settled-attractor
+picture. Correct trajectories enter a narrow low-velocity band and stay in it; failed
+trajectories never do. In TRM, not one of 254 failures settles — the least mobile failure still
+moves faster at the end of inference than 96.5% of successes — while remaining locally
+expansive (median λ₁ = +0.103 versus +0.012 for successes; AUC 0.993). In HRM, settled-but-wrong
+trajectories exist but account for 0.55% of failures; the other 99.45% wander. Failure in these
+models is not a wrong attractor. It is the sustained absence of settling.
+
+Two controls sharpen what the Lyapunov signature adds. Matched for displacement level within
+the unsettled population, λ₁ still separates eventual successes from failures (decile-matched
+AUC 0.88–0.90), so the exponent is not merely re-measuring non-convergence; and binning by
+puzzle givens leaves the separation intact (within-bin AUC 0.982 versus 0.984 overall), so it
+is not a difficulty artifact. The signature is, however, strictly retrospective. Restricted to
+puzzles still unsolved after four segments, nothing dynamical about those first four segments
+predicts which will eventually be solved: AUC ≈ 0.5 in TRM for exponent, displacement, and
+halting confidence alike — and in HRM the association inverts, with eventual successes moving
+*more* in the early trajectory than eventual failures (AUC 0.69 in the positive direction).
+The chaos of failure is concurrent with the outcome, not an omen visible at the start.
+
+These measurements reframe both the diagnosis and the levers. Because failure is almost never
+a stable wrong answer, selection-based inference strategies have a high ceiling — final-step
+halting confidence tracks correctness on all but the ~0.5% of failures that settle confidently
+— and because the early trajectory carries no dynamical death sentence, compute is better
+spent on restarts than on early pruning. We quantify both points, correct the published
+attractor labels they depend on, and release the per-example measurement tooling.
+
+---
+*[Style notes for review, not part of the draft: (1) every paragraph opens with a finding or a
+question, none with "In recent years"; (2) the two prior papers are quoted precisely and
+credited for what their data shows before the correction is made; (3) hedges appear only where
+the claim table concedes (e.g., "almost never", "~0.5%"); (4) the one rhetorical flourish —
+"not an omen" — is load-bearing; cut it if it reads as flavor.]*
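
The draft's second and third paragraphs name three per-example quantities: a finite-time Lyapunov exponent, a per-segment state displacement, and an outcome-conditioned AUC. The patch does not show how they are computed, so the following is only a minimal sketch under stated assumptions, not the repository's tooling. It tracks just the leading exponent λ₁ (a full spectrum would need several tangent vectors re-orthonormalized by QR at each step), approximates Jacobian-vector products by finite differences, and uses a toy `tanh` map as a stand-in for the model's state update. All function names (`leading_ftle`, `segment_displacements`, `auc`) and the toy map are hypothetical.

```python
import numpy as np


def leading_ftle(step, h0, n_steps, v0=None, eps=1e-6, seed=0):
    """Estimate the leading finite-time Lyapunov exponent of h -> step(h).

    Assumption: `step` is any deterministic state-update function; the
    Jacobian-vector product is approximated by a forward finite difference.
    """
    rng = np.random.default_rng(seed)
    h = np.asarray(h0, dtype=float)
    v = rng.standard_normal(h.shape) if v0 is None else np.asarray(v0, dtype=float)
    v = v / np.linalg.norm(v)
    log_growth = 0.0
    for _ in range(n_steps):
        h_next = step(h)
        jv = (step(h + eps * v) - h_next) / eps  # approximates J(h) @ v
        growth = np.linalg.norm(jv)
        log_growth += np.log(growth)
        v = jv / growth  # renormalize so the tangent vector stays unit length
        h = h_next
    return log_growth / n_steps  # mean log expansion rate per update


def segment_displacements(states):
    """L2 distance between consecutive states; input shape (S + 1, d)."""
    states = np.asarray(states, dtype=float)
    return np.linalg.norm(np.diff(states, axis=0), axis=1)


def auc(scores_pos, scores_neg):
    """Probability that a positive scores above a negative (ties count half)."""
    pos = np.asarray(scores_pos, dtype=float)[:, None]
    neg = np.asarray(scores_neg, dtype=float)[None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))


if __name__ == "__main__":
    # Hypothetical toy dynamics: a random tanh map, NOT the HRM/TRM update.
    rng = np.random.default_rng(1)
    W = rng.standard_normal((8, 8)) / np.sqrt(8)

    def toy_step(h, gain=1.5):
        return np.tanh(gain * W @ h)

    h0 = rng.standard_normal(8)
    print("leading FTLE over 16 updates:", leading_ftle(toy_step, h0, 16))

    traj = [h0]
    for _ in range(16):
        traj.append(toy_step(traj[-1]))
    print("per-update displacement:", segment_displacements(traj))
```

With per-example exponents in hand, a call such as `auc(lambda1_failures, lambda1_successes)` would give the kind of separation statistic the draft reports, with failures treated as the positive class; the matched and binned controls would apply the same function within each displacement decile or givens bin.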