rrm workspace: TRM/HRM/SRM code, Maze dataset, dynamical-analysis pipeline (HEAD, main)

Curated export for clone-and-run Maze training (2x A6000) + diagnostics. trm/hrm pretrain.py carry trajectory-augmentation code (backward-compatible). Heavy artifacts (checkpoints/wandb/npz) gitignored; see PROVENANCE.md.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
author: YurenHao0426 <blackhao0426@gmail.com> 2026-06-13 12:35:36 -0500
committer: YurenHao0426 <blackhao0426@gmail.com> 2026-06-13 12:35:36 -0500
commit: 66e0d8b9fd4d0f7a2231d689c055e26fdf1cf04a (patch)
tree: c29cba61124018755a19b02c9d33e3ad5f2e05cc /research/flossing/paper/rainer_followup_draft.md
1 files changed, 37 insertions, 0 deletions
diff --git a/research/flossing/paper/rainer_followup_draft.md b/research/flossing/paper/rainer_followup_draft.md
new file mode 100644
index 0000000..12091de
--- /dev/null
+++ b/research/flossing/paper/rainer_followup_draft.md
@@ -0,0 +1,37 @@
+Subject: Re: Question on gradient flossing vs forward trajectory stability in recursive reasoning models
+
+Hi Rainer,
+
+A short follow-up to my email of June 5 — we have since measured the things I was speculating
+about, and two results seem worth sharing because they sharpen the question I asked you.
+
+First, conditioning per-example finite-time Lyapunov spectra on both outcome and terminal
+settling (n = 2048–8192, two architectures) shows that failure is almost exclusively
+non-settling: in an official-recipe TRM at 87.6% accuracy, none of 254 failed trajectories
+ever enters the low-velocity band that all successes occupy, and they remain locally expansive
+to the end (median λ₁ +0.10 for failures vs +0.01 for successes). "Converged to the wrong
+attractor" failures exist in HRM but make up only ~0.5% of failures. The chaotic signature
+also survives two controls: it persists after matching trajectories on displacement level (so
+it is not just re-measuring non-convergence), and after binning by puzzle difficulty.
+
+Second — and this is the part that genuinely surprised us — the signature is strictly
+concurrent. Among puzzles still unsolved after a quarter of the inference budget, neither the
+early-window exponents nor early state velocity predicts which trajectories will eventually
+succeed (AUC ≈ 0.5); in HRM the association even inverts, with eventually-successful
+trajectories moving more in the early phase. So the failed trajectories are not "born chaotic":
+chaos at the end and failure appear together.
+
+This makes me think the right framing for my earlier question is reachability of the settled
+region (escape from a long chaotic transient) rather than per-example landscape quality, which
+would be consistent with your view of flossing as a learning-time tool rather than an
+inference-time one. If you know of work that conditions finite-time exponents on trajectory
+fate in this way — in transient-chaos settings or elsewhere — I would be grateful for a
+pointer; we have not found a precedent.
+
+Best,
+Yuren
+
+---
+[Notes, not part of the email: numbers from analysis_2x2/OBSERVATIONS.md addenda 1-2. Send only
+if/after Rainer replies to the June 5 email, or as a gentle bump after ~2 weeks (June 19+).
+The "born chaotic" phrasing mirrors his literature's transient-chaos vocabulary deliberately.]