| author | YurenHao0426 <blackhao0426@gmail.com> | 2026-06-13 12:35:36 -0500 |
|---|---|---|
| committer | YurenHao0426 <blackhao0426@gmail.com> | 2026-06-13 12:35:36 -0500 |
| commit | 66e0d8b9fd4d0f7a2231d689c055e26fdf1cf04a (patch) | |
| tree | c29cba61124018755a19b02c9d33e3ad5f2e05cc /research/flossing/paper/rainer_followup_draft.md | |
Curated export for clone-and-run Maze training (2x A6000) + diagnostics.
trm/hrm pretrain.py carry trajectory-augmentation code (backward-compatible).
Heavy artifacts (checkpoints/wandb/npz) gitignored; see PROVENANCE.md.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Diffstat (limited to 'research/flossing/paper/rainer_followup_draft.md')
| -rw-r--r-- | research/flossing/paper/rainer_followup_draft.md | 37 |
1 files changed, 37 insertions, 0 deletions
diff --git a/research/flossing/paper/rainer_followup_draft.md b/research/flossing/paper/rainer_followup_draft.md
new file mode 100644
index 0000000..12091de
--- /dev/null
+++ b/research/flossing/paper/rainer_followup_draft.md
@@ -0,0 +1,37 @@
+Subject: Re: Question on gradient flossing vs forward trajectory stability in recursive reasoning models
+
+Hi Rainer,
+
+A short follow-up to my email of June 5 — we have since measured the things I was speculating
+about, and two results seem worth sharing because they sharpen the question I asked you.
+
+First, conditioning per-example finite-time Lyapunov spectra on both outcome and terminal
+settling (n = 2048–8192, two architectures) shows that failure is almost exclusively
+non-settling: in an official-recipe TRM at 87.6% accuracy, none of 254 failed trajectories
+ever enters the low-velocity band that all successes occupy, and they remain locally expansive
+to the end (median λ₁ +0.10 vs +0.01). "Converged to the wrong attractor" failures exist in
+HRM but make up only ~0.5% of failures. The chaotic signature also survives two controls: it
+persists after matching trajectories on displacement level (so it is not just re-measuring
+non-convergence), and after binning by puzzle difficulty.
+
+Second — and this is the part that genuinely surprised us — the signature is strictly
+concurrent. Among puzzles still unsolved after a quarter of the inference budget, neither the
+early-window exponents nor early state velocity predict which trajectories will eventually
+succeed (AUC ≈ 0.5); in HRM the association even inverts, with eventually-successful
+trajectories moving more in the early phase. So the failed trajectories are not "born chaotic":
+chaos at the end and failure appear together.
+
+This makes me think the right framing for my earlier question is reachability of the settled
+region (escape from a long chaotic transient) rather than per-example landscape quality, which
+would be consistent with your view of flossing as a learning-time tool rather than an
+inference-time one. If you know of work that conditions finite-time exponents on trajectory
+fate in this way — in transient-chaos settings or elsewhere — I would be grateful for a
+pointer; we have not found a precedent.
+
+Best,
+Yuren
+
+---
+[Notes, not part of the email: numbers from analysis_2x2/OBSERVATIONS.md addenda 1-2. Send only
+if/after Rainer replies to the June 5 email, or as a gentle bump after ~2 weeks (June 19+).
+The "born chaotic" phrasing mirrors his literature's transient-chaos vocabulary deliberately.]
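
For readers of this diff, here is a minimal sketch of what a finite-time largest Lyapunov exponent (the λ₁ quoted in the draft) measures along one trajectory. It is not the analysis code behind the reported numbers, which is not part of this commit. The names `finite_time_lambda1`, `step`, and `jacobian` are hypothetical, and the linear maps are toy stand-ins for a model's recurrent update. The technique is the standard one: push a tangent vector through the local Jacobian, renormalize it, and average the log stretch factors.

```python
import numpy as np


def finite_time_lambda1(step, jacobian, x0, n_steps, seed=0):
    """Estimate the largest finite-time Lyapunov exponent along one trajectory.

    step(x)     -> next state (hypothetical stand-in for one recurrent update)
    jacobian(x) -> matrix d step / d x evaluated at x
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)

    # Start from a random unit tangent vector.
    v = rng.standard_normal(x.shape)
    v /= np.linalg.norm(v)

    log_growth = 0.0
    for _ in range(n_steps):
        v = jacobian(x) @ v           # linearized one-step evolution of the perturbation
        norm = np.linalg.norm(v)
        log_growth += np.log(norm)    # accumulate the log stretch factor
        v /= norm                     # renormalize to avoid overflow or underflow
        x = step(x)                   # advance the trajectory itself

    # Average log growth per step over the finite window.
    return log_growth / n_steps


# Toy usage (assumed examples, not the models in the draft):
# an expanding direction gives a positive exponent close to ln 2 ...
A = np.diag([2.0, 0.5])
lam_expanding = finite_time_lambda1(lambda x: A @ x, lambda x: A, np.zeros(2), 400)

# ... and a uniformly contracting map gives a negative exponent, ln 0.5.
B = 0.5 * np.eye(2)
lam_contracting = finite_time_lambda1(lambda x: B @ x, lambda x: B, np.ones(2), 400)
```

In this picture, a trajectory that is "locally expansive to the end" keeps a positive estimate over its final window, while a settled one sits near or below zero. A full spectrum, as the draft mentions, would need a set of tangent vectors re-orthogonalized at each step (for example with a QR decomposition) rather than the single vector used here.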
