From a6ec4288a2232988b130b2f00bb2565f81706966 Mon Sep 17 00:00:00 2001
From: YurenHao0426
Date: Mon, 29 Jun 2026 12:15:51 -0500
Subject: Recursive reasoning dynamics: analysis pipeline, paper drafts, toy models

Failure=more-chaotic (task-general under validity labeling) reduces to
convergence/completeness detection; mechanism (transient chaos vs
multistability vs input-induced) under investigation.

Co-Authored-By: Claude Fable 5
---
 maze_package/README.md | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)
 create mode 100644 maze_package/README.md

(limited to 'maze_package/README.md')

diff --git a/maze_package/README.md b/maze_package/README.md
new file mode 100644
index 0000000..45ab740
--- /dev/null
+++ b/maze_package/README.md
@@ -0,0 +1,30 @@
+# Maze-Hard package (E8) — train on dedicated cards, diagnose after
+
+## Contents
+- `launch_maze_trm.sh` — TRM Maze official recipe (att variant, 50k epochs), 1–2 GPU.
+- dataset already at `/home/yurenh2/rrm/data/maze-30x30-hard-1k` (built 2026-06-13;
+  seq_len 900, vocab 6, 1000 puzzles ×8 dihedral augments).
+
+## Run
+```bash
+bash launch_maze_trm.sh 2 384 # 2x A6000
+bash launch_maze_trm.sh 2 192 # 2x A5000 (->128 if OOM)
+```
+Target: ~75% exact accuracy (official figure). Saves a checkpoint every 5000 epochs
+(10 checkpoints) — needed for the evolution analysis.
+
+## After training: diagnostics
+The 2x2 / FTLE pipeline reads any TRM checkpoint dir (all_config.yaml + step_N). Two caveats
+vs Sudoku, to verify on first run:
+1. ATTENTION arch (not mlp_t): confirm diagnose_trm_joint.py's JVP path runs on att blocks
+   (Sudoku used mlp_t). If the L_level call signature differs, patch the f_L/f_H closures.
+2. seq_len 900 vs 97 → per-sample JVP+QR cost ~9-10x Sudoku. Use n=512 for the headline 2x2
+   and n=256 for the horizon sweep; k_lyap=8 unchanged. Budget ~0.5-1 day on one card, or
+   rsync checkpoints back to the lab box and run via the analysis_2x2 queue.
+
+## What Maze closes
+Kills the "Sudoku-only" limitation. Pre-registered prediction (write BEFORE looking, for the
+paper's credibility): if the wandering-not-settling decomposition is architecture/task-general,
+Maze should show B≈0 (failures don't settle) and the same concurrent-not-antecedent horizon
+profile. A DIFFERENT result (e.g. Maze failures do settle) is also publishable — it bounds the
+claim's scope. Either way the decomposition gets a second task.
-- 
cgit v1.2.3
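
The diagnostics section of the README above refers to a per-sample "JVP+QR" cost and to `k_lyap=8`, but the patch does not show how `diagnose_trm_joint.py` computes its exponents. As a rough illustration of the general technique only, the sketch below estimates finite-time Lyapunov exponents by pushing `k_lyap` tangent vectors through a map and re-orthonormalizing them each step, accumulating the logs of the resulting norms. Everything here is an assumption made for illustration: the names `jvp_fd`, `gram_schmidt`, `ftle`, and `toy_step` are hypothetical, a central finite difference stands in for an autodiff JVP, and a 2-D linear map stands in for the recursion step of the model.

```python
import math


def jvp_fd(step, x, v, eps=1e-6):
    """Central-difference stand-in for an autodiff JVP: approximates J_step(x) @ v."""
    x_plus = [xi + eps * vi for xi, vi in zip(x, v)]
    x_minus = [xi - eps * vi for xi, vi in zip(x, v)]
    return [(a - b) / (2 * eps) for a, b in zip(step(x_plus), step(x_minus))]


def gram_schmidt(vectors):
    """Modified Gram-Schmidt: returns orthonormal vectors and the diagonal of R."""
    basis, r_diag = [], []
    for v in vectors:
        w = list(v)
        for u in basis:
            proj = sum(a * b for a, b in zip(u, w))
            w = [a - proj * b for a, b in zip(w, u)]
        norm = math.sqrt(sum(a * a for a in w))
        r_diag.append(norm)
        basis.append([a / norm for a in w])
    return basis, r_diag


def ftle(step, x0, k_lyap, n_steps):
    """Finite-time Lyapunov exponents of `step` along the trajectory from x0."""
    d = len(x0)
    # Start from the first k_lyap standard basis vectors.
    basis = [[1.0 if i == j else 0.0 for i in range(d)] for j in range(k_lyap)]
    log_sums = [0.0] * k_lyap
    x = list(x0)
    for _ in range(n_steps):
        # One JVP per tangent vector, then one re-orthonormalization.
        pushed = [jvp_fd(step, x, v) for v in basis]
        basis, r_diag = gram_schmidt(pushed)
        for i, r in enumerate(r_diag):
            log_sums[i] += math.log(r)
        x = step(x)
    return [s / n_steps for s in log_sums]


def toy_step(x):
    """Hypothetical linear map with eigenvalues 2 and 0.5 (not the TRM update)."""
    return [1.25 * x[0] + 0.75 * x[1], 0.75 * x[0] + 1.25 * x[1]]


# Evaluate at the fixed point x = 0 so the finite differences stay well scaled.
lams = ftle(toy_step, [0.0, 0.0], k_lyap=2, n_steps=200)
print(lams)  # close to +ln 2 and -ln 2, i.e. roughly +0.69 and -0.69
```

Each step costs `k_lyap` JVPs plus one orthonormalization over the full state, so the work grows with the state size. That fits the README's ~9-10x estimate, since 900/97 ≈ 9.3, and it is presumably why the sample counts are cut while `k_lyap` stays at 8.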