# rrm — Recursive Reasoning Models workspace
Reproduction + dynamical analysis of recursive reasoning models (HRM, TRM) and the SRM design.
This repo is the transport/working copy; heavy artifacts (checkpoints, wandb, npz) are gitignored.
See `PROVENANCE.md` for upstream pins and our modifications.
## Quickstart on a fresh machine (e.g. the 2x A6000 box) — clone and run Maze
```bash
git clone git@github.com:YurenHao0426/rrm.git && cd rrm
# 1. Environment (once). Creates conda env 'rrm': torch 2.7+cu126, flash-attn 2 (Ampere), adam-atan2.
bash env/setup.sh
# 2. Optional smoke test (a few minutes) — confirms attention model + flash-attn + data load, no OOM.
SMOKE=1 bash run_maze_a6000.sh
# 3. Full Maze-Hard training (TRM official att recipe, 50k epochs).
bash run_maze_a6000.sh 2 384 # 2x A6000 (48G); ~18-28h, target ~75% exact acc
# bash run_maze_a6000.sh 2 192 # if 2x A5000 (24G); ->128 if OOM
# bash run_maze_a6000.sh 1 192 # single card
```
Checkpoints land in `trm/checkpoints/maze-30x30-hard-1k.../<run_name>/`, one per 5000 epochs
(10 total — keep all; the evolution analysis needs them).
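
To confirm that no checkpoint was lost before the evolution analysis, the schedule above (50k epochs, one checkpoint per 5000) can be checked with a small helper. This is a sketch only: `expected_checkpoint_epochs` and `missing_checkpoints` are hypothetical names, and how an epoch number maps onto a checkpoint file name is left to the caller because the naming scheme is not stated here.

```python
def expected_checkpoint_epochs(total_epochs=50_000, every=5_000):
    """Epochs at which a checkpoint is expected (assumes none at epoch 0)."""
    return list(range(every, total_epochs + 1, every))


def missing_checkpoints(found_epochs, total_epochs=50_000, every=5_000):
    """Return the expected epochs that are absent from found_epochs."""
    found = set(found_epochs)
    return [e for e in expected_checkpoint_epochs(total_epochs, every)
            if e not in found]


if __name__ == "__main__":
    # Example: a run directory where only the first two checkpoints survived.
    print(len(expected_checkpoint_epochs()))
    print(missing_checkpoints([5_000, 10_000]))
```

An empty list from `missing_checkpoints` means all 10 checkpoints are accounted for.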
## After training: diagnostics
Preferred: `rsync` the run's checkpoint directory back to the lab box and run the `research/flossing`
analysis queue there. To diagnose on this machine instead, see
`research/flossing/maze_package/TRANSFER_README.md`. Two caveats relative to Sudoku: the Maze model
uses the **attention** arch (verify the JVP closures in `diagnose_trm_joint.py`), and seq_len 900
makes the per-sample FTLE estimate ~10x slower (use n=512 for the 2x2 analysis, n=256 for the
horizon sweep).
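
A finite-time Lyapunov exponent (FTLE) estimate of the kind mentioned above is commonly computed by pushing a set of tangent vectors through the step map with Jacobian-vector products (JVPs) and re-orthonormalizing them with QR at every step. The sketch below shows that general technique on a toy map; it is not the code in `diagnose_trm_joint.py`. The names `jvp_fd`, `gram_schmidt`, and `ftle_spectrum` are hypothetical, the JVP is a finite-difference stand-in for an autodiff JVP, and the linear `toy_step` stands in for the model's recurrent update.

```python
import math


def jvp_fd(f, x, v, eps=1e-6):
    """Central finite-difference approximation of J_f(x) @ v.

    Assumption: a real pipeline would use an autodiff JVP instead.
    """
    xp = [xi + eps * vi for xi, vi in zip(x, v)]
    xm = [xi - eps * vi for xi, vi in zip(x, v)]
    return [(a - b) / (2 * eps) for a, b in zip(f(xp), f(xm))]


def gram_schmidt(vectors):
    """Thin QR by modified Gram-Schmidt.

    Returns the orthonormal vectors (columns of Q) and the diagonal of R.
    """
    qs, r_diag = [], []
    for v in vectors:
        w = list(v)
        for q in qs:
            proj = sum(a * b for a, b in zip(q, w))
            w = [a - proj * b for a, b in zip(w, q)]
        norm = math.sqrt(sum(a * a for a in w))
        r_diag.append(norm)
        qs.append([a / norm for a in w])
    return qs, r_diag


def ftle_spectrum(step, x0, n_steps, k):
    """Estimate the top-k finite-time Lyapunov exponents along one trajectory."""
    dim = len(x0)
    # Start from the first k standard basis vectors as the tangent frame.
    frame = [[1.0 if i == j else 0.0 for i in range(dim)] for j in range(k)]
    log_sums = [0.0] * k
    x = list(x0)
    for _ in range(n_steps):
        pushed = [jvp_fd(step, x, v) for v in frame]   # J(x) applied to each vector
        frame, r_diag = gram_schmidt(pushed)           # re-orthonormalize
        log_sums = [s + math.log(r) for s, r in zip(log_sums, r_diag)]
        x = step(x)
    return [s / n_steps for s in log_sums]


def toy_step(x):
    """Toy linear map with Jacobian [[2, 1], [0, 0.5]] (exponents ln 2 and -ln 2)."""
    return [2.0 * x[0] + x[1], 0.5 * x[1]]


if __name__ == "__main__":
    print(ftle_spectrum(toy_step, [0.0, 0.0], n_steps=20, k=2))
```

Accumulating the log of the diagonal of R, rather than multiplying Jacobians directly, keeps the estimate numerically stable over long horizons. Each step costs one JVP per tangent vector, so the estimate slows down as the model's forward pass gets more expensive.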
## Layout
```
trm/ hrm/ srm/            upstream clones + our pretrain.py trajectory-augmentation code
data/maze-30x30-hard-1k/  built Maze dataset (clone-and-run)
env/                      setup.sh, requirements.txt, pip-freeze.txt
scripts/                  dataset build + run helpers
research/flossing/        dynamical-analysis pipeline:
  diagnose_*.py           per-example FTLE + drift estimators (JVP+QR)
  analysis_2x2/           the wandering-not-settling 2x2 analysis + phase-1 results
  paper/                  draft (claims table, outline, intro, results), style contract
  maze_package/           portable Maze launcher + transfer notes
```
## What's NOT here
The Sudoku dataset (rebuild: `bash scripts/build_datasets.sh sudoku`), all checkpoints, wandb runs,
and npz/png diagnostic blobs. The committed `*.json` files under `research/flossing/` are
training-history logs.