| author | YurenHao0426 <blackhao0426@gmail.com> | 2026-06-13 12:35:36 -0500 |
|---|---|---|
| committer | YurenHao0426 <blackhao0426@gmail.com> | 2026-06-13 12:35:36 -0500 |
| commit | 66e0d8b9fd4d0f7a2231d689c055e26fdf1cf04a (patch) | |
| tree | c29cba61124018755a19b02c9d33e3ad5f2e05cc /research/flossing/directional_lyap_perturb_eps_sweep/watch_queue_and_plot.sh | |
Curated export for clone-and-run Maze training (2x A6000) + diagnostics.
trm/hrm pretrain.py carry trajectory-augmentation code (backward-compatible).
Heavy artifacts (checkpoints/wandb/npz) gitignored; see PROVENANCE.md.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Diffstat (limited to 'research/flossing/directional_lyap_perturb_eps_sweep/watch_queue_and_plot.sh')
| -rwxr-xr-x | research/flossing/directional_lyap_perturb_eps_sweep/watch_queue_and_plot.sh | 55 |
1 files changed, 55 insertions, 0 deletions
diff --git a/research/flossing/directional_lyap_perturb_eps_sweep/watch_queue_and_plot.sh b/research/flossing/directional_lyap_perturb_eps_sweep/watch_queue_and_plot.sh
new file mode 100755
index 0000000..4a44936
--- /dev/null
+++ b/research/flossing/directional_lyap_perturb_eps_sweep/watch_queue_and_plot.sh
@@ -0,0 +1,55 @@
+#!/usr/bin/env bash
+set -euo pipefail
+
+ROOT=/home/yurenh2/rrm
+PY=/home/yurenh2/miniconda3/envs/rrm/bin/python
+cd "${ROOT}"
+
+DIR=research/flossing/directional_lyap_perturb_eps_sweep
+BASE_CKPT="${ROOT}/trm/checkpoints/Sudoku-extreme-1k-aug-1000-ACT-torch/pretrain_mlp_t_sudoku_official_gbs768_repro"
+MULTI_CKPT="${ROOT}/trm/checkpoints/Sudoku-extreme-1k-aug-1000-ACT-torch/pretrain_mlp_t_sudoku_official_gbs768_multi4_loguniform_repro"
+SIGMAS='0,0.001,0.003,0.01,0.03,0.1'
+AFTERS='0,4,8,12'
+N=1000
+CAND=8
+BS=16
+SEED=20260608
+
+wait_pid_file() {
+  local pf="$1"
+  local pid
+  pid=$(cat "${pf}")
+  echo "watch ${pf}: ${pid}"
+  while kill -0 "${pid}" 2>/dev/null; do
+    sleep 60
+  done
+  echo "done ${pf}: ${pid}"
+}
+
+launch_multi_eps003_on_gpu0() {
+  local out=trm_multi4_best_step35805_n1000_c8_fdeps003
+  local log="${DIR}/logs/${out}.log"
+  local pidf="${DIR}/logs/${out}.pid"
+  setsid bash -c "cd '${ROOT}' && export CUDA_VISIBLE_DEVICES='0' PYTHONUNBUFFERED=1 && exec '${PY}' research/flossing/directional_lyap_perturb_robustness.py --ckpt-root '${MULTI_CKPT}' --ckpt-name step_35805 --label trm_multi4_best --n-samples '${N}' --batch-size '${BS}' --candidates '${CAND}' --fd-eps 0.03 --sigmas '${SIGMAS}' --perturb-afters '${AFTERS}' --seed '${SEED}' --out-prefix '${DIR}/${out}'" \
+    > "${log}" 2>&1 < /dev/null &
+  echo $! > "${pidf}"
+  echo "launched ${out}: $(cat "${pidf}")"
+}
+
+wait_pid_file "${DIR}/logs/trm_baseline_best_step58590_n1000_c8_fdeps001.pid"
+launch_multi_eps003_on_gpu0
+wait_pid_file "${DIR}/logs/trm_multi4_best_step35805_n1000_c8_fdeps001.pid"
+wait_pid_file "${DIR}/logs/trm_baseline_best_step58590_n1000_c8_fdeps003.pid"
+wait_pid_file "${DIR}/logs/trm_multi4_best_step35805_n1000_c8_fdeps003.pid"
+
+"${PY}" research/flossing/plot_directional_lyap_perturb.py \
+  --summaries \
+  "${DIR}/trm_baseline_best_step58590_n1000_c8_fdeps001.summary.csv" \
+  "${DIR}/trm_multi4_best_step35805_n1000_c8_fdeps001.summary.csv" \
+  "${DIR}/trm_baseline_best_step58590_n1000_c8_fdeps003.summary.csv" \
+  "${DIR}/trm_multi4_best_step35805_n1000_c8_fdeps003.summary.csv" \
+  --out-dir "${DIR}/plots" \
+  --slice-sigma 0.03
+
+nvidia-smi --query-gpu=index,memory.used,memory.total,utilization.gpu --format=csv,noheader,nounits \
+  > "${DIR}/plots/final_gpu_status.txt"
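The `wait_pid_file` helper in this diff reads a PID from a file and polls `kill -0` until the process is gone. Under `set -e` it aborts the whole queue if a PID file has not been written yet. The following is a hedged sketch of a more defensive variant, not part of the commit: the optional second argument (poll interval in seconds) and the explicit checks for a missing or non-numeric PID file are assumptions added for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical variant of wait_pid_file (not the committed version).
# Usage: wait_pid_file <pid-file> [poll-interval-seconds]
wait_pid_file() {
  local pf="$1"
  local interval="${2:-60}"   # assumed default, matching the 60 s poll in the diff
  local pid

  # Fail with a clear message instead of letting `cat` abort the script.
  if [[ ! -s "${pf}" ]]; then
    echo "missing or empty pid file: ${pf}" >&2
    return 1
  fi

  pid=$(<"${pf}")
  if [[ ! "${pid}" =~ ^[0-9]+$ ]]; then
    echo "pid file does not contain a pid: ${pf}" >&2
    return 1
  fi

  echo "watch ${pf}: ${pid}"
  # kill -0 sends no signal; it only tests whether the pid can be signalled.
  while kill -0 "${pid}" 2>/dev/null; do
    sleep "${interval}"
  done
  echo "done ${pf}: ${pid}"
}
```

Like the original, this only checks that some process with that PID exists, so a recycled PID or a process owned by another user would not be detected correctly.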
