author: YurenHao0426 <blackhao0426@gmail.com> 2026-06-13 12:35:36 -0500
committer: YurenHao0426 <blackhao0426@gmail.com> 2026-06-13 12:35:36 -0500
commit: 66e0d8b9fd4d0f7a2231d689c055e26fdf1cf04a
tree: c29cba61124018755a19b02c9d33e3ad5f2e05cc /research/flossing/flossing_suite/README.md
rrm workspace: TRM/HRM/SRM code, Maze dataset, dynamical-analysis pipeline (HEAD, main)
Curated export for clone-and-run Maze training (2x A6000) + diagnostics. trm/hrm pretrain.py carry trajectory-augmentation code (backward-compatible). Heavy artifacts (checkpoints/wandb/npz) gitignored; see PROVENANCE.md. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Diffstat (limited to 'research/flossing/flossing_suite/README.md')
-rw-r--r-- research/flossing/flossing_suite/README.md | 89
1 files changed, 89 insertions, 0 deletions
diff --git a/research/flossing/flossing_suite/README.md b/research/flossing/flossing_suite/README.md
new file mode 100644
index 0000000..6081c97
--- /dev/null
+++ b/research/flossing/flossing_suite/README.md
@@ -0,0 +1,89 @@
+# Flossing Suite
+
+This directory is a reproducible wrapper around the existing flossing code.
+It separates three questions:
+
+1. Does the Engelken algorithm itself reproduce on a vanilla RNN toy task?
+2. Does a faithful prefloss/interfloss analogue help TRM/HRM when the flossing phase is kept separate from the task loss?
+3. Do our one-sided variants (`top1_cf`, `spectrum_cf`, `volume_cf`) behave differently from Engelken's two-sided L2 target?
+
+## Important Algorithmic Distinctions
+
+- `engelken_python_flossing.py` is the faithful port for the toy RNN. It keeps the paper-style separate flossing phase (no mixed task/floss objective), flosses only the input, recurrent, and bias parameters, uses a differentiable QR decomposition, and minimizes `mean((lambda_i - lambda_star)^2)`.
+- `step7_interfloss.py` is the HRM/TRM analogue. It also uses separate floss-only episodes. Ordinary training steps use only supervised ACT loss.
+- `step7_interfloss.py --floss-mode engelken_l2` selects the Engelken-style two-sided L2 target.
+- `top1_cf`, `spectrum_cf`, and `volume_cf` are our one-sided contractive variants, not the paper method.
+- KL preservation is optional and only applies during floss-only episodes.
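+
+As a rough sketch of the two-sided versus one-sided distinction, write `lambda_i` as $\lambda_i$ (the $i$-th of $k$ flossed Lyapunov exponents) and `lambda_star` as $\lambda^\star$. The first expression restates the objective quoted above. The second is only a generic one-sided (hinge) form given for contrast; it is an assumption for illustration and not the definition of `top1_cf`, `spectrum_cf`, or `volume_cf`, whose exact forms live in the code.
+
+```math
+\mathcal{L}_{\text{two-sided}} = \frac{1}{k}\sum_{i=1}^{k}\left(\lambda_i - \lambda^\star\right)^2,
+\qquad
+\mathcal{L}_{\text{one-sided}} = \frac{1}{k}\sum_{i=1}^{k}\max\left(0,\ \lambda_i - \lambda^\star\right)^2
+```
+
+The two-sided form penalizes exponents on either side of the target, while a one-sided form of this kind is zero whenever every $\lambda_i \le \lambda^\star$ and so only pushes exponents downward.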
+
+## Current Known Sanity Check
+
+Existing toy RNN result:
+
+- Baseline no floss: final eval accuracy about `0.777`.
+- Prefloss: final eval accuracy about `0.997`.
+
+This shows that the Python port can reproduce a positive toy result. Negative HRM/TRM results should therefore be read as model/task transfer issues, not as evidence that the flossing code cannot work.
+
+## Recommended Workflow
+
+Run smoke tests:
+
+```bash
+bash research/flossing/flossing_suite/smoke_test.sh 0
+```
+
+Launch toy RNN paper-style suite:
+
+```bash
+bash research/flossing/flossing_suite/launch_toy_official_suite.sh
+```
+
+Launch the TRM faithful (Engelken-style) suite:
+
+```bash
+GPU_BASE=0 GPU_PREFLOSS=1 GPU_INTER=3 bash research/flossing/flossing_suite/launch_trm_faithful_suite.sh
+```
+
+Launch TRM CF/volume variants:
+
+```bash
+GPU_TOP1=0 GPU_VOLUME=1 GPU_KL=3 bash research/flossing/flossing_suite/launch_trm_variant_suite.sh
+```
+
+Summarize all available flossing logs (the interpreter path below is machine-specific; substitute the Python from your own environment):
+
+```bash
+/home/yurenh2/miniconda3/envs/rrm/bin/python research/flossing/flossing_suite/summarize_flossing.py
+```
+
+Check active jobs:
+
+```bash
+bash research/flossing/flossing_suite/status.sh
+```
+
+Wait for the running TRM faithful jobs to finish, then refresh the summary:
+
+```bash
+bash research/flossing/flossing_suite/watch_and_summarize.sh
+```
+
+Outputs go to:
+
+- `results/toy_rnn/`
+- `results/trm_faithful/`
+- `results/trm_variants/`
+- `results/smoke/`
+- `results/summary/`
+
+## Existing Historical Results Included By Summarizer
+
+The summarizer also scans:
+
+- `research/flossing/engelken_python/*.json`
+- `research/flossing/engelken_paper_faithful/*.json`
+- `research/flossing/step6_*.json`
+- `research/flossing/step7_*.json`
+- `research/flossing/flossing_suite/results/**/*.json`
+
+This keeps old negative/positive evidence visible without rerunning everything.
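
The scan list in the README above could be reproduced with a few lines of standard-library Python. This is a minimal sketch, not the code in `summarize_flossing.py` (which is not shown in this diff); the function name `collect_result_files` and the choice to resolve patterns relative to the repository root are assumptions made for illustration.

```python
import glob
import os

# Glob patterns copied from the README's list of scanned locations.
HISTORICAL_PATTERNS = [
    "research/flossing/engelken_python/*.json",
    "research/flossing/engelken_paper_faithful/*.json",
    "research/flossing/step6_*.json",
    "research/flossing/step7_*.json",
    "research/flossing/flossing_suite/results/**/*.json",
]


def collect_result_files(root="."):
    """Return a sorted, de-duplicated list of JSON files matching the patterns.

    `root` is assumed to be the repository root (hypothetical parameter).
    """
    found = set()
    for pattern in HISTORICAL_PATTERNS:
        # recursive=True lets "**" match zero or more nested directories.
        found.update(glob.glob(os.path.join(root, pattern), recursive=True))
    return sorted(found)


if __name__ == "__main__":
    for path in collect_result_files():
        print(path)
```

Collecting into a set before sorting keeps the output stable and avoids listing a file twice if two patterns ever overlap.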