Diffstat (limited to 'research/flossing/flossing_suite/README.md')
| -rw-r--r-- | research/flossing/flossing_suite/README.md | 89 |
1 files changed, 89 insertions, 0 deletions
diff --git a/research/flossing/flossing_suite/README.md b/research/flossing/flossing_suite/README.md
new file mode 100644
index 0000000..6081c97
--- /dev/null
+++ b/research/flossing/flossing_suite/README.md
@@ -0,0 +1,89 @@
+# Flossing Suite
+
+This directory is a reproducible wrapper around the existing flossing code.
+It separates three questions:
+
+1. Does the Engelken algorithm itself reproduce on a vanilla RNN toy task?
+2. Does a faithful pre/interfloss analogue help TRM/HRM when the flossing phase is separate from task loss?
+3. Do our one-sided variants (`top1_cf`, `spectrum_cf`, `volume_cf`) behave differently from Engelken's two-sided L2 target?
+
+## Important Algorithmic Distinctions
+
+- `engelken_python_flossing.py` is the toy RNN faithful port. It keeps the paper-style separate flossing phase, no task/floss mixed objective, flosses only input/recurrent/bias parameters, uses differentiable QR, and optimizes `mean((lambda_i - lambda_star)^2)`.
+- `step7_interfloss.py` is the HRM/TRM analogue. It also uses separate floss-only episodes. Ordinary training steps use only supervised ACT loss.
+- `step7_interfloss.py --floss-mode engelken_l2` is the Rainer-style two-sided target.
+- `top1_cf`, `spectrum_cf`, and `volume_cf` are our one-sided contractive variants, not the paper method.
+- KL preservation is optional and only applies during floss-only episodes.
+
+## Current Known Sanity Check
+
+Existing toy RNN result:
+
+- Baseline no floss: final eval accuracy about `0.777`.
+- Prefloss: final eval accuracy about `0.997`.
+
+This means the Python port can reproduce a positive toy result. Negative HRM/TRM results should therefore be interpreted as model/task transfer issues, not simply "flossing code cannot work."
+
+## Recommended Workflow
+
+Run smoke tests:
+
+```bash
+bash research/flossing/flossing_suite/smoke_test.sh 0
+```
+
+Launch toy RNN paper-style suite:
+
+```bash
+bash research/flossing/flossing_suite/launch_toy_official_suite.sh
+```
+
+Launch TRM faithful Rainer-style suite:
+
+```bash
+GPU_BASE=0 GPU_PREFLOSS=1 GPU_INTER=3 bash research/flossing/flossing_suite/launch_trm_faithful_suite.sh
+```
+
+Launch TRM CF/volume variants:
+
+```bash
+GPU_TOP1=0 GPU_VOLUME=1 GPU_KL=3 bash research/flossing/flossing_suite/launch_trm_variant_suite.sh
+```
+
+Summarize all available flossing logs:
+
+```bash
+/home/yurenh2/miniconda3/envs/rrm/bin/python research/flossing/flossing_suite/summarize_flossing.py
+```
+
+Check active jobs:
+
+```bash
+bash research/flossing/flossing_suite/status.sh
+```
+
+Wait for current TRM faithful jobs and refresh summary:
+
+```bash
+bash research/flossing/flossing_suite/watch_and_summarize.sh
+```
+
+Outputs go to:
+
+- `results/toy_rnn/`
+- `results/trm_faithful/`
+- `results/trm_variants/`
+- `results/smoke/`
+- `results/summary/`
+
+## Existing Historical Results Included By Summarizer
+
+The summarizer also scans:
+
+- `research/flossing/engelken_python/*.json`
+- `research/flossing/engelken_paper_faithful/*.json`
+- `research/flossing/step6_*.json`
+- `research/flossing/step7_*.json`
+- `research/flossing/flossing_suite/results/**/*.json`
+
+This keeps old negative/positive evidence visible without rerunning everything.
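
The README contrasts the two-sided target `mean((lambda_i - lambda_star)^2)` with the one-sided contractive variants but does not define the latter. The sketch below illustrates what "two-sided" versus "one-sided" could mean for a given list of Lyapunov exponents. Only `engelken_l2` follows the formula quoted in the README; the three `one_sided_*` functions are hypothetical hinge-style stand-ins and are not claimed to match the repository's `top1_cf`, `spectrum_cf`, or `volume_cf`.

```python
from typing import Sequence


def engelken_l2(lyap: Sequence[float], target: float = 0.0) -> float:
    """Two-sided target quoted in the README: mean((lambda_i - lambda_star)^2)."""
    return sum((lam - target) ** 2 for lam in lyap) / len(lyap)


def one_sided_top1(lyap: Sequence[float], target: float = 0.0) -> float:
    """HYPOTHETICAL: penalize only the largest exponent, and only above target."""
    return max(max(lyap) - target, 0.0) ** 2


def one_sided_spectrum(lyap: Sequence[float], target: float = 0.0) -> float:
    """HYPOTHETICAL: penalize every exponent that exceeds target, ignore the rest."""
    return sum(max(lam - target, 0.0) ** 2 for lam in lyap) / len(lyap)


def one_sided_volume(lyap: Sequence[float], target: float = 0.0) -> float:
    """HYPOTHETICAL: penalize the exponent sum (volume growth rate) above target."""
    return max(sum(lyap) - target, 0.0) ** 2


if __name__ == "__main__":
    contracting = [-0.5, -1.0]  # every exponent already below the target of 0
    print(engelken_l2(contracting))         # nonzero: pulls exponents up toward 0
    print(one_sided_spectrum(contracting))  # zero: nothing exceeds the target
```

Under these assumed definitions, a fully contracting spectrum incurs no one-sided penalty, while the two-sided loss still pushes negative exponents toward the target. That asymmetry is the behavioral difference question 3 of the README asks about.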
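
The scan patterns listed in the README's last section can be collected with the standard library alone. The following is a minimal sketch, not the actual `summarize_flossing.py`: the function name `collect_logs`, the `root` parameter, and the choice to skip unreadable files are all assumptions made for illustration.

```python
import glob
import json
import os
from typing import Dict

# Patterns copied from the README's "Existing Historical Results" list.
PATTERNS = [
    "research/flossing/engelken_python/*.json",
    "research/flossing/engelken_paper_faithful/*.json",
    "research/flossing/step6_*.json",
    "research/flossing/step7_*.json",
    "research/flossing/flossing_suite/results/**/*.json",
]


def collect_logs(root: str = ".") -> Dict[str, object]:
    """Map each matching JSON path (relative to root) to its parsed content."""
    found: Dict[str, object] = {}
    for pattern in PATTERNS:
        # recursive=True lets "**" match zero or more nested directories.
        for path in sorted(glob.glob(os.path.join(root, pattern), recursive=True)):
            try:
                with open(path, "r", encoding="utf-8") as fh:
                    found[os.path.relpath(path, root)] = json.load(fh)
            except (OSError, json.JSONDecodeError):
                continue  # assumption: skip unreadable or partially written logs
    return found
```

Skipping malformed files rather than raising is a design choice for this sketch, on the assumption that a summary may be refreshed while jobs are still writing logs.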
