# Flossing Suite

This directory is a reproducible wrapper around the existing flossing code. It separates three questions:

1. Does the Engelken algorithm itself reproduce on a vanilla RNN toy task?
2. Does a faithful pre/interfloss analogue help TRM/HRM when the flossing phase is separate from task loss?
3. Do our one-sided variants (`top1_cf`, `spectrum_cf`, `volume_cf`) behave differently from Engelken's two-sided L2 target?

## Important Algorithmic Distinctions

- `engelken_python_flossing.py` is the toy RNN faithful port. It keeps the paper-style separate flossing phase, no task/floss mixed objective, flosses only input/recurrent/bias parameters, uses differentiable QR, and optimizes `mean((lambda_i - lambda_star)^2)`.
- `step7_interfloss.py` is the HRM/TRM analogue. It also uses separate floss-only episodes. Ordinary training steps use only supervised ACT loss.
- `step7_interfloss.py --floss-mode engelken_l2` is the Rainer-style two-sided target.
- `top1_cf`, `spectrum_cf`, and `volume_cf` are our one-sided contractive variants, not the paper method.
- KL preservation is optional and only applies during floss-only episodes.

## Current Known Sanity Check

Existing toy RNN result:

- Baseline no floss: final eval accuracy about `0.777`.
- Prefloss: final eval accuracy about `0.997`.

This means the Python port can reproduce a positive toy result. Negative HRM/TRM results should therefore be interpreted as model/task transfer issues, not simply "flossing code cannot work."
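
For orientation, the two-sided objective `mean((lambda_i - lambda_star)^2)` named under Important Algorithmic Distinctions can be sketched as below. This is a minimal NumPy illustration, not the code in `engelken_python_flossing.py`: it estimates Lyapunov exponents of a vanilla tanh RNN by repeated QR and only evaluates the loss, so it carries no gradients (the port uses a differentiable QR instead). The names `lyapunov_spectrum`, `two_sided_l2`, and `one_sided_hinge` are hypothetical, and the hinge is only one possible reading of "one-sided contractive"; it is not the definition of `top1_cf`, `spectrum_cf`, or `volume_cf`.

```python
import numpy as np


def lyapunov_spectrum(W, U, b, xs, h0, k, seed=0):
    """Estimate the first k Lyapunov exponents of h_{t+1} = tanh(W h_t + U x_t + b).

    Assumed shapes: W (n, n), U (n, m), b (n,), xs (T, m), h0 (n,).
    """
    n = W.shape[0]
    h = h0.copy()
    # Start from a random orthonormal basis of k tangent directions.
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((n, k)))
    log_r = np.zeros(k)
    for x in xs:
        h = np.tanh(W @ h + U @ x + b)
        # Jacobian of the update with respect to the previous hidden state:
        # diag(1 - h_{t+1}^2) @ W
        J = (1.0 - h**2)[:, None] * W
        # Re-orthonormalize and accumulate the local stretching factors.
        Q, R = np.linalg.qr(J @ Q)
        log_r += np.log(np.abs(np.diag(R)))
    return log_r / len(xs)


def two_sided_l2(lams, lam_star=0.0):
    """Two-sided target: penalize exponents on either side of lam_star."""
    return float(np.mean((lams - lam_star) ** 2))


def one_sided_hinge(lams, lam_star=0.0):
    """ASSUMPTION: illustrative one-sided penalty, active only above lam_star."""
    return float(np.mean(np.maximum(lams - lam_star, 0.0) ** 2))
```

With exponents `[-1.0, 0.5]` and `lam_star = 0`, the two-sided loss penalizes both entries while the hinge ignores the contracting one, which is the kind of behavioral difference question 3 asks about.
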
## Recommended Workflow

Run smoke tests:

```bash
bash research/flossing/flossing_suite/smoke_test.sh 0
```

Launch toy RNN paper-style suite:

```bash
bash research/flossing/flossing_suite/launch_toy_official_suite.sh
```

Launch TRM faithful Rainer-style suite:

```bash
GPU_BASE=0 GPU_PREFLOSS=1 GPU_INTER=3 bash research/flossing/flossing_suite/launch_trm_faithful_suite.sh
```

Launch TRM CF/volume variants:

```bash
GPU_TOP1=0 GPU_VOLUME=1 GPU_KL=3 bash research/flossing/flossing_suite/launch_trm_variant_suite.sh
```

Summarize all available flossing logs:

```bash
/home/yurenh2/miniconda3/envs/rrm/bin/python research/flossing/flossing_suite/summarize_flossing.py
```

Check active jobs:

```bash
bash research/flossing/flossing_suite/status.sh
```

Wait for current TRM faithful jobs and refresh summary:

```bash
bash research/flossing/flossing_suite/watch_and_summarize.sh
```

Outputs go to:

- `results/toy_rnn/`
- `results/trm_faithful/`
- `results/trm_variants/`
- `results/smoke/`
- `results/summary/`

## Existing Historical Results Included By Summarizer

The summarizer also scans:

- `research/flossing/engelken_python/*.json`
- `research/flossing/engelken_paper_faithful/*.json`
- `research/flossing/step6_*.json`
- `research/flossing/step7_*.json`
- `research/flossing/flossing_suite/results/**/*.json`

This keeps old negative/positive evidence visible without rerunning everything.
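
The scan over the patterns listed above might look roughly like the following. This is a sketch under assumptions, not the contents of `summarize_flossing.py`: the helper name `collect_json_logs`, the `root` argument, and the choice to silently skip unreadable files are all hypothetical. The patterns themselves are copied from the list above, and `recursive=True` is what lets `**` match nested result directories.

```python
import glob
import json
import os

# Patterns copied from this README; interpreted relative to the repository root.
PATTERNS = [
    "research/flossing/engelken_python/*.json",
    "research/flossing/engelken_paper_faithful/*.json",
    "research/flossing/step6_*.json",
    "research/flossing/step7_*.json",
    "research/flossing/flossing_suite/results/**/*.json",
]


def collect_json_logs(root="."):
    """Return {path: parsed JSON} for every readable file matching PATTERNS."""
    found = {}
    for pattern in PATTERNS:
        # recursive=True makes '**' match zero or more directory levels.
        for path in glob.glob(os.path.join(root, pattern), recursive=True):
            try:
                with open(path) as fh:
                    found[path] = json.load(fh)
            except (OSError, ValueError):
                # ASSUMPTION: skip unreadable or malformed files instead of failing.
                continue
    return found
```

Calling `collect_json_logs(".")` from the repository root would return one entry per historical or freshly generated log, which a summarizer could then tabulate.
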