| author | Yuren Hao <yurenh2@illinois.edu> | 2026-07-03 05:58:53 -0500 |
|---|---|---|
| committer | Yuren Hao <yurenh2@illinois.edu> | 2026-07-03 05:58:53 -0500 |
| commit | b7fab6a524c4c5cd29aaf9933fb150e7b7902a3f (patch) | |
| tree | 210ded126ee865d1127c0f3727b0529f3e58b102 /ONBOARDING.md | |
| parent | b83947778e2c776f757a07d4719b7ce961d7ed55 (diff) | |
ONBOARDING: add data/checkpoint sharing note (git-ignored large files)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_014FAPDWQ49M5Ye3NpTndTpn
Diffstat (limited to 'ONBOARDING.md')
| -rw-r--r-- | ONBOARDING.md | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/ONBOARDING.md b/ONBOARDING.md
index dfa758b..f8f7dee 100644
--- a/ONBOARDING.md
+++ b/ONBOARDING.md
@@ -80,6 +80,12 @@ Diagnostics: add `--diag_cos 500` (log cos-to-BPTT over training) · `--init_ckp
 operator's 4-D fingerprint) · `--eigreg 0.1 --eig_margin 1.0` (leading-abscissa control, alt to `--jacreg`).
 BP baseline (fair control): `--mode bptt`. **All experiment processes must use `nohup`.**
 
+**Getting the data & checkpoints (git-ignored — not in this repo):**
+- **Data** (`ep_run/data/tinystories_bpe/`, ~712 MB): regenerate from the BPE tokenizer pipeline in `ep_run/` (build
+  the tokenizer + tokenize TinyStories → `train.bin` / `val.bin` / `meta.pkl`), or copy from the shared location.
+- **Checkpoints** (`ep_run/runs/*.pt`, e.g. `redx_traj/s2000.pt` for warm-starting): ask Yuren for a share link —
+  too large for git. `s2000.pt` is the stable warm-start operator (see §5).
+
 ## 8. Deeper docs (organized under `docs/`)
 - **`docs/method/`** — `METHODS.md`, `EP_DERIVATION.md` (the EP/AsymEP gradient derivation), `ARCHITECTURE.md`
   (implementation detail; older energy-formulation, partly superseded by §2 above), `READING.md`.
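Because the files named in this note are git-ignored, a fresh clone will not contain them. A minimal preflight sketch for checking that they are in place before launching a run might look like the following. The file names come from the note. The exact checkpoint path (`ep_run/runs/redx_traj/s2000.pt`) and the helper name `missing_artifacts` are assumptions made for illustration and are not part of the repository.

```python
from pathlib import Path

# Directory and file names are taken from the onboarding note.
DATA_DIR = Path("ep_run/data/tinystories_bpe")
DATA_FILES = ("train.bin", "val.bin", "meta.pkl")

# Assumption: the warm-start checkpoint sits at runs/redx_traj/s2000.pt.
WARM_START_CKPT = Path("ep_run/runs/redx_traj/s2000.pt")


def missing_artifacts(repo_root="."):
    """Return the expected git-ignored files that are absent under repo_root.

    Hypothetical helper: it only checks that each file exists and does not
    validate file contents or sizes.
    """
    root = Path(repo_root)
    expected = [DATA_DIR / name for name in DATA_FILES] + [WARM_START_CKPT]
    return [p.as_posix() for p in expected if not (root / p).is_file()]


if __name__ == "__main__":
    missing = missing_artifacts()
    if missing:
        print("Missing git-ignored artifacts:")
        for path in missing:
            print("  " + path)
    else:
        print("All data and checkpoint files are present.")
```

Run from the repository root, the script lists whichever files still need to be regenerated or copied in.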
