Diffstat (limited to 'README.md')
-rw-r--r-- README.md | 150
1 file changed, 150 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..2bbf938
--- /dev/null
+++ b/README.md
@@ -0,0 +1,150 @@
+# RRoG-GNN
+
+This is the working repo for RRoG/TRM-on-GNN experiments. It contains the
+current experiment runner, diagnostic prototypes, paper references, scripts, and
+tracked JSON/log/summary outputs.
+
+Core rule:
+
+```text
+view/graph aggregation happens once; recursive compute is edge-free hidden-state refinement.
+```
+
+The main reported table is:
+
+```text
+Task x Backbone -> classic baseline
+Task x Backbone x fixed-RRoG -> delta against the matching classic row
+```
+
+`classic` is the non-RRoG baseline for every backbone: `T=0`, `n_sup=1`.
+
+## One-command Run On 2x A6000
+
+On a clean machine with two visible GPUs:
+
+```bash
+git clone git@github.com:YurenHao0426/rrog.git
+cd rrog
+./scripts/setup_and_run_two_a6000.sh
+```
+
+Defaults:
+
+- GPU0: `zinc-cycle56` over 17 backbones, `classic + fixed-rrog`
+- GPU1: `ogbg-molhiv` over 17 backbones, `classic + fixed-rrog`
+- Results: `runs/*.json`
+- Logs: `logs/*.log`
+- Summaries: `summaries/*.md`
+
+If the environment already has compatible `torch`, `torch_geometric`, and `ogb`, skip setup:
+
+```bash
+SKIP_SETUP=1 ./scripts/setup_and_run_two_a6000.sh
+```
+
+To override the CUDA wheel index during setup:
+
+```bash
+TORCH_INDEX_URL=https://download.pytorch.org/whl/cu121 ./scripts/setup_env.sh
+```
+
+## Common Commands
+
+Smoke test:
+
+```bash
+./scripts/setup_env.sh
+DEVICE=cuda:0 ./scripts/run_smoke.sh
+```
+
+Run the paired ZINC matrix only:
+
+```bash
+DEVICE=cuda:0 EPOCHS=200 ./scripts/run_zinc_cycle56_full.sh
+```
+
+Run one OGB molecular task:
+
+```bash
+TASK=ogbg-molhiv DEVICE=cuda:1 EPOCHS=100 ./scripts/run_ogb_mol_task_full.sh
+```
+
+Run the same OGB task with the lighter fixed recursion used by the ZINC sweep:
+
+```bash
+TASK=ogbg-molhiv DEVICE=cuda:1 EPOCHS=100 FIXED_T=1 FIXED_NS=3 ./scripts/run_ogb_mol_task_full.sh
+```
+
+Run all selected OGB molecular tasks serially on one GPU:
+
+```bash
+DEVICE=cuda:1 ./scripts/run_ogb_mol_all_tasks.sh
+```
+
+Run the corrected stream-ACT sweep on two GPUs:
+
+```bash
+EPOCHS=100 SEEDS=0 ./scripts/run_ogb_act_two_gpu.sh
+```
+
+Defaults:
+
+- GPU0: `ogbg-molhiv ogbg-molbbbp ogbg-molsider ogbg-molbace`
+- GPU1: `ogbg-molesol ogbg-mollipo ogbg-moltox21 ogbg-molclintox`
+- Every task runs all 17 backbones.
+- ACT config: `T=1`, `n_sup=3`, `halt_max=8`, `halt_min=2`, `halt_target=loss`, `loss_threshold=0.2`, `halt_exploration=0.1`, `lam_q=0.1`, `q_warmup=0`, `act_train_mode=stream`.
+
+Optional ACT variants:
+
+```bash
+# Add FreeSolv as a separate regression stress test.
+TASKS_GPU0="ogbg-molfreesolv" TASKS_GPU1="" ./scripts/run_ogb_act_two_gpu.sh
+
+# Classification-only exact-halt target.
+TASKS_GPU0="ogbg-molhiv ogbg-molbbbp ogbg-molsider" \
+TASKS_GPU1="ogbg-molbace ogbg-moltox21 ogbg-molclintox" \
+HALT_TARGET=exact ./scripts/run_ogb_act_two_gpu.sh
+
+# More robust but longer seed sweep.
+SEEDS="0 1 2" ./scripts/run_ogb_act_two_gpu.sh
+```
+
+Collect summaries:
+
+```bash
+./scripts/collect_results.sh
+```
+
+## Backbones
+
+The implemented 2D view/backbone list is shared across ZINC and OGB:
+
+```text
+gin, gine, gcn, graphsage, gatv2, graphconv, transformer, pna,
+gen, film, resgated, tag, sgc, cheb, arma, mf, appnp
+```
+
+On ZINC there are no bond features, so `gine` uses a learned constant edge token.
+For OGB molecular tasks, GINE and edge-aware backbones use OGB bond encodings.
+
+## Notes
+
+- Runs are resumable at the cell level: scripts skip existing expected JSON files.
+- ZINC cycle-count cache is generated under `data/cycle_cache`.
+- OGB datasets are downloaded under `data/ogb`.
+- Override the data and runs locations with `RROG_DATA_DIR` and `RROG_RUNS_DIR`.
+- Large artifacts are intentionally not versioned in Git: `data/`, checkpoints,
+ `*.pt`, and `*.pth` are ignored. Use external storage or Git LFS if those need
+ to be shared.
+
+## Upload Results
+
+After a remote machine finishes:
+
+```bash
+git pull
+git add -f runs/*.json logs/*.log summaries/*.md
+git commit -m "Add stream ACT OGB results"
+git push
+```