| author | YurenHao0426 <blackhao0426@gmail.com> | 2026-05-04 23:05:16 -0500 |
|---|---|---|
| committer | YurenHao0426 <blackhao0426@gmail.com> | 2026-05-04 23:05:16 -0500 |
| commit | bd9333eda60a9029a198acaeacb1eca4312bd1e8 (patch) | |
| tree | 7544c347b7ac4e8629fa1cc0fcf341d48cb69e2e /README.md | |
Initial release: GRAFT (KAFT) — NeurIPS 2026 submission code
Topology-factorized Jacobian-aligned feedback for deep GNNs. Includes:
- src/: GraphGrAPETrainer (KAFT) + BP / DFA / DFA-GNN / VanillaGrAPE baselines
+ multi-probe alignment estimator + dataset / sparse-mm utilities.
- experiments/: 19 runners reproducing every figure / table in the paper.
- figures/: 4 generators + the 4 PDFs cited in the report.
- paper/: NeurIPS .tex and consolidated experiments_master notes.
Smoke test: 50-epoch Cora GCN L=4 gives BP 77.3% / KAFT 79.0%.
Diffstat (limited to 'README.md')
| -rw-r--r-- | README.md | 121 |
1 file changed, 121 insertions, 0 deletions
```
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..db3413b
--- /dev/null
+++ b/README.md
@@ -0,0 +1,121 @@
```

# GRAFT (KAFT): Topology-Factorized Jacobian-Aligned Feedback for Deep GNNs

Code release accompanying the NeurIPS 2026 submission.

## Overview

We replace the BP backward pass in deep message-passing GNNs with a backward-only
rule whose feedback operator factors into a fixed graph polynomial
`P_l(Â) = Â^min(L-1-l, K)` and a learned feature-side matrix `R_l ∈ R^{C×d}`
fitted via multi-probe Jacobian alignment. The forward pass is unchanged.

```
δ_l = σ'(Z_l) ⊙ [ P_l(Â) · Ē · R_l ]
```

where `Ē` is the output error, optionally spread over the graph before being fed
back. Hidden-layer feedback is computed in O(1) parallel depth on GPUs: each
`δ_l` depends only on `Ē`, never on another layer's error.
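For concreteness, here is a minimal PyTorch sketch of the backward rule and the
optional error spreading. It illustrates the equation above under stated
assumptions (COO sparse `Â`, ReLU activations, a damped-diffusion spreading
operator); names are illustrative, not the repo's `GraphGrAPETrainer` API:

```python
import torch

def spread_error(E, A_hat, alpha=0.5, iters=10):
    # Optional graph spreading of the output error E -> E_bar, matching the
    # `diffusion_alpha` / `diffusion_iters` defaults listed under
    # Hyperparameters below. The exact spreading operator used in src/ is an
    # assumption here, not confirmed by this README.
    E_bar = E
    for _ in range(iters):
        E_bar = (1 - alpha) * E + alpha * torch.sparse.mm(A_hat, E_bar)
    return E_bar

def kaft_feedback(Z_l, A_hat, E_bar, R_l, l, L, K=3):
    # delta_l = sigma'(Z_l) (elementwise) [ P_l(A_hat) @ E_bar @ R_l ]
    # with P_l(A_hat) = A_hat^min(L-1-l, K) applied as repeated sparse matmuls.
    # Assumed shapes: Z_l [N, d], A_hat sparse [N, N], E_bar [N, C], R_l [C, d].
    M = E_bar @ R_l                        # feature-side map: [N, d]
    for _ in range(min(L - 1 - l, K)):     # fixed topology factor
        M = torch.sparse.mm(A_hat, M)
    return (Z_l > 0).to(Z_l.dtype) * M     # ReLU derivative gate
```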
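The feature-side matrix `R_l` is fitted from probes. The repo's
`align_mode='chain_norm'` estimator is not spelled out in this README, so the
block below uses a generic least-squares fit purely to make "multi-probe
Jacobian alignment" concrete; `C` and `d` are placeholders:

```python
import torch

def fit_R(probe_out, probe_resp):
    # Least-squares fit of the feature-side matrix R_l:
    #   min_R || probe_out @ R - probe_resp ||_F^2
    # probe_out:  [p, C] random output-space probes
    # probe_resp: [p, d] the matching true backward responses at layer l
    #             (e.g. one autograd vector-Jacobian product per probe;
    #             how the repo collects these is an assumption).
    return torch.linalg.lstsq(probe_out, probe_resp).solution  # [C, d]

# usage with the paper default num_probes=64
p, C, d = 64, 7, 64
V = torch.randn(p, C)            # probes
resp = V @ torch.randn(C, d)     # stand-in for measured backward responses
R_l = fit_R(V, resp)
```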
## Layout

```
src/          core method
  trainers.py   BPTrainer, GraphGrAPETrainer (KAFT), DFA / DFA-GNN, alignment
  data.py       PyG dataset loaders, normalized Â / row-Â, sparse-mm helpers
experiments/  one runner per reported result block (see "Reproducing the paper")
figures/      figure generators + the four rendered PDFs used in the paper
paper/        neurips_v4_main.tex + experiments_master.tex (cross-reference)
```

## Reproducing the paper

End-to-end runtime for every figure / table is approximately 12 GPU-hours on
a single NVIDIA A6000 (48 GB).

```bash
# §2.3 / Fig 1: BP backward bottleneck diagnostic
CUDA_VISIBLE_DEVICES=0 python -u experiments/run_diag_section23_v2.py
python figures/gen_fig1_diagnostic.py

# Tables 1 & 2: backward-rule leaderboard + main accuracy sweep
CUDA_VISIBLE_DEVICES=0 python -u experiments/run_combo_20seeds.py
CUDA_VISIBLE_DEVICES=0 python -u experiments/run_hero_extras.py
CUDA_VISIBLE_DEVICES=0 python -u experiments/run_pepita_baseline.py
CUDA_VISIBLE_DEVICES=0 python -u experiments/run_ff_baseline.py
CUDA_VISIBLE_DEVICES=0 python -u experiments/run_cafo_baseline.py
CUDA_VISIBLE_DEVICES=0 python -u experiments/run_ablation_20seeds.py

# Fig 2: Planetoid depth sweep (11 / 13 points)
CUDA_VISIBLE_DEVICES=0 python -u experiments/run_shallow_depth.py
CUDA_VISIBLE_DEVICES=0 python -u experiments/run_bp_graft_depth.py
CUDA_VISIBLE_DEVICES=0 python -u experiments/run_dfagnn_depth.py
CUDA_VISIBLE_DEVICES=0 python -u experiments/run_depth_extras.py
CUDA_VISIBLE_DEVICES=0 python -u experiments/run_dblp_depth_scaling.py
python figures/gen_depth_sweep_fig.py

# Fig 3 / real-world hero table: 4 large graphs at L=20
CUDA_VISIBLE_DEVICES=0 python -u experiments/run_realworld_hero_L20.py 0 20
python figures/gen_realworld_depth_fig.py

# Fig 4 (depth + perturbation panels)
CUDA_VISIBLE_DEVICES=0 python -u experiments/run_cora_perturb.py
python figures/gen_fig4_combined.py

# WikiCS regime-boundary check (negative result)
CUDA_VISIBLE_DEVICES=0 python -u experiments/run_wikics_paper_setup.py

# Wall-clock + alignment-quality diagnostics
CUDA_VISIBLE_DEVICES=0 python -u experiments/run_grad_reach_20seeds.py
```

Run all scripts from the repo root so that `from src.trainers import ...` resolves.

## Hyperparameters

Defaults match the paper:
- Adam, lr 0.01, weight decay 5e-4
- 200 epochs (the Fig 1 diagnostic uses 100)
- hidden dim 64
- ReLU; no LR schedule, dropout, batch norm, or residual connections unless
  noted as a stackability variant
- KAFT: `num_probes=64`, `align_mode='chain_norm'`, `lr_feedback=0.5`,
  `max_topo_power=K=3`, `diffusion_alpha=0.5`, `diffusion_iters=10`
- seeds 0..19

## Datasets

Auto-downloaded by `torch_geometric` on first use:
- Planetoid: Cora, CiteSeer, PubMed
- CitationFull: Cora, Cora_ML, CiteSeer, DBLP, PubMed
- Coauthor: CS, Physics
- WikiCS
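As an example, loading Cora and building a symmetrically normalized `Â` by
hand (a sketch only; `src/data.py` provides equivalent helpers whose exact
names are not shown in this README):

```python
import torch
from torch_geometric.datasets import Planetoid
from torch_geometric.utils import add_self_loops, degree

# Downloads to data/Planetoid on first use (the root path is a placeholder).
dataset = Planetoid(root='data/Planetoid', name='Cora')
data = dataset[0]

# Symmetric normalization A_hat = D^{-1/2} (A + I) D^{-1/2}.
edge_index, _ = add_self_loops(data.edge_index, num_nodes=data.num_nodes)
row, col = edge_index
deg_inv_sqrt = degree(row, data.num_nodes, dtype=torch.float).pow(-0.5)
weight = deg_inv_sqrt[row] * deg_inv_sqrt[col]
A_hat = torch.sparse_coo_tensor(edge_index, weight,
                                (data.num_nodes, data.num_nodes))
```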
## Dependencies

```
torch >= 2.0
torch_geometric >= 2.4
torch_sparse, torch_scatter (built against the installed torch version)
numpy, scipy, scikit-learn, matplotlib
```

`requirements.txt` lists the same packages.

## License

This code is released under the MIT License (see `LICENSE`). It is solely
authored by the corresponding author of the paper.

## Third-party libraries

These are runtime dependencies only; none of their code is bundled in this
repository, and all are permissively licensed (BSD-3 / MIT / PSF-style).

| Library | License |
|--------------------|--------------|
| PyTorch | BSD-3 |
| PyTorch Geometric | MIT |
| scikit-learn | BSD-3 |
| NumPy | BSD |
| SciPy | BSD |
| matplotlib | PSF-equivalent |
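To check an installed environment against the version floors listed under
Dependencies, a small illustrative snippet (not part of the repo):

```python
# Print installed versions; compare against the floors in the Dependencies list.
import torch
import torch_geometric

print("torch:", torch.__version__)                       # want >= 2.0
print("torch_geometric:", torch_geometric.__version__)   # want >= 2.4
print("CUDA available:", torch.cuda.is_available())      # runners assume a GPU
```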