author    YurenHao0426 <blackhao0426@gmail.com>  2026-01-13 23:50:59 -0600
committer YurenHao0426 <blackhao0426@gmail.com>  2026-01-13 23:50:59 -0600
commit    00cf667cee7ffacb144d5805fc7e0ef443f3583a (patch)
tree      77d20a3adaecf96bf3aff0612bdd3b5fa1a7dc7e /runs/slurm_logs/14632859_speedup.out
parent    c53c04aa1d6ff75cb478a9498c370baa929c74b6 (diff)
parent    cd99d6b874d9d09b3bb87b8485cc787885af71f1 (diff)
Merge master into main
Diffstat (limited to 'runs/slurm_logs/14632859_speedup.out')
-rw-r--r--  runs/slurm_logs/14632859_speedup.out  55
1 file changed, 55 insertions(+), 0 deletions(-)
diff --git a/runs/slurm_logs/14632859_speedup.out b/runs/slurm_logs/14632859_speedup.out
new file mode 100644
index 0000000..8589010
--- /dev/null
+++ b/runs/slurm_logs/14632859_speedup.out
@@ -0,0 +1,55 @@
+============================================================
+Lyapunov Speedup Benchmark
+Job ID: 14632859 | Node: gpub073
+Start: Tue Dec 30 06:43:26 CST 2025
+============================================================
+NVIDIA A40, 46068 MiB
+============================================================
+================================================================================
+LYAPUNOV COMPUTATION SPEEDUP BENCHMARK
+================================================================================
+Batch size: 64
+Timesteps: 4
+Hidden dims: [64, 128, 256]
+Device: cuda
+================================================================================
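(For context: a benchmark like this typically times a Benettin-style estimate of the largest Lyapunov exponent, which pushes a tangent vector through the step function's Jacobian at every timestep, renormalizes it, and averages the log growth. The sketch below is an illustrative assumption, not this repository's code; the name lyapunov_estimate, the step function f, and the use of torch.func.jvp are all hypothetical. Reading the labels that follow, Approach A presumably batches the trajectories and Approach B presumably replaces the per-step renormalization with a single global one.)

    # Hedged sketch (assumed, not from this repo): Benettin-style largest
    # Lyapunov exponent for a differentiable step function f(h) -> h_next.
    import torch

    def lyapunov_estimate(f, h0, steps=4, eps=1e-8):
        h = h0
        v = torch.randn_like(h)                    # random tangent vector
        v = v / (v.norm(dim=-1, keepdim=True) + eps)
        log_growth = torch.zeros(h.shape[:-1], device=h.device)
        for _ in range(steps):
            h, v = torch.func.jvp(f, (h,), (v,))   # step state, push tangent
            n = v.norm(dim=-1, keepdim=True)
            log_growth = log_growth + n.squeeze(-1).log()  # accumulate stretch
            v = v / (n + eps)                      # per-step renormalization
        return log_growth / steps                  # λ ≈ mean log stretch per step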
+
+[1/6] Benchmarking Baseline...
+[2/6] Benchmarking Approach A (batched)...
+[3/6] Benchmarking Approach B (global renorm)...
+[4/6] Benchmarking Approach A+B (combined)...
+[5/6] Benchmarking Approach C (compiled baseline)...
+[6/6] Benchmarking A+B+C (all optimizations)...
+ torch.compile failed: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [2, 64, 256]], which is output 0 of torch::autograd::CopySlices, is at version 1; expected version 0 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
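(The message above is PyTorch's standard autograd version-counter error: a tensor that autograd saved for backward was later written to in place, here via slice assignment ("CopySlices") into a [2, 64, 256] buffer. A minimal reproduction with illustrative shapes, not this repository's code:)

    import torch

    x = torch.randn(2, 64, 256, requires_grad=True)
    buf = torch.zeros(2, 64, 256)
    buf[0] = x[0] * 2        # slice write -> grad_fn=CopySlices, version 1
    loss = (buf ** 2).sum()  # pow's backward saves buf at version 1
    buf[1] = x[1] * 3        # second in-place write bumps buf to version 2
    # loss.backward()        # RuntimeError: "... modified by an inplace
    #                        # operation ... output 0 of CopySlices ..."

    # The usual fix is to build the buffer out of place instead of
    # mutating a slice:
    buf = torch.stack([x[0] * 2, x[1] * 3])
    (buf ** 2).sum().backward()  # works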
+
+================================================================================
+RESULTS
+================================================================================
+Baseline | Fwd: 10.28ms | Bwd: 7.75ms | Total: 18.03ms | λ: +1.0693 | Mem: 22.6MB
+A: Batched trajectories | Fwd: 7.20ms | Bwd: 7.43ms | Total: 14.63ms | λ: +1.0849 | Mem: 23.7MB
+B: Global renorm | Fwd: 9.49ms | Bwd: 6.99ms | Total: 16.48ms | λ: +0.6573 | Mem: 25.0MB
+A+B: Combined | Fwd: 6.55ms | Bwd: 6.76ms | Total: 13.30ms | λ: +0.6575 | Mem: 26.1MB
+C: Compiled baseline | Fwd: 8150.22ms | Bwd: 7502.86ms | Total: 15653.07ms | λ: +1.0758 | Mem: 44.5MB
+A+B+C: All optimized | Fwd: 0.00ms | Bwd: 0.00ms | Total: 0.00ms | λ: +0.0000 | Mem: 0.0MB
+
+--------------------------------------------------------------------------------
+SPEEDUP vs BASELINE:
+--------------------------------------------------------------------------------
+ A: Batched trajectories : 1.23x
+ B: Global renorm : 1.09x
+ A+B: Combined : 1.36x
+ C: Compiled baseline : 0.00x
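(These ratios are consistent with speedup = baseline total / variant total: 18.03/14.63 ≈ 1.23, 18.03/16.48 ≈ 1.09, and 18.03/13.30 ≈ 1.36. The compiled baseline prints 0.00x because 18.03/15653.07 ≈ 0.0012; the very large compiled-run times suggest compilation overhead dominated the measured iterations.)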
+
+--------------------------------------------------------------------------------
+LYAPUNOV VALUE CONSISTENCY CHECK:
+--------------------------------------------------------------------------------
+ A: Batched trajectories : λ=+1.0849 (diff=0.0156) ✓
+ B: Global renorm : λ=+0.6573 (diff=0.4119) ✗
+ A+B: Combined : λ=+0.6575 (diff=0.4117) ✗
+ C: Compiled baseline : λ=+1.0758 (diff=0.0065) ✓
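(The diff column is the absolute gap from the baseline exponent λ = +1.0693, e.g. |1.0849 - 1.0693| = 0.0156. The pass/fail threshold is not printed, but it evidently lies between 0.0156 and 0.4117, so the global-renorm variants are flagged as changing the computed exponent, not merely its cost.)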
+
+================================================================================
+SCALING TESTS
+================================================================================
+Config | Baseline | A+B | Speedup
+--------------------------------------------------------------------------------