=== full + component toggles (ms/step, B=24, C512) ===
/home/yurenh2/miniconda3/lib/python3.13/site-packages/torch/autograd/graph.py:865: UserWarning: Attempting to run cuBLAS, but there was no current CUDA context! Attempting to set the primary context... (Triggered internally at /pytorch/aten/src/ATen/cuda/CublasHandlePool.cpp:330.)
  return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
FULL ep_step: 7266
-jacreg: 7242
-resreg: 7312
-t1max(no refine): 5886
t2sel=80: 7384
t2sel=40: 4485
plain nudge holo=0 T2=20: 3179
free relax T1=150 alone: 740
free relax T1=300 alone: 1480
=== batch sweep (full) ===
B=8: 2353 ms (294.1/sample)
B=24: 7405 ms (308.5/sample)
B=48: 14496 ms (302.0/sample)
=== compile free relax ===
free relax T1=150 COMPILED: 507
=== bf16 full ===
full bf16: ERR RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
DONE