author: YurenHao0426 <blackhao0426@gmail.com>, 2026-01-13 23:49:05 -0600
committer: YurenHao0426 <blackhao0426@gmail.com>, 2026-01-13 23:49:05 -0600
commit: cd99d6b874d9d09b3bb87b8485cc787885af71f1
tree: 59a233959932ca0e4f12f196275e07fcf443b33f /runs/slurm_logs/15112873_stable_init.out
init commit
Diffstat (limited to 'runs/slurm_logs/15112873_stable_init.out')
-rw-r--r-- runs/slurm_logs/15112873_stable_init.out | 207
1 file changed, 207 insertions, 0 deletions
diff --git a/runs/slurm_logs/15112873_stable_init.out b/runs/slurm_logs/15112873_stable_init.out
new file mode 100644
index 0000000..eacc079
--- /dev/null
+++ b/runs/slurm_logs/15112873_stable_init.out
@@ -0,0 +1,207 @@
+============================================================
+STABLE INITIALIZATION Experiment
+Job ID: 15112873 | Node: gpub011
+Start: Thu Jan 1 12:26:50 CST 2026
+============================================================
+NVIDIA A40, 46068 MiB
+============================================================
+================================================================================
+DEPTH SCALING BENCHMARK
+================================================================================
+Dataset: cifar100
+Depths: [4, 8, 12, 16]
+Timesteps: 4
+Epochs: 150
+λ_reg: 0.1, λ_target: -0.1
+Reg type: squared, Warmup epochs: 20
+Device: cuda
+================================================================================
+
+Loading cifar100...
+Classes: 100, Input: (3, 32, 32)
+Train: 50000, Test: 10000
+
+Depth configurations: [(4, '4×1'), (8, '4×2'), (12, '4×3'), (16, '4×4')]
+Regularization type: squared
+Warmup epochs: 20
+Stable init: True
+
+============================================================
+Depth = 4 conv layers (4 stages × 1 blocks)
+============================================================
+ Vanilla: depth=4, params=1,756,836
+ Epoch 10: train=0.516 test=0.431 σ=9.10e-01/3.50e-08
+ Epoch 20: train=0.640 test=0.517 σ=5.84e-01/2.47e-08
+ Epoch 30: train=0.712 test=0.558 σ=4.82e-01/2.04e-08
+ Epoch 40: train=0.761 test=0.558 σ=4.07e-01/1.72e-08
+ Epoch 50: train=0.800 test=0.577 σ=3.76e-01/1.54e-08
+ Epoch 60: train=0.837 test=0.581 σ=3.34e-01/1.38e-08
+ Epoch 70: train=0.864 test=0.579 σ=3.25e-01/1.29e-08
+ Epoch 80: train=0.888 test=0.592 σ=2.91e-01/1.17e-08
+ Epoch 90: train=0.907 test=0.602 σ=2.89e-01/1.10e-08
+ Epoch 100: train=0.921 test=0.604 σ=2.71e-01/1.05e-08
+ Epoch 110: train=0.935 test=0.606 σ=2.64e-01/9.89e-09
+ Epoch 120: train=0.943 test=0.617 σ=2.46e-01/9.41e-09
+ Epoch 130: train=0.950 test=0.615 σ=2.45e-01/8.86e-09
+ Epoch 140: train=0.951 test=0.615 σ=2.29e-01/8.67e-09
+ Epoch 150: train=0.953 test=0.615 σ=2.36e-01/8.51e-09
+ Best test acc: 0.620
+ Lyapunov: depth=4, params=1,756,836
+ Epoch 10: train=0.195 test=0.014 λ=1.549 σ=7.07e-01/2.57e-08
+ Epoch 20: train=0.135 test=0.012 λ=1.570 σ=4.14e-01/1.49e-08
+ Epoch 30: train=0.057 test=0.010 λ=1.488 σ=2.46e-01/6.99e-09
+ Epoch 40: train=0.067 test=0.010 λ=1.481 σ=2.03e-01/6.20e-09
+ Epoch 50: train=0.048 test=0.010 λ=1.877 σ=1.80e-01/4.00e-09
+ Epoch 60: train=0.009 test=0.010 λ=1.462 σ=4.58e-02/0.00e+00
+ Epoch 70: train=0.010 test=0.010 λ=1.467 σ=3.57e-02/0.00e+00
+ Epoch 80: train=0.010 test=0.010 λ=1.471 σ=1.33e-02/0.00e+00
+ Epoch 90: train=0.009 test=0.010 λ=1.471 σ=4.82e-03/0.00e+00
+ Epoch 100: train=0.009 test=0.010 λ=1.471 σ=1.18e-03/0.00e+00
+ Epoch 110: train=0.009 test=0.010 λ=1.471 σ=4.32e-03/0.00e+00
+ Epoch 120: train=0.009 test=0.010 λ=1.472
+ Epoch 130: train=0.010 test=0.010 λ=1.472
+ Epoch 140: train=0.010 test=0.010 λ=1.471
+ Epoch 150: train=0.010 test=0.010 λ=1.473
+ Best test acc: 0.106
+
+============================================================
+Depth = 8 conv layers (4 stages × 2 blocks)
+============================================================
+ Vanilla: depth=8, params=4,892,196
+ Epoch 10: train=0.451 test=0.402 σ=7.31e-01/2.99e-08
+ Epoch 20: train=0.587 test=0.471 σ=4.69e-01/2.12e-08
+ Epoch 30: train=0.666 test=0.493 σ=3.81e-01/1.75e-08
+ Epoch 40: train=0.728 test=0.505 σ=3.27e-01/1.53e-08
+ Epoch 50: train=0.774 test=0.533 σ=3.18e-01/1.40e-08
+ Epoch 60: train=0.812 test=0.521 σ=2.93e-01/1.28e-08
+ Epoch 70: train=0.852 test=0.547 σ=2.81e-01/1.17e-08
+ Epoch 80: train=0.884 test=0.531 σ=2.48e-01/1.02e-08
+ Epoch 90: train=0.906 test=0.537 σ=2.35e-01/9.46e-09
+ Epoch 100: train=0.927 test=0.553 σ=2.24e-01/8.84e-09
+ Epoch 110: train=0.941 test=0.552 σ=2.09e-01/8.04e-09
+ Epoch 120: train=0.951 test=0.553 σ=2.09e-01/7.55e-09
+ Epoch 130: train=0.959 test=0.553 σ=2.10e-01/7.39e-09
+ Epoch 140: train=0.959 test=0.561 σ=1.95e-01/7.19e-09
+ Epoch 150: train=0.961 test=0.551 σ=1.94e-01/6.97e-09
+ Best test acc: 0.564
+ Lyapunov: depth=8, params=4,892,196
+ Epoch 10: train=0.046 test=0.010 λ=1.543 σ=3.90e-01/9.92e-09
+ Epoch 20: train=0.038 test=0.010 λ=1.533 σ=2.42e-01/4.88e-09
+ Epoch 30: train=0.038 test=0.010 λ=1.623 σ=1.93e-01/3.39e-09
+ Epoch 40: train=0.028 test=0.010 λ=1.706 σ=1.66e-01/2.06e-09
+ Epoch 50: train=0.009 test=0.010 λ=1.532 σ=7.89e-02/1.54e-17
+ Epoch 60: train=0.010 test=0.010 λ=1.540 σ=4.28e-02/5.11e-27
+ Epoch 70: train=0.009 test=0.010 λ=1.544 σ=4.22e-02/0.00e+00
+ Epoch 80: train=0.010 test=0.010 λ=1.548 σ=3.81e-02/0.00e+00
+ Epoch 90: train=0.011 test=0.010 λ=1.554 σ=3.03e-02/0.00e+00
+ Epoch 100: train=0.010 test=0.010 λ=1.549 σ=9.40e-03/0.00e+00
+ Epoch 110: train=0.010 test=0.010 λ=1.549 σ=5.91e-03/0.00e+00
+ Epoch 120: train=0.010 test=0.010 λ=1.548 σ=3.83e-03/0.00e+00
+ Epoch 130: train=0.010 test=0.010 λ=1.549 σ=7.81e-03/0.00e+00
+ Epoch 140: train=0.010 test=0.010 λ=1.549 σ=1.37e-02/0.00e+00
+ Epoch 150: train=0.010 test=0.010 λ=1.546 σ=8.69e-03/0.00e+00
+ Best test acc: 0.021
+
+============================================================
+Depth = 12 conv layers (4 stages × 3 blocks)
+============================================================
+ Vanilla: depth=12, params=8,027,556
+ Epoch 10: train=0.253 test=0.046 σ=4.96e-01/2.03e-08
+ Epoch 20: train=0.322 test=0.044 σ=3.35e-01/1.58e-08
+ Epoch 30: train=0.364 test=0.054 σ=2.77e-01/1.38e-08
+ Epoch 40: train=0.404 test=0.046 σ=2.49e-01/1.30e-08
+ Epoch 50: train=0.439 test=0.062 σ=2.30e-01/1.24e-08
+ Epoch 60: train=0.469 test=0.040 σ=2.30e-01/1.24e-08
+ Epoch 70: train=0.498 test=0.054 σ=2.35e-01/1.21e-08
+ Epoch 80: train=0.532 test=0.058 σ=2.26e-01/1.20e-08
+ Epoch 90: train=0.565 test=0.072 σ=2.26e-01/1.18e-08
+ Epoch 100: train=0.276 test=0.099 σ=1.92e-01/1.10e-08
+ Epoch 110: train=0.409 test=0.123 σ=2.13e-01/1.20e-08
+ Epoch 120: train=0.470 test=0.124 σ=2.27e-01/1.20e-08
+ Epoch 130: train=0.495 test=0.146 σ=2.19e-01/1.22e-08
+ Epoch 140: train=0.510 test=0.138 σ=2.15e-01/1.17e-08
+ Epoch 150: train=0.512 test=0.118 σ=2.18e-01/1.17e-08
+ Best test acc: 0.146
+ Lyapunov: depth=12, params=8,027,556
+ Epoch 10: train=0.011 test=0.010 λ=1.563 σ=5.46e-01/7.17e-09
+ Epoch 20: train=0.010 test=0.010 λ=1.556 σ=8.74e-02/8.70e-15
+ Epoch 30: train=0.010 test=0.010 λ=1.554 σ=9.58e-02/3.05e-15
+ Epoch 40: train=0.009 test=0.010 λ=1.566 σ=6.06e-02/2.31e-34
+ Epoch 50: train=0.010 test=0.010 λ=1.566 σ=3.46e-02/0.00e+00
+ Epoch 60: train=0.009 test=0.010 λ=1.573 σ=4.50e-02/0.00e+00
+ Epoch 70: train=0.010 test=0.010 λ=1.572 σ=1.34e-02/0.00e+00
+ Epoch 80: train=0.009 test=0.010 λ=1.575 σ=6.32e-04/0.00e+00
+ Epoch 90: train=0.009 test=0.010 λ=1.576 σ=5.51e-02/0.00e+00
+ Epoch 100: train=0.010 test=0.010 λ=1.579 σ=2.74e-02/0.00e+00
+ Epoch 110: train=0.009 test=0.010 λ=1.575 σ=2.56e-02/0.00e+00
+ Epoch 120: train=0.010 test=0.010 λ=1.576 σ=3.61e-02/0.00e+00
+ Epoch 130: train=0.010 test=0.010 λ=1.576
+ Epoch 140: train=0.010 test=0.010 λ=1.574 σ=5.40e-03/0.00e+00
+ Epoch 150: train=0.010 test=0.010 λ=1.569
+ Best test acc: 0.011
+
+============================================================
+Depth = 16 conv layers (4 stages × 4 blocks)
+============================================================
+ Vanilla: depth=16, params=11,162,916
+ Epoch 10: train=0.120 test=0.020 σ=4.06e-01/1.45e-08
+ Epoch 20: train=0.158 test=0.011 σ=2.71e-01/1.13e-08
+ Epoch 30: train=0.182 test=0.016 σ=2.16e-01/1.00e-08
+ Epoch 40: train=0.203 test=0.029 σ=2.01e-01/9.74e-09
+ Epoch 50: train=0.220 test=0.025 σ=1.83e-01/9.59e-09
+ Epoch 60: train=0.237 test=0.025 σ=1.78e-01/9.64e-09
+ Epoch 70: train=0.250 test=0.029 σ=1.67e-01/9.64e-09
+ Epoch 80: train=0.259 test=0.026 σ=1.65e-01/9.31e-09
+ Epoch 90: train=0.273 test=0.022 σ=1.63e-01/9.65e-09
+ Epoch 100: train=0.229 test=0.019 σ=1.52e-01/9.12e-09
+ Epoch 110: train=0.256 test=0.024 σ=1.54e-01/9.41e-09
+ Epoch 120: train=0.266 test=0.025 σ=1.60e-01/9.49e-09
+ Epoch 130: train=0.277 test=0.025 σ=1.57e-01/9.48e-09
+ Epoch 140: train=0.283 test=0.025 σ=1.61e-01/9.66e-09
+ Epoch 150: train=0.283 test=0.024 σ=1.63e-01/9.63e-09
+ Best test acc: 0.036
+ Lyapunov: depth=16, params=11,162,916
+ Epoch 10: train=0.011 test=0.010 λ=1.695 σ=3.65e-01/1.28e-13
+ Epoch 20: train=0.011 test=0.010 λ=1.668 σ=3.46e-01/1.58e-14
+ Epoch 30: train=0.011 test=0.010 λ=1.632 σ=1.93e-01/2.02e-20
+ Epoch 40: train=0.009 test=0.010 λ=1.610 σ=2.17e-01/1.62e-12
+ Epoch 50: train=0.010 test=0.010 λ=1.620 σ=1.54e-01/1.56e-15
+ Epoch 60: train=0.011 test=0.010 λ=1.621 σ=5.15e-02/0.00e+00
+ Epoch 70: train=0.009 test=0.010 λ=1.606 σ=1.16e-02/0.00e+00
+ Epoch 80: train=0.009 test=0.010 λ=1.605 σ=1.80e-02/0.00e+00
+ Epoch 90: train=0.009 test=0.010 λ=1.609
+ Epoch 100: train=0.009 test=0.010 λ=1.618 σ=5.85e-04/0.00e+00
+ Epoch 110: train=0.009 test=0.010 λ=1.610 σ=5.90e-04/0.00e+00
+ Epoch 120: train=0.009 test=0.010 λ=1.608
+ Epoch 130: train=0.009 test=0.010 λ=1.603
+ Epoch 140: train=0.010 test=0.010 λ=1.606
+ Epoch 150: train=0.010 test=0.010 λ=1.596
+ Best test acc: 0.016
+
+====================================================================================================
+DEPTH SCALING RESULTS: CIFAR100
+====================================================================================================
+Depth  Vanilla Acc  Lyapunov Acc  Δ Acc   Lyap λ  Van ∇norm  Lyap ∇norm  Van κ
+----------------------------------------------------------------------------------------------------
+4      0.615        0.010         -0.605  1.473   4.63e-01   8.84e-02    3.7e+08
+8      0.551        0.010         -0.541  1.546   3.64e-01   1.64e-01    2.7e+08
+12     0.118        0.010         -0.108  1.569   6.43e-01   6.98e-01    4.1e+07
+16     0.024        0.010         -0.014  1.596   5.19e-01   3.22e-01    2.7e+07
+====================================================================================================
+
+GRADIENT HEALTH ANALYSIS:
+ Depth 4: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+ Depth 8: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+ Depth 12: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+ Depth 16: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+
+
+KEY OBSERVATIONS:
+ Vanilla 4→16 layers: -0.591 accuracy change
+ Lyapunov 4→16 layers: +0.000 accuracy change
+ ✓ Lyapunov regularization enables better depth scaling!
+
+Results saved to runs/depth_scaling_stable_init/cifar100_20260102-133755
+============================================================
+Finished: Fri Jan 2 13:37:56 CST 2026
+============================================================
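
The per-epoch lines in the log above follow a regular pattern, so they can be parsed for plotting or re-checking the summary table. The following is a minimal sketch and not part of the committed code. The names `EPOCH_RE`, `parse_epoch_line`, and `acc_delta` are hypothetical. It assumes the λ field and the two slash-separated σ values are optional, since some lines omit them, and it makes no claim about what the two σ values mean. It also assumes that Δ Acc is the final-epoch Lyapunov accuracy minus the final-epoch Vanilla accuracy, which matches the table rows.

```python
import re

# Matches lines such as:
#   "+ Epoch 10: train=0.195 test=0.014 λ=1.549 σ=7.07e-01/2.57e-08"
# The λ and σ fields are optional because some log lines omit them.
EPOCH_RE = re.compile(
    r"Epoch\s+(?P<epoch>\d+):\s+train=(?P<train>[\d.]+)\s+test=(?P<test>[\d.]+)"
    r"(?:\s+λ=(?P<lam>[-\d.]+))?"
    r"(?:\s+σ=(?P<sig_a>[\d.eE+-]+)/(?P<sig_b>[\d.eE+-]+))?"
)


def parse_epoch_line(line):
    """Return a dict of the fields on one epoch line, or None if it is not one."""
    m = EPOCH_RE.search(line)
    if m is None:
        return None
    g = m.groupdict()
    return {
        "epoch": int(g["epoch"]),
        "train": float(g["train"]),
        "test": float(g["test"]),
        # None when the line has no λ field (all Vanilla lines).
        "lam": float(g["lam"]) if g["lam"] else None,
        # The two σ values are kept as a pair; their meaning is not stated in the log.
        "sigma": (float(g["sig_a"]), float(g["sig_b"])) if g["sig_a"] else None,
    }


def acc_delta(vanilla_final, lyapunov_final):
    """Assumed definition of the Δ Acc column: Lyapunov minus Vanilla."""
    return round(lyapunov_final - vanilla_final, 3)
```

For example, `acc_delta(0.615, 0.010)` reproduces the depth-4 entry of the Δ Acc column, and `parse_epoch_line` returns `None` for non-epoch lines such as `Best test acc: 0.620`.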