author    YurenHao0426 <blackhao0426@gmail.com>  2026-01-13 23:49:05 -0600
committer YurenHao0426 <blackhao0426@gmail.com>  2026-01-13 23:49:05 -0600
commit    cd99d6b874d9d09b3bb87b8485cc787885af71f1 (patch)
tree      59a233959932ca0e4f12f196275e07fcf443b33f /runs/slurm_logs/15112874_extreme.out
init commit
Diffstat (limited to 'runs/slurm_logs/15112874_extreme.out')
-rw-r--r--  runs/slurm_logs/15112874_extreme.out  207
1 file changed, 207 insertions(+), 0 deletions(-)
diff --git a/runs/slurm_logs/15112874_extreme.out b/runs/slurm_logs/15112874_extreme.out
new file mode 100644
index 0000000..cea4a1b
--- /dev/null
+++ b/runs/slurm_logs/15112874_extreme.out
@@ -0,0 +1,207 @@
+============================================================
+EXTREME-ONLY PENALTY Experiment (lambda > 2.0)
+Job ID: 15112874 | Node: gpub032
+Start: Thu Jan 1 12:26:50 CST 2026
+============================================================
+NVIDIA A40, 46068 MiB
+============================================================
+================================================================================
+DEPTH SCALING BENCHMARK
+================================================================================
+Dataset: cifar100
+Depths: [4, 8, 12, 16]
+Timesteps: 4
+Epochs: 150
+λ_reg: 0.3, λ_target: -0.1
+Reg type: extreme, Warmup epochs: 10
+Device: cuda
+================================================================================
+
+Loading cifar100...
+Classes: 100, Input: (3, 32, 32)
+Train: 50000, Test: 10000
+
+Depth configurations: [(4, '4×1'), (8, '4×2'), (12, '4×3'), (16, '4×4')]
+Regularization type: extreme
+Warmup epochs: 10
+Stable init: False
+
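[Editor's note: the configuration above names an "extreme-only" penalty (applied only when the Lyapunov estimate exceeds 2.0) with λ_reg = 0.3, λ_target = -0.1, and a 10-epoch warmup. The experiment's actual loss term is not shown in this log; the sketch below is one plausible reading of those hyperparameters, with every function name and the exact penalty form being assumptions.]

```python
def extreme_only_penalty(lam: float, lam_target: float = -0.1,
                         threshold: float = 2.0, reg_weight: float = 0.3,
                         epoch: int = 0, warmup_epochs: int = 10) -> float:
    """Hypothetical sketch of an 'extreme-only' Lyapunov penalty.

    The penalty is zero unless the estimated exponent `lam` exceeds
    `threshold` (2.0 in this run); past the threshold it grows with both
    the excess over the threshold and the distance to `lam_target`.
    The warmup linearly ramps the regularization weight over the first
    `warmup_epochs` epochs, matching the log's "Warmup epochs: 10".
    """
    # Linear warmup of the regularization weight.
    w = reg_weight * min(1.0, (epoch + 1) / warmup_epochs)
    # Hinge gate: only the "extreme" regime (lam > threshold) is penalized.
    excess = max(lam - threshold, 0.0)
    return w * excess * (lam - lam_target)
```

Under this reading, exponents at or below 2.0 incur no penalty at all, which is consistent with the Lyapunov runs above settling near λ ≈ 1.6–1.9 rather than being driven all the way to the -0.1 target.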
+============================================================
+Depth = 4 conv layers (4 stages × 1 blocks)
+============================================================
+ Vanilla: depth=4, params=1,756,836
+ Epoch 10: train=0.498 test=0.436 σ=9.55e-01/3.56e-08
+ Epoch 20: train=0.629 test=0.527 σ=5.80e-01/2.40e-08
+ Epoch 30: train=0.701 test=0.546 σ=4.83e-01/2.01e-08
+ Epoch 40: train=0.756 test=0.566 σ=4.24e-01/1.76e-08
+ Epoch 50: train=0.799 test=0.566 σ=3.67e-01/1.52e-08
+ Epoch 60: train=0.832 test=0.580 σ=3.40e-01/1.41e-08
+ Epoch 70: train=0.858 test=0.563 σ=3.15e-01/1.28e-08
+ Epoch 80: train=0.883 test=0.584 σ=2.98e-01/1.18e-08
+ Epoch 90: train=0.906 test=0.595 σ=2.79e-01/1.07e-08
+ Epoch 100: train=0.920 test=0.597 σ=2.63e-01/1.02e-08
+ Epoch 110: train=0.932 test=0.612 σ=2.50e-01/9.41e-09
+ Epoch 120: train=0.941 test=0.610 σ=2.47e-01/8.94e-09
+ Epoch 130: train=0.947 test=0.614 σ=2.45e-01/9.05e-09
+ Epoch 140: train=0.949 test=0.614 σ=2.41e-01/8.80e-09
+ Epoch 150: train=0.953 test=0.615 σ=2.35e-01/8.52e-09
+ Best test acc: 0.619
+ Lyapunov: depth=4, params=1,756,836
+ Epoch 10: train=0.415 test=0.120 λ=1.974 σ=8.95e-01/3.40e-08
+ Epoch 20: train=0.551 test=0.414 λ=1.943 σ=5.70e-01/2.44e-08
+ Epoch 30: train=0.631 test=0.479 λ=1.922 σ=4.64e-01/2.05e-08
+ Epoch 40: train=0.692 test=0.394 λ=1.908 σ=4.18e-01/1.83e-08
+ Epoch 50: train=0.739 test=0.418 λ=1.909 σ=3.71e-01/1.60e-08
+ Epoch 60: train=0.780 test=0.446 λ=1.917 σ=3.56e-01/1.52e-08
+ Epoch 70: train=0.815 test=0.458 λ=1.914 σ=3.28e-01/1.36e-08
+ Epoch 80: train=0.845 test=0.480 λ=1.923 σ=3.10e-01/1.32e-08
+ Epoch 90: train=0.868 test=0.486 λ=1.919 σ=2.93e-01/1.20e-08
+ Epoch 100: train=0.887 test=0.480 λ=1.923 σ=2.79e-01/1.15e-08
+ Epoch 110: train=0.902 test=0.489 λ=1.929 σ=2.75e-01/1.08e-08
+ Epoch 120: train=0.913 test=0.467 λ=1.926 σ=2.66e-01/1.05e-08
+ Epoch 130: train=0.920 test=0.479 λ=1.931 σ=2.59e-01/1.03e-08
+ Epoch 140: train=0.924 test=0.483 λ=1.928 σ=2.51e-01/9.81e-09
+ Epoch 150: train=0.925 test=0.475 λ=1.937 σ=2.47e-01/9.86e-09
+ Best test acc: 0.508
+
+============================================================
+Depth = 8 conv layers (4 stages × 2 blocks)
+============================================================
+ Vanilla: depth=8, params=4,892,196
+ Epoch 10: train=0.390 test=0.350 σ=8.10e-01/3.04e-08
+ Epoch 20: train=0.546 test=0.435 σ=4.82e-01/2.15e-08
+ Epoch 30: train=0.632 test=0.473 σ=3.79e-01/1.78e-08
+ Epoch 40: train=0.697 test=0.513 σ=3.29e-01/1.55e-08
+ Epoch 50: train=0.752 test=0.512 σ=3.12e-01/1.42e-08
+ Epoch 60: train=0.795 test=0.520 σ=2.97e-01/1.31e-08
+ Epoch 70: train=0.836 test=0.526 σ=2.73e-01/1.18e-08
+ Epoch 80: train=0.869 test=0.533 σ=2.55e-01/1.10e-08
+ Epoch 90: train=0.897 test=0.525 σ=2.44e-01/9.77e-09
+ Epoch 100: train=0.916 test=0.530 σ=2.35e-01/9.36e-09
+ Epoch 110: train=0.933 test=0.539 σ=2.27e-01/8.60e-09
+ Epoch 120: train=0.943 test=0.537 σ=2.18e-01/8.20e-09
+ Epoch 130: train=0.952 test=0.541 σ=2.05e-01/7.82e-09
+ Epoch 140: train=0.956 test=0.538 σ=2.12e-01/7.90e-09
+ Epoch 150: train=0.956 test=0.534 σ=1.94e-01/7.48e-09
+ Best test acc: 0.547
+ Lyapunov: depth=8, params=4,892,196
+ Epoch 10: train=0.078 test=0.016 λ=1.777 σ=5.17e-01/1.78e-08
+ Epoch 20: train=0.121 test=0.016 λ=1.693 σ=2.63e-01/1.25e-08
+ Epoch 30: train=0.143 test=0.020 λ=1.696 σ=2.23e-01/1.14e-08
+ Epoch 40: train=0.147 test=0.013 λ=1.657 σ=1.86e-01/1.06e-08
+ Epoch 50: train=0.129 test=0.012 λ=1.659 σ=1.68e-01/8.87e-09
+ Epoch 60: train=0.137 test=0.011 λ=1.625 σ=1.54e-01/8.80e-09
+ Epoch 70: train=0.082 test=0.009 λ=1.589 σ=1.32e-01/6.54e-09
+ Epoch 80: train=0.127 test=0.011 λ=1.590 σ=1.42e-01/7.63e-09
+ Epoch 90: train=0.142 test=0.009 λ=1.609 σ=1.45e-01/8.25e-09
+ Epoch 100: train=0.147 test=0.012 λ=1.590 σ=1.41e-01/8.09e-09
+ Epoch 110: train=0.152 test=0.010 λ=1.598 σ=1.43e-01/8.06e-09
+ Epoch 120: train=0.156 test=0.010 λ=1.592 σ=1.40e-01/8.22e-09
+ Epoch 130: train=0.162 test=0.010 λ=1.589 σ=1.43e-01/8.35e-09
+ Epoch 140: train=0.163 test=0.010 λ=1.584 σ=1.40e-01/8.47e-09
+ Epoch 150: train=0.163 test=0.010 λ=1.583 σ=1.38e-01/8.27e-09
+ Best test acc: 0.025
+
+============================================================
+Depth = 12 conv layers (4 stages × 3 blocks)
+============================================================
+ Vanilla: depth=12, params=8,027,556
+ Epoch 10: train=0.214 test=0.048 σ=6.28e-01/2.26e-08
+ Epoch 20: train=0.293 test=0.067 σ=3.29e-01/1.56e-08
+ Epoch 30: train=0.342 test=0.086 σ=2.69e-01/1.36e-08
+ Epoch 40: train=0.383 test=0.097 σ=2.47e-01/1.30e-08
+ Epoch 50: train=0.420 test=0.080 σ=2.44e-01/1.30e-08
+ Epoch 60: train=0.451 test=0.116 σ=2.32e-01/1.26e-08
+ Epoch 70: train=0.484 test=0.102 σ=2.35e-01/1.23e-08
+ Epoch 80: train=0.518 test=0.114 σ=2.31e-01/1.26e-08
+ Epoch 90: train=0.547 test=0.118 σ=2.30e-01/1.23e-08
+ Epoch 100: train=0.576 test=0.118 σ=2.30e-01/1.23e-08
+ Epoch 110: train=0.598 test=0.114 σ=2.37e-01/1.23e-08
+ Epoch 120: train=0.619 test=0.110 σ=2.33e-01/1.22e-08
+ Epoch 130: train=0.629 test=0.121 σ=2.33e-01/1.23e-08
+ Epoch 140: train=0.637 test=0.116 σ=2.31e-01/1.20e-08
+ Epoch 150: train=0.638 test=0.116 σ=2.34e-01/1.22e-08
+ Best test acc: 0.136
+ Lyapunov: depth=12, params=8,027,556
+ Epoch 10: train=0.031 test=0.012 λ=1.794 σ=4.39e-01/1.15e-08
+ Epoch 20: train=0.028 test=0.009 λ=1.711 σ=1.99e-01/4.47e-09
+ Epoch 30: train=0.028 test=0.010 λ=1.684 σ=1.36e-01/2.73e-09
+ Epoch 40: train=0.023 test=0.006 λ=1.653 σ=1.19e-01/4.52e-12
+ Epoch 50: train=0.037 test=0.010 λ=1.668 σ=1.13e-01/2.71e-09
+ Epoch 60: train=0.029 test=0.010 λ=1.646 σ=1.11e-01/8.32e-12
+ Epoch 70: train=0.021 test=0.010 λ=1.727 σ=1.30e-01/5.03e-13
+ Epoch 80: train=0.024 test=0.010 λ=1.749 σ=1.01e-01/9.40e-13
+ Epoch 90: train=0.022 test=0.010 λ=1.665 σ=8.78e-02/9.05e-13
+ Epoch 100: train=0.022 test=0.010 λ=1.676 σ=7.62e-02/9.14e-13
+ Epoch 110: train=0.025 test=0.010 λ=1.660 σ=8.45e-02/1.40e-12
+ Epoch 120: train=0.024 test=0.010 λ=1.627 σ=8.26e-02/1.30e-12
+ Epoch 130: train=0.024 test=0.010 λ=1.663 σ=8.21e-02/7.96e-13
+ Epoch 140: train=0.028 test=0.010 λ=1.644 σ=9.22e-02/3.67e-12
+ Epoch 150: train=0.029 test=0.010 λ=1.647 σ=9.05e-02/2.90e-12
+ Best test acc: 0.014
+
+============================================================
+Depth = 16 conv layers (4 stages × 4 blocks)
+============================================================
+ Vanilla: depth=16, params=11,162,916
+ Epoch 10: train=0.091 test=0.011 σ=4.40e-01/1.32e-08
+ Epoch 20: train=0.135 test=0.014 σ=2.84e-01/1.06e-08
+ Epoch 30: train=0.157 test=0.017 σ=2.21e-01/9.39e-09
+ Epoch 40: train=0.174 test=0.021 σ=2.00e-01/9.09e-09
+ Epoch 50: train=0.190 test=0.021 σ=1.78e-01/8.83e-09
+ Epoch 60: train=0.201 test=0.023 σ=1.72e-01/8.80e-09
+ Epoch 70: train=0.214 test=0.026 σ=1.62e-01/8.89e-09
+ Epoch 80: train=0.228 test=0.025 σ=1.63e-01/8.94e-09
+ Epoch 90: train=0.238 test=0.027 σ=1.58e-01/9.07e-09
+ Epoch 100: train=0.249 test=0.025 σ=1.61e-01/9.11e-09
+ Epoch 110: train=0.256 test=0.029 σ=1.59e-01/9.10e-09
+ Epoch 120: train=0.261 test=0.027 σ=1.63e-01/9.11e-09
+ Epoch 130: train=0.270 test=0.027 σ=1.60e-01/9.22e-09
+ Epoch 140: train=0.270 test=0.027 σ=1.63e-01/9.32e-09
+ Epoch 150: train=0.272 test=0.027 σ=1.65e-01/9.27e-09
+ Best test acc: 0.033
+ Lyapunov: depth=16, params=11,162,916
+ Epoch 10: train=0.019 test=0.010 λ=1.891 σ=4.08e-01/7.96e-09
+ Epoch 20: train=0.018 test=0.010 λ=1.853 σ=1.49e-01/4.73e-11
+ Epoch 30: train=0.016 test=0.010 λ=2.038 σ=1.09e-01/1.08e-12
+ Epoch 40: train=0.016 test=0.007 λ=1.845 σ=9.66e-02/4.94e-14
+ Epoch 50: train=0.012 test=0.010 λ=1.807 σ=1.11e-01/3.35e-27
+ Epoch 60: train=0.013 test=0.009 λ=1.801 σ=1.01e-01/2.59e-28
+ Epoch 70: train=0.013 test=0.010 λ=2.064 σ=1.36e-01/9.48e-16
+ Epoch 80: train=0.020 test=0.010 λ=2.055 σ=1.11e-01/7.37e-14
+ Epoch 90: train=0.017 test=0.010 λ=1.959 σ=1.20e-01/1.56e-13
+ Epoch 100: train=0.022 test=0.010 λ=1.887 σ=1.01e-01/4.19e-13
+ Epoch 110: train=0.019 test=0.010 λ=1.881 σ=9.46e-02/4.77e-13
+ Epoch 120: train=0.018 test=0.010 λ=1.889 σ=8.10e-02/9.50e-14
+ Epoch 130: train=0.014 test=0.010 λ=1.892 σ=7.23e-02/1.42e-14
+ Epoch 140: train=0.015 test=0.010 λ=1.898 σ=7.02e-02/1.63e-14
+ Epoch 150: train=0.015 test=0.010 λ=1.899 σ=7.15e-02/1.18e-14
+ Best test acc: 0.012
+
+====================================================================================================
+DEPTH SCALING RESULTS: CIFAR100
+====================================================================================================
+Depth Vanilla Acc Lyapunov Acc Δ Acc Lyap λ Van ∇norm Lyap ∇norm Van κ
+----------------------------------------------------------------------------------------------------
+4 0.615 0.475 -0.140 1.937 4.58e-01 5.38e-01 5.0e+08
+8 0.534 0.010 -0.524 1.583 3.88e-01 4.57e-01 3.6e+08
+12 0.116 0.010 -0.106 1.647 6.51e-01 2.14e-01 5.8e+08
+16 0.027 0.010 -0.017 1.899 5.07e-01 1.38e-01 3.8e+07
+====================================================================================================
+
+GRADIENT HEALTH ANALYSIS:
+ Depth 4: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+ Depth 8: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+ Depth 12: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+ Depth 16: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+
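[Editor's note: the κ values in the table and the "ill-conditioned gradients (κ > 1e6)" warnings are consistent with κ being a spread ratio over per-layer gradient norms; note the log also prints σ as a large/small pair (e.g. "σ=2.35e-01/8.52e-09"). The snippet below is a guess at that metric, not the benchmark's actual code, and both function names are invented.]

```python
def gradient_condition_number(grad_norms: list[float]) -> float:
    """Crude gradient-health proxy: ratio of the largest to the smallest
    nonzero per-layer gradient norm. A huge ratio means some layers
    receive vanishingly small gradients relative to others."""
    norms = [n for n in grad_norms if n > 0.0]
    return max(norms) / min(norms)

def is_ill_conditioned(grad_norms: list[float], threshold: float = 1e6) -> bool:
    """Mirrors the log's warning rule: flag the model when kappa > 1e6."""
    return gradient_condition_number(grad_norms) > threshold
```

Plugging in a σ pair from the depth-4 vanilla run (2.35e-01 / 8.52e-09) gives κ ≈ 2.8e7, which would trip the same warning printed for every depth above.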
+
+KEY OBSERVATIONS:
+ Vanilla 4→16 layers: -0.588 accuracy change
+ Lyapunov 4→16 layers: -0.465 accuracy change
+  Note: the smaller 4→16 drop for Lyapunov reflects its lower starting point
+  (0.475 vs 0.615); its absolute accuracy collapses to ~0.010 beyond depth 4,
+  so this run does not support the claim of better depth scaling.
+
+Results saved to runs/depth_scaling_extreme/cifar100_20260102-133536
+============================================================
+Finished: Fri Jan 2 13:35:39 CST 2026
+============================================================