author    YurenHao0426 <blackhao0426@gmail.com>  2026-01-13 23:50:59 -0600
committer YurenHao0426 <blackhao0426@gmail.com>  2026-01-13 23:50:59 -0600
commit    00cf667cee7ffacb144d5805fc7e0ef443f3583a (patch)
tree      77d20a3adaecf96bf3aff0612bdd3b5fa1a7dc7e /runs/slurm_logs/15112871_weak_reg.out
parent    c53c04aa1d6ff75cb478a9498c370baa929c74b6 (diff)
parent    cd99d6b874d9d09b3bb87b8485cc787885af71f1 (diff)
Merge master into main
Diffstat (limited to 'runs/slurm_logs/15112871_weak_reg.out')
-rw-r--r--  runs/slurm_logs/15112871_weak_reg.out  207
1 file changed, 207 insertions(+), 0 deletions(-)
diff --git a/runs/slurm_logs/15112871_weak_reg.out b/runs/slurm_logs/15112871_weak_reg.out
new file mode 100644
index 0000000..c2c26f7
--- /dev/null
+++ b/runs/slurm_logs/15112871_weak_reg.out
@@ -0,0 +1,207 @@
+============================================================
+WEAK REGULARIZATION Experiment (lambda_reg=0.01)
+Job ID: 15112871 | Node: gpub023
+Start: Thu Jan 1 12:26:50 CST 2026
+============================================================
+NVIDIA A40, 46068 MiB
+============================================================
+================================================================================
+DEPTH SCALING BENCHMARK
+================================================================================
+Dataset: cifar100
+Depths: [4, 8, 12, 16]
+Timesteps: 4
+Epochs: 150
+λ_reg: 0.01, λ_target: -0.1
+Reg type: squared, Warmup epochs: 20
+Device: cuda
+================================================================================
+
+Loading cifar100...
+Classes: 100, Input: (3, 32, 32)
+Train: 50000, Test: 10000
+
+Depth configurations: [(4, '4×1'), (8, '4×2'), (12, '4×3'), (16, '4×4')]
+Regularization type: squared
+Warmup epochs: 20
+Stable init: False
+
+============================================================
+Depth = 4 conv layers (4 stages × 1 blocks)
+============================================================
+ Vanilla: depth=4, params=1,756,836
+ Epoch 10: train=0.498 test=0.419 σ=9.41e-01/3.52e-08
+ Epoch 20: train=0.628 test=0.476 σ=5.85e-01/2.43e-08
+ Epoch 30: train=0.704 test=0.536 σ=4.86e-01/2.02e-08
+ Epoch 40: train=0.756 test=0.544 σ=4.13e-01/1.73e-08
+ Epoch 50: train=0.800 test=0.569 σ=3.81e-01/1.57e-08
+ Epoch 60: train=0.833 test=0.560 σ=3.37e-01/1.37e-08
+ Epoch 70: train=0.863 test=0.585 σ=3.17e-01/1.29e-08
+ Epoch 80: train=0.885 test=0.595 σ=3.04e-01/1.22e-08
+ Epoch 90: train=0.904 test=0.601 σ=2.80e-01/1.08e-08
+ Epoch 100: train=0.923 test=0.599 σ=2.68e-01/1.02e-08
+ Epoch 110: train=0.935 test=0.613 σ=2.64e-01/9.79e-09
+ Epoch 120: train=0.945 test=0.606 σ=2.43e-01/8.88e-09
+ Epoch 130: train=0.948 test=0.612 σ=2.48e-01/9.01e-09
+ Epoch 140: train=0.952 test=0.616 σ=2.24e-01/8.47e-09
+ Epoch 150: train=0.952 test=0.616 σ=2.31e-01/8.63e-09
+ Best test acc: 0.618
+ Lyapunov: depth=4, params=1,756,836
+ Epoch 10: train=0.461 test=0.286 λ=1.949 σ=9.11e-01/3.46e-08
+ Epoch 20: train=0.458 test=0.010 λ=1.465 σ=5.22e-01/2.10e-08
+ Epoch 30: train=0.513 test=0.017 λ=1.736 σ=4.33e-01/1.78e-08
+ Epoch 40: train=0.558 test=0.010 λ=1.767 σ=3.64e-01/1.59e-08
+ Epoch 50: train=0.592 test=0.010 λ=1.791 σ=3.31e-01/1.49e-08
+ Epoch 60: train=0.627 test=0.016 λ=1.766 σ=3.16e-01/1.43e-08
+ Epoch 70: train=0.658 test=0.011 λ=1.765 σ=3.10e-01/1.37e-08
+ Epoch 80: train=0.681 test=0.015 λ=1.770 σ=2.97e-01/1.33e-08
+ Epoch 90: train=0.705 test=0.012 λ=1.784 σ=2.85e-01/1.28e-08
+ Epoch 100: train=0.730 test=0.012 λ=1.784 σ=2.86e-01/1.27e-08
+ Epoch 110: train=0.747 test=0.013 λ=1.797 σ=2.87e-01/1.25e-08
+ Epoch 120: train=0.757 test=0.014 λ=1.823 σ=2.73e-01/1.21e-08
+ Epoch 130: train=0.771 test=0.013 λ=1.854 σ=2.70e-01/1.19e-08
+ Epoch 140: train=0.772 test=0.013 λ=1.873 σ=2.67e-01/1.19e-08
+ Epoch 150: train=0.777 test=0.012 λ=1.882 σ=2.76e-01/1.20e-08
+ Best test acc: 0.333
+
+============================================================
+Depth = 8 conv layers (4 stages × 2 blocks)
+============================================================
+ Vanilla: depth=8, params=4,892,196
+ Epoch 10: train=0.382 test=0.338 σ=9.40e-01/3.24e-08
+ Epoch 20: train=0.545 test=0.436 σ=4.81e-01/2.17e-08
+ Epoch 30: train=0.636 test=0.464 σ=3.88e-01/1.80e-08
+ Epoch 40: train=0.695 test=0.507 σ=3.33e-01/1.58e-08
+ Epoch 50: train=0.752 test=0.506 σ=3.07e-01/1.39e-08
+ Epoch 60: train=0.793 test=0.520 σ=2.96e-01/1.29e-08
+ Epoch 70: train=0.834 test=0.517 σ=2.68e-01/1.16e-08
+ Epoch 80: train=0.870 test=0.524 σ=2.49e-01/1.06e-08
+ Epoch 90: train=0.899 test=0.526 σ=2.41e-01/9.69e-09
+ Epoch 100: train=0.917 test=0.527 σ=2.36e-01/9.43e-09
+ Epoch 110: train=0.931 test=0.534 σ=2.25e-01/8.64e-09
+ Epoch 120: train=0.945 test=0.535 σ=2.08e-01/7.82e-09
+ Epoch 130: train=0.951 test=0.530 σ=2.02e-01/7.38e-09
+ Epoch 140: train=0.954 test=0.535 σ=2.02e-01/7.62e-09
+ Epoch 150: train=0.957 test=0.520 σ=2.01e-01/7.60e-09
+ Best test acc: 0.543
+ Lyapunov: depth=8, params=4,892,196
+ Epoch 10: train=0.046 test=0.010 λ=1.570 σ=4.09e-01/1.23e-08
+ Epoch 20: train=0.062 test=0.010 λ=1.569 σ=2.46e-01/7.84e-09
+ Epoch 30: train=0.069 test=0.010 λ=1.534 σ=1.81e-01/6.62e-09
+ Epoch 40: train=0.046 test=0.010 λ=1.562 σ=1.49e-01/4.37e-09
+ Epoch 50: train=0.057 test=0.010 λ=1.531 σ=1.53e-01/4.61e-09
+ Epoch 60: train=0.040 test=0.010 λ=1.538 σ=1.53e-01/3.35e-09
+ Epoch 70: train=0.046 test=0.010 λ=1.536 σ=1.19e-01/1.75e-09
+ Epoch 80: train=0.050 test=0.010 λ=1.534 σ=1.19e-01/2.22e-09
+ Epoch 90: train=0.062 test=0.010 λ=1.556 σ=1.18e-01/3.98e-09
+ Epoch 100: train=0.048 test=0.010 λ=1.530 σ=1.14e-01/1.46e-09
+ Epoch 110: train=0.055 test=0.010 λ=1.534 σ=1.11e-01/3.03e-09
+ Epoch 120: train=0.075 test=0.010 λ=1.539 σ=1.12e-01/4.79e-09
+ Epoch 130: train=0.079 test=0.010 λ=1.593 σ=1.20e-01/4.96e-09
+ Epoch 140: train=0.076 test=0.010 λ=1.584 σ=1.13e-01/4.96e-09
+ Epoch 150: train=0.077 test=0.010 λ=1.583 σ=1.15e-01/4.98e-09
+ Best test acc: 0.014
+
+============================================================
+Depth = 12 conv layers (4 stages × 3 blocks)
+============================================================
+ Vanilla: depth=12, params=8,027,556
+ Epoch 10: train=0.216 test=0.059 σ=7.22e-01/2.38e-08
+ Epoch 20: train=0.291 test=0.044 σ=3.35e-01/1.60e-08
+ Epoch 30: train=0.339 test=0.048 σ=2.71e-01/1.39e-08
+ Epoch 40: train=0.377 test=0.055 σ=2.37e-01/1.27e-08
+ Epoch 50: train=0.412 test=0.040 σ=2.25e-01/1.23e-08
+ Epoch 60: train=0.440 test=0.044 σ=2.24e-01/1.23e-08
+ Epoch 70: train=0.471 test=0.048 σ=2.28e-01/1.19e-08
+ Epoch 80: train=0.497 test=0.060 σ=2.25e-01/1.23e-08
+ Epoch 90: train=0.533 test=0.069 σ=2.24e-01/1.19e-08
+ Epoch 100: train=0.563 test=0.079 σ=2.24e-01/1.20e-08
+ Epoch 110: train=0.580 test=0.058 σ=2.28e-01/1.19e-08
+ Epoch 120: train=0.602 test=0.056 σ=2.30e-01/1.19e-08
+ Epoch 130: train=0.608 test=0.070 σ=2.29e-01/1.18e-08
+ Epoch 140: train=0.616 test=0.068 σ=2.27e-01/1.18e-08
+ Epoch 150: train=0.620 test=0.064 σ=2.28e-01/1.22e-08
+ Best test acc: 0.079
+ Lyapunov: depth=12, params=8,027,556
+ Epoch 10: train=0.017 test=0.010 λ=1.584 σ=2.89e-01/5.97e-12
+ Epoch 20: train=0.012 test=0.010 λ=1.566 σ=2.21e-01/1.75e-20
+ Epoch 30: train=0.012 test=0.010 λ=1.567 σ=3.65e-01/7.23e-20
+ Epoch 40: train=0.021 test=0.010 λ=1.623 σ=2.45e-01/8.70e-13
+ Epoch 50: train=0.022 test=0.010 λ=1.660 σ=1.84e-01/9.38e-13
+ Epoch 60: train=0.020 test=0.010 λ=1.695 σ=1.61e-01/5.37e-13
+ Epoch 70: train=0.019 test=0.010 λ=1.635 σ=1.40e-01/1.78e-12
+ Epoch 80: train=0.018 test=0.010 λ=1.641 σ=1.37e-01/2.32e-12
+ Epoch 90: train=0.025 test=0.010 λ=1.637 σ=1.37e-01/1.13e-09
+ Epoch 100: train=0.027 test=0.010 λ=1.684 σ=1.29e-01/1.39e-09
+ Epoch 110: train=0.022 test=0.010 λ=1.779 σ=1.13e-01/1.11e-10
+ Epoch 120: train=0.022 test=0.010 λ=1.769 σ=1.08e-01/1.12e-11
+ Epoch 130: train=0.021 test=0.010 λ=1.888 σ=9.60e-02/3.75e-12
+ Epoch 140: train=0.021 test=0.010 λ=1.788 σ=1.00e-01/9.24e-12
+ Epoch 150: train=0.022 test=0.010 λ=1.799 σ=9.76e-02/4.48e-12
+ Best test acc: 0.010
+
+============================================================
+Depth = 16 conv layers (4 stages × 4 blocks)
+============================================================
+ Vanilla: depth=16, params=11,162,916
+ Epoch 10: train=0.091 test=0.011 σ=4.40e-01/1.32e-08
+ Epoch 20: train=0.133 test=0.015 σ=2.83e-01/1.07e-08
+ Epoch 30: train=0.156 test=0.018 σ=2.23e-01/9.48e-09
+ Epoch 40: train=0.177 test=0.022 σ=2.04e-01/9.14e-09
+ Epoch 50: train=0.191 test=0.024 σ=1.78e-01/8.86e-09
+ Epoch 60: train=0.203 test=0.031 σ=1.74e-01/9.04e-09
+ Epoch 70: train=0.219 test=0.026 σ=1.62e-01/8.97e-09
+ Epoch 80: train=0.229 test=0.032 σ=1.63e-01/8.94e-09
+ Epoch 90: train=0.242 test=0.031 σ=1.60e-01/9.16e-09
+ Epoch 100: train=0.251 test=0.027 σ=1.62e-01/9.14e-09
+ Epoch 110: train=0.259 test=0.032 σ=1.58e-01/9.11e-09
+ Epoch 120: train=0.264 test=0.028 σ=1.64e-01/9.10e-09
+ Epoch 130: train=0.271 test=0.029 σ=1.61e-01/9.33e-09
+ Epoch 140: train=0.272 test=0.031 σ=1.64e-01/9.34e-09
+ Epoch 150: train=0.272 test=0.028 σ=1.66e-01/9.31e-09
+ Best test acc: 0.035
+ Lyapunov: depth=16, params=11,162,916
+ Epoch 10: train=0.014 test=0.010 λ=1.722 σ=2.76e-01/4.41e-13
+ Epoch 20: train=0.010 test=0.010 λ=1.723 σ=3.64e-01/5.20e-17
+ Epoch 30: train=0.011 test=0.010 λ=1.721 σ=8.95e-02/2.45e-17
+ Epoch 40: train=0.012 test=0.010 λ=1.787 σ=1.74e-01/5.48e-14
+ Epoch 50: train=0.014 test=0.010 λ=1.672 σ=1.88e-01/1.05e-14
+ Epoch 60: train=0.011 test=0.010 λ=1.976 σ=9.53e-02/1.33e-14
+ Epoch 70: train=0.011 test=0.010 λ=1.787 σ=9.06e-02/1.54e-14
+ Epoch 80: train=0.012 test=0.011 λ=1.825 σ=1.01e-01/4.31e-14
+ Epoch 90: train=0.010 test=0.010 λ=1.829 σ=1.48e-01/4.61e-13
+ Epoch 100: train=0.010 test=0.010 λ=1.605 σ=1.04e-01/1.42e-13
+ Epoch 110: train=0.010 test=0.010 λ=1.615 σ=1.21e-01/1.69e-14
+ Epoch 120: train=0.009 test=0.010 λ=1.613 σ=1.09e-01/1.04e-14
+ Epoch 130: train=0.010 test=0.010 λ=1.604 σ=5.06e-02/2.83e-24
+ Epoch 140: train=0.010 test=0.010 λ=1.622 σ=5.64e-02/0.00e+00
+ Epoch 150: train=0.010 test=0.010 λ=1.584 σ=2.54e-02/0.00e+00
+ Best test acc: 0.014
+
+====================================================================================================
+DEPTH SCALING RESULTS: CIFAR100
+====================================================================================================
+Depth Vanilla Acc Lyapunov Acc Δ Acc Lyap λ Van ∇norm Lyap ∇norm Van κ
+----------------------------------------------------------------------------------------------------
+4 0.616 0.012 -0.603 1.882 4.59e-01 6.55e-01 2.2e+08
+8 0.520 0.010 -0.510 1.583 3.83e-01 3.29e-01 2.8e+08
+12 0.064 0.010 -0.054 1.799 6.38e-01 2.04e-01 2.3e+07
+16 0.028 0.010 -0.018 1.584 5.05e-01 3.21e-01 2.1e+07
+====================================================================================================
+
+GRADIENT HEALTH ANALYSIS:
+ Depth 4: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+ Depth 8: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+ Depth 12: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+ Depth 16: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+
+
+KEY OBSERVATIONS:
+ Vanilla 4→16 layers: -0.588 accuracy change
+ Lyapunov 4→16 layers: -0.002 accuracy change
+ ✓ Lyapunov regularization enables better depth scaling!
+
+Results saved to runs/depth_scaling_weak_reg/cifar100_20260102-133933
+============================================================
+Finished: Fri Jan 2 13:39:37 CST 2026
+============================================================
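The run above adds a squared penalty that pulls an estimated Lyapunov exponent λ toward λ_target = -0.1 with weight λ_reg = 0.01, ramped in over 20 warmup epochs, and its summary flags "ill-conditioned gradients" when a condition number κ exceeds 1e6. A minimal sketch of those two quantities, assuming a linear warmup schedule and that κ is the ratio of the largest to smallest per-layer gradient norm (the σ max/min pair printed each epoch); the function names and the warmup shape are hypothetical, not taken from the training script:

```python
def lyapunov_penalty(lyap_estimate: float, epoch: int,
                     lambda_reg: float = 0.01,
                     lambda_target: float = -0.1,
                     warmup_epochs: int = 20) -> float:
    """Squared-type penalty pulling the estimated Lyapunov exponent
    toward lambda_target. The linear warmup ramp is an assumption."""
    scale = min(epoch / warmup_epochs, 1.0)  # ramps 0 -> 1 over warmup
    excess = lyap_estimate - lambda_target
    return lambda_reg * scale * excess ** 2


def gradient_condition_number(grad_norms, eps: float = 1e-30):
    """kappa = max/min over per-layer gradient norms; the log's
    gradient-health check flags kappa > 1e6 as ill-conditioned."""
    return max(grad_norms) / max(min(grad_norms), eps)
```

Under this reading, the depth-4 epoch-10 σ pair (9.41e-01 / 3.52e-08) already gives κ well above the 1e6 threshold, consistent with every depth being flagged in the gradient-health section; the penalty itself stays small (order 1e-2) even at λ ≈ 1.9, which matches the "weak regularization" label of this experiment.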