Diffstat (limited to 'runs/slurm_logs/14632851_hinge.out')
| -rw-r--r-- | runs/slurm_logs/14632851_hinge.out | 206 |
1 files changed, 206 insertions, 0 deletions
diff --git a/runs/slurm_logs/14632851_hinge.out b/runs/slurm_logs/14632851_hinge.out
new file mode 100644
index 0000000..74e3d48
--- /dev/null
+++ b/runs/slurm_logs/14632851_hinge.out
@@ -0,0 +1,206 @@
+============================================================
+HINGE LOSS Lyapunov Regularization
+Job ID: 14632851 | Node: gpub050
+Start: Wed Dec 31 10:16:37 CST 2025
+============================================================
+NVIDIA A40, 46068 MiB
+============================================================
+================================================================================
+DEPTH SCALING BENCHMARK
+================================================================================
+Dataset: cifar100
+Depths: [4, 8, 12, 16]
+Timesteps: 4
+Epochs: 150
+λ_reg: 0.3, λ_target: -0.1
+Reg type: hinge, Warmup epochs: 20
+Device: cuda
+================================================================================
+
+Loading cifar100...
+Classes: 100, Input: (3, 32, 32)
+Train: 50000, Test: 10000
+
+Depth configurations: [(4, '4×1'), (8, '4×2'), (12, '4×3'), (16, '4×4')]
+Regularization type: hinge
+Warmup epochs: 20
+
+============================================================
+Depth = 4 conv layers (4 stages × 1 blocks)
+============================================================
+ Vanilla: depth=4, params=1,756,836
+ Epoch 10: train=0.491 test=0.381 σ=9.60e-01/3.58e-08
+ Epoch 20: train=0.629 test=0.483 σ=5.82e-01/2.43e-08
+ Epoch 30: train=0.705 test=0.553 σ=4.88e-01/2.04e-08
+ Epoch 40: train=0.754 test=0.564 σ=4.23e-01/1.75e-08
+ Epoch 50: train=0.797 test=0.572 σ=3.67e-01/1.54e-08
+ Epoch 60: train=0.830 test=0.585 σ=3.46e-01/1.43e-08
+ Epoch 70: train=0.861 test=0.591 σ=3.17e-01/1.26e-08
+ Epoch 80: train=0.883 test=0.600 σ=2.94e-01/1.17e-08
+ Epoch 90: train=0.904 test=0.603 σ=2.84e-01/1.10e-08
+ Epoch 100: train=0.920 test=0.607 σ=2.68e-01/9.90e-09
+ Epoch 110: train=0.933 test=0.615 σ=2.64e-01/9.87e-09
+ Epoch 120: train=0.941 test=0.610 σ=2.47e-01/9.35e-09
+ Epoch 130: train=0.947 test=0.615 σ=2.40e-01/8.75e-09
+ Epoch 140: train=0.949 test=0.613 σ=2.43e-01/8.69e-09
+ Epoch 150: train=0.950 test=0.612 σ=2.42e-01/8.41e-09
+ Best test acc: 0.618
+ Lyapunov: depth=4, params=1,756,836
+ Epoch 10: train=0.061 test=0.010 λ=1.562 σ=5.93e-01/1.80e-08
+ Epoch 20: train=0.010 test=0.010 λ=1.431 σ=2.01e-01/4.78e-11
+ Epoch 30: train=0.009 test=0.010 λ=1.441 σ=4.39e-02/0.00e+00
+ Epoch 40: train=0.010 test=0.010 λ=1.460 σ=2.30e-02/0.00e+00
+ Epoch 50: train=0.009 test=0.010 λ=1.466 σ=2.12e-02/0.00e+00
+ Epoch 60: train=0.009 test=0.010 λ=1.473 σ=1.82e-02/0.00e+00
+ Epoch 70: train=0.010 test=0.010 λ=1.478
+ Epoch 80: train=0.009 test=0.010 λ=1.485
+ Epoch 90: train=0.009 test=0.010 λ=1.480
+ Epoch 100: train=0.009 test=0.010 λ=1.486
+ Epoch 110: train=0.009 test=0.010 λ=1.480
+ Epoch 120: train=0.009 test=0.010 λ=1.484
+ Epoch 130: train=0.009 test=0.010 λ=1.482
+ Epoch 140: train=0.009 test=0.010 λ=1.482
+ Epoch 150: train=0.010 test=0.010 λ=1.483
+ Best test acc: 0.086
+
+============================================================
+Depth = 8 conv layers (4 stages × 2 blocks)
+============================================================
+ Vanilla: depth=8, params=4,892,196
+ Epoch 10: train=0.387 test=0.372 σ=8.41e-01/3.07e-08
+ Epoch 20: train=0.548 test=0.442 σ=4.66e-01/2.12e-08
+ Epoch 30: train=0.636 test=0.479 σ=3.74e-01/1.77e-08
+ Epoch 40: train=0.701 test=0.507 σ=3.24e-01/1.54e-08
+ Epoch 50: train=0.752 test=0.501 σ=3.10e-01/1.40e-08
+ Epoch 60: train=0.797 test=0.517 σ=2.80e-01/1.21e-08
+ Epoch 70: train=0.839 test=0.512 σ=2.65e-01/1.14e-08
+ Epoch 80: train=0.870 test=0.517 σ=2.50e-01/1.05e-08
+ Epoch 90: train=0.892 test=0.518 σ=2.40e-01/9.80e-09
+ Epoch 100: train=0.916 test=0.521 σ=2.29e-01/9.12e-09
+ Epoch 110: train=0.933 test=0.529 σ=2.20e-01/8.14e-09
+ Epoch 120: train=0.945 test=0.538 σ=2.10e-01/7.94e-09
+ Epoch 130: train=0.952 test=0.530 σ=2.06e-01/7.90e-09
+ Epoch 140: train=0.955 test=0.533 σ=2.04e-01/7.30e-09
+ Epoch 150: train=0.956 test=0.519 σ=2.03e-01/7.35e-09
+ Best test acc: 0.539
+ Lyapunov: depth=8, params=4,892,196
+ Epoch 10: train=0.032 test=0.010 λ=1.539 σ=3.41e-01/8.08e-09
+ Epoch 20: train=0.023 test=0.010 λ=1.554 σ=2.32e-01/2.95e-09
+ Epoch 30: train=0.010 test=0.010 λ=1.511 σ=2.01e-01/3.19e-10
+ Epoch 40: train=0.010 test=0.010 λ=1.525 σ=1.13e-01/7.10e-15
+ Epoch 50: train=0.010 test=0.010 λ=1.533 σ=7.91e-02/2.05e-32
+ Epoch 60: train=0.009 test=0.010 λ=1.537 σ=5.72e-02/0.00e+00
+ Epoch 70: train=0.010 test=0.010 λ=1.540 σ=3.47e-02/0.00e+00
+ Epoch 80: train=0.009 test=0.010 λ=1.542
+ Epoch 90: train=0.009 test=0.010 λ=1.542 σ=2.50e-02/0.00e+00
+ Epoch 100: train=0.009 test=0.010 λ=1.545 σ=1.48e-03/0.00e+00
+ Epoch 110: train=0.009 test=0.010 λ=1.542
+ Epoch 120: train=0.009 test=0.010 λ=1.544
+ Epoch 130: train=0.009 test=0.010 λ=1.542
+ Epoch 140: train=0.010 test=0.010 λ=1.542
+ Epoch 150: train=0.010 test=0.010 λ=1.542
+ Best test acc: 0.028
+
+============================================================
+Depth = 12 conv layers (4 stages × 3 blocks)
+============================================================
+ Vanilla: depth=12, params=8,027,556
+ Epoch 10: train=0.212 test=0.049 σ=6.27e-01/2.25e-08
+ Epoch 20: train=0.289 test=0.046 σ=3.31e-01/1.57e-08
+ Epoch 30: train=0.335 test=0.066 σ=2.68e-01/1.36e-08
+ Epoch 40: train=0.371 test=0.053 σ=2.37e-01/1.26e-08
+ Epoch 50: train=0.404 test=0.039 σ=2.25e-01/1.22e-08
+ Epoch 60: train=0.432 test=0.060 σ=2.23e-01/1.21e-08
+ Epoch 70: train=0.464 test=0.054 σ=2.30e-01/1.19e-08
+ Epoch 80: train=0.499 test=0.057 σ=2.25e-01/1.22e-08
+ Epoch 90: train=0.524 test=0.056 σ=2.22e-01/1.20e-08
+ Epoch 100: train=0.390 test=0.088 σ=2.28e-01/1.26e-08
+ Epoch 110: train=0.502 test=0.043 σ=2.24e-01/1.19e-08
+ Epoch 120: train=0.532 test=0.043 σ=2.29e-01/1.22e-08
+ Epoch 130: train=0.548 test=0.046 σ=2.26e-01/1.20e-08
+ Epoch 140: train=0.558 test=0.048 σ=2.28e-01/1.20e-08
+ Epoch 150: train=0.558 test=0.042 σ=2.25e-01/1.21e-08
+ Best test acc: 0.115
+ Lyapunov: depth=12, params=8,027,556
+ Epoch 10: train=0.010 test=0.010 λ=1.637 σ=6.08e-02/1.12e-13
+ Epoch 20: train=0.009 test=0.010 λ=1.549 σ=2.84e-01/2.99e-09
+ Epoch 30: train=0.010 test=0.010 λ=1.558 σ=1.03e-01/3.35e-15
+ Epoch 40: train=0.010 test=0.010 λ=1.562 σ=1.67e-01/8.35e-10
+ Epoch 50: train=0.010 test=0.010 λ=1.565 σ=5.77e-02/7.39e-40
+ Epoch 60: train=0.010 test=0.010 λ=1.567 σ=3.33e-02/4.04e-19
+ Epoch 70: train=0.010 test=0.010 λ=1.573 σ=5.17e-02/0.00e+00
+ Epoch 80: train=0.009 test=0.010 λ=1.568 σ=2.22e-02/0.00e+00
+ Epoch 90: train=0.009 test=0.010 λ=1.571 σ=5.06e-03/0.00e+00
+ Epoch 100: train=0.009 test=0.010 λ=1.574
+ Epoch 110: train=0.009 test=0.010 λ=1.568
+ Epoch 120: train=0.010 test=0.010 λ=1.568
+ Epoch 130: train=0.010 test=0.010 λ=1.569
+ Epoch 140: train=0.010 test=0.010 λ=1.569
+ Epoch 150: train=0.010 test=0.010 λ=1.567
+ Best test acc: 0.013
+
+============================================================
+Depth = 16 conv layers (4 stages × 4 blocks)
+============================================================
+ Vanilla: depth=16, params=11,162,916
+ Epoch 10: train=0.091 test=0.011 σ=4.40e-01/1.32e-08
+ Epoch 20: train=0.134 test=0.013 σ=2.85e-01/1.08e-08
+ Epoch 30: train=0.157 test=0.022 σ=2.23e-01/9.44e-09
+ Epoch 40: train=0.178 test=0.025 σ=2.01e-01/8.99e-09
+ Epoch 50: train=0.188 test=0.022 σ=1.84e-01/8.93e-09
+ Epoch 60: train=0.201 test=0.027 σ=1.72e-01/8.67e-09
+ Epoch 70: train=0.218 test=0.024 σ=1.61e-01/8.82e-09
+ Epoch 80: train=0.227 test=0.025 σ=1.64e-01/8.80e-09
+ Epoch 90: train=0.238 test=0.026 σ=1.57e-01/8.92e-09
+ Epoch 100: train=0.249 test=0.026 σ=1.61e-01/9.00e-09
+ Epoch 110: train=0.259 test=0.030 σ=1.58e-01/9.12e-09
+ Epoch 120: train=0.263 test=0.028 σ=1.63e-01/9.20e-09
+ Epoch 130: train=0.268 test=0.029 σ=1.59e-01/9.22e-09
+ Epoch 140: train=0.272 test=0.029 σ=1.62e-01/9.16e-09
+ Epoch 150: train=0.271 test=0.029 σ=1.66e-01/9.15e-09
+ Best test acc: 0.032
+ Lyapunov: depth=16, params=11,162,916
+ Epoch 10: train=0.010 test=0.010 λ=1.686 σ=4.16e-01/5.36e-09
+ Epoch 20: train=0.009 test=0.010 λ=1.575 σ=3.25e-01/3.25e-09
+ Epoch 30: train=0.009 test=0.010 λ=1.582 σ=7.66e-04/0.00e+00
+ Epoch 40: train=0.010 test=0.010 λ=1.584 σ=8.77e-02/3.12e-15
+ Epoch 50: train=0.010 test=0.010 λ=1.595 σ=4.93e-02/2.68e-33
+ Epoch 60: train=0.009 test=0.010 λ=1.589 σ=2.75e-02/0.00e+00
+ Epoch 70: train=0.010 test=0.010 λ=1.589 σ=4.38e-02/0.00e+00
+ Epoch 80: train=0.009 test=0.010 λ=1.585 σ=2.15e-02/0.00e+00
+ Epoch 90: train=0.009 test=0.010 λ=1.585 σ=3.10e-02/0.00e+00
+ Epoch 100: train=0.010 test=0.010 λ=1.585 σ=3.20e-02/0.00e+00
+ Epoch 110: train=0.010 test=0.010 λ=1.584 σ=1.03e-02/0.00e+00
+ Epoch 120: train=0.009 test=0.010 λ=1.587
+ Epoch 130: train=0.009 test=0.010 λ=1.586
+ Epoch 140: train=0.010 test=0.010 λ=1.585
+ Epoch 150: train=0.010 test=0.010 λ=1.585
+ Best test acc: 0.012
+
+====================================================================================================
+DEPTH SCALING RESULTS: CIFAR100
+====================================================================================================
+Depth   Vanilla Acc   Lyapunov Acc   Δ Acc    Lyap λ   Van ∇norm   Lyap ∇norm   Van κ
+----------------------------------------------------------------------------------------------------
+4       0.612         0.010          -0.602   1.483    4.68e-01    8.81e-02     4.9e+08
+8       0.519         0.010          -0.509   1.542    3.79e-01    1.49e-01     1.5e+09
+12      0.042         0.010          -0.032   1.567    6.45e-01    8.82e-02     3.5e+07
+16      0.029         0.010          -0.019   1.585    5.06e-01    1.78e-01     3.6e+08
+====================================================================================================
+
+GRADIENT HEALTH ANALYSIS:
+ Depth 4: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+ Depth 8: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+ Depth 12: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+ Depth 16: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+
+
+KEY OBSERVATIONS:
+ Vanilla 4→16 layers: -0.583 accuracy change
+ Lyapunov 4→16 layers: +0.000 accuracy change
+ ✓ Lyapunov regularization enables better depth scaling!
+
+Results saved to runs/depth_scaling_hinge/cifar100_20260101-112306
+============================================================
+Finished: Thu Jan 1 11:23:09 CST 2026
+============================================================
