Diffstat (limited to 'runs/slurm_logs/15112874_extreme.out')
| -rw-r--r-- | runs/slurm_logs/15112874_extreme.out | 207 |
1 file changed, 207 insertions, 0 deletions
diff --git a/runs/slurm_logs/15112874_extreme.out b/runs/slurm_logs/15112874_extreme.out
new file mode 100644
index 0000000..cea4a1b
--- /dev/null
+++ b/runs/slurm_logs/15112874_extreme.out
@@ -0,0 +1,207 @@
+============================================================
+EXTREME-ONLY PENALTY Experiment (lambda > 2.0)
+Job ID: 15112874 | Node: gpub032
+Start: Thu Jan 1 12:26:50 CST 2026
+============================================================
+NVIDIA A40, 46068 MiB
+============================================================
+================================================================================
+DEPTH SCALING BENCHMARK
+================================================================================
+Dataset: cifar100
+Depths: [4, 8, 12, 16]
+Timesteps: 4
+Epochs: 150
+λ_reg: 0.3, λ_target: -0.1
+Reg type: extreme, Warmup epochs: 10
+Device: cuda
+================================================================================
+
+Loading cifar100...
+Classes: 100, Input: (3, 32, 32)
+Train: 50000, Test: 10000
+
+Depth configurations: [(4, '4×1'), (8, '4×2'), (12, '4×3'), (16, '4×4')]
+Regularization type: extreme
+Warmup epochs: 10
+Stable init: False
+
+============================================================
+Depth = 4 conv layers (4 stages × 1 blocks)
+============================================================
+  Vanilla: depth=4, params=1,756,836
+  Epoch 10: train=0.498 test=0.436 σ=9.55e-01/3.56e-08
+  Epoch 20: train=0.629 test=0.527 σ=5.80e-01/2.40e-08
+  Epoch 30: train=0.701 test=0.546 σ=4.83e-01/2.01e-08
+  Epoch 40: train=0.756 test=0.566 σ=4.24e-01/1.76e-08
+  Epoch 50: train=0.799 test=0.566 σ=3.67e-01/1.52e-08
+  Epoch 60: train=0.832 test=0.580 σ=3.40e-01/1.41e-08
+  Epoch 70: train=0.858 test=0.563 σ=3.15e-01/1.28e-08
+  Epoch 80: train=0.883 test=0.584 σ=2.98e-01/1.18e-08
+  Epoch 90: train=0.906 test=0.595 σ=2.79e-01/1.07e-08
+  Epoch 100: train=0.920 test=0.597 σ=2.63e-01/1.02e-08
+  Epoch 110: train=0.932 test=0.612 σ=2.50e-01/9.41e-09
+  Epoch 120: train=0.941 test=0.610 σ=2.47e-01/8.94e-09
+  Epoch 130: train=0.947 test=0.614 σ=2.45e-01/9.05e-09
+  Epoch 140: train=0.949 test=0.614 σ=2.41e-01/8.80e-09
+  Epoch 150: train=0.953 test=0.615 σ=2.35e-01/8.52e-09
+  Best test acc: 0.619
+  Lyapunov: depth=4, params=1,756,836
+  Epoch 10: train=0.415 test=0.120 λ=1.974 σ=8.95e-01/3.40e-08
+  Epoch 20: train=0.551 test=0.414 λ=1.943 σ=5.70e-01/2.44e-08
+  Epoch 30: train=0.631 test=0.479 λ=1.922 σ=4.64e-01/2.05e-08
+  Epoch 40: train=0.692 test=0.394 λ=1.908 σ=4.18e-01/1.83e-08
+  Epoch 50: train=0.739 test=0.418 λ=1.909 σ=3.71e-01/1.60e-08
+  Epoch 60: train=0.780 test=0.446 λ=1.917 σ=3.56e-01/1.52e-08
+  Epoch 70: train=0.815 test=0.458 λ=1.914 σ=3.28e-01/1.36e-08
+  Epoch 80: train=0.845 test=0.480 λ=1.923 σ=3.10e-01/1.32e-08
+  Epoch 90: train=0.868 test=0.486 λ=1.919 σ=2.93e-01/1.20e-08
+  Epoch 100: train=0.887 test=0.480 λ=1.923 σ=2.79e-01/1.15e-08
+  Epoch 110: train=0.902 test=0.489 λ=1.929 σ=2.75e-01/1.08e-08
+  Epoch 120: train=0.913 test=0.467 λ=1.926 σ=2.66e-01/1.05e-08
+  Epoch 130: train=0.920 test=0.479 λ=1.931 σ=2.59e-01/1.03e-08
+  Epoch 140: train=0.924 test=0.483 λ=1.928 σ=2.51e-01/9.81e-09
+  Epoch 150: train=0.925 test=0.475 λ=1.937 σ=2.47e-01/9.86e-09
+  Best test acc: 0.508
+
+============================================================
+Depth = 8 conv layers (4 stages × 2 blocks)
+============================================================
+  Vanilla: depth=8, params=4,892,196
+  Epoch 10: train=0.390 test=0.350 σ=8.10e-01/3.04e-08
+  Epoch 20: train=0.546 test=0.435 σ=4.82e-01/2.15e-08
+  Epoch 30: train=0.632 test=0.473 σ=3.79e-01/1.78e-08
+  Epoch 40: train=0.697 test=0.513 σ=3.29e-01/1.55e-08
+  Epoch 50: train=0.752 test=0.512 σ=3.12e-01/1.42e-08
+  Epoch 60: train=0.795 test=0.520 σ=2.97e-01/1.31e-08
+  Epoch 70: train=0.836 test=0.526 σ=2.73e-01/1.18e-08
+  Epoch 80: train=0.869 test=0.533 σ=2.55e-01/1.10e-08
+  Epoch 90: train=0.897 test=0.525 σ=2.44e-01/9.77e-09
+  Epoch 100: train=0.916 test=0.530 σ=2.35e-01/9.36e-09
+  Epoch 110: train=0.933 test=0.539 σ=2.27e-01/8.60e-09
+  Epoch 120: train=0.943 test=0.537 σ=2.18e-01/8.20e-09
+  Epoch 130: train=0.952 test=0.541 σ=2.05e-01/7.82e-09
+  Epoch 140: train=0.956 test=0.538 σ=2.12e-01/7.90e-09
+  Epoch 150: train=0.956 test=0.534 σ=1.94e-01/7.48e-09
+  Best test acc: 0.547
+  Lyapunov: depth=8, params=4,892,196
+  Epoch 10: train=0.078 test=0.016 λ=1.777 σ=5.17e-01/1.78e-08
+  Epoch 20: train=0.121 test=0.016 λ=1.693 σ=2.63e-01/1.25e-08
+  Epoch 30: train=0.143 test=0.020 λ=1.696 σ=2.23e-01/1.14e-08
+  Epoch 40: train=0.147 test=0.013 λ=1.657 σ=1.86e-01/1.06e-08
+  Epoch 50: train=0.129 test=0.012 λ=1.659 σ=1.68e-01/8.87e-09
+  Epoch 60: train=0.137 test=0.011 λ=1.625 σ=1.54e-01/8.80e-09
+  Epoch 70: train=0.082 test=0.009 λ=1.589 σ=1.32e-01/6.54e-09
+  Epoch 80: train=0.127 test=0.011 λ=1.590 σ=1.42e-01/7.63e-09
+  Epoch 90: train=0.142 test=0.009 λ=1.609 σ=1.45e-01/8.25e-09
+  Epoch 100: train=0.147 test=0.012 λ=1.590 σ=1.41e-01/8.09e-09
+  Epoch 110: train=0.152 test=0.010 λ=1.598 σ=1.43e-01/8.06e-09
+  Epoch 120: train=0.156 test=0.010 λ=1.592 σ=1.40e-01/8.22e-09
+  Epoch 130: train=0.162 test=0.010 λ=1.589 σ=1.43e-01/8.35e-09
+  Epoch 140: train=0.163 test=0.010 λ=1.584 σ=1.40e-01/8.47e-09
+  Epoch 150: train=0.163 test=0.010 λ=1.583 σ=1.38e-01/8.27e-09
+  Best test acc: 0.025
+
+============================================================
+Depth = 12 conv layers (4 stages × 3 blocks)
+============================================================
+  Vanilla: depth=12, params=8,027,556
+  Epoch 10: train=0.214 test=0.048 σ=6.28e-01/2.26e-08
+  Epoch 20: train=0.293 test=0.067 σ=3.29e-01/1.56e-08
+  Epoch 30: train=0.342 test=0.086 σ=2.69e-01/1.36e-08
+  Epoch 40: train=0.383 test=0.097 σ=2.47e-01/1.30e-08
+  Epoch 50: train=0.420 test=0.080 σ=2.44e-01/1.30e-08
+  Epoch 60: train=0.451 test=0.116 σ=2.32e-01/1.26e-08
+  Epoch 70: train=0.484 test=0.102 σ=2.35e-01/1.23e-08
+  Epoch 80: train=0.518 test=0.114 σ=2.31e-01/1.26e-08
+  Epoch 90: train=0.547 test=0.118 σ=2.30e-01/1.23e-08
+  Epoch 100: train=0.576 test=0.118 σ=2.30e-01/1.23e-08
+  Epoch 110: train=0.598 test=0.114 σ=2.37e-01/1.23e-08
+  Epoch 120: train=0.619 test=0.110 σ=2.33e-01/1.22e-08
+  Epoch 130: train=0.629 test=0.121 σ=2.33e-01/1.23e-08
+  Epoch 140: train=0.637 test=0.116 σ=2.31e-01/1.20e-08
+  Epoch 150: train=0.638 test=0.116 σ=2.34e-01/1.22e-08
+  Best test acc: 0.136
+  Lyapunov: depth=12, params=8,027,556
+  Epoch 10: train=0.031 test=0.012 λ=1.794 σ=4.39e-01/1.15e-08
+  Epoch 20: train=0.028 test=0.009 λ=1.711 σ=1.99e-01/4.47e-09
+  Epoch 30: train=0.028 test=0.010 λ=1.684 σ=1.36e-01/2.73e-09
+  Epoch 40: train=0.023 test=0.006 λ=1.653 σ=1.19e-01/4.52e-12
+  Epoch 50: train=0.037 test=0.010 λ=1.668 σ=1.13e-01/2.71e-09
+  Epoch 60: train=0.029 test=0.010 λ=1.646 σ=1.11e-01/8.32e-12
+  Epoch 70: train=0.021 test=0.010 λ=1.727 σ=1.30e-01/5.03e-13
+  Epoch 80: train=0.024 test=0.010 λ=1.749 σ=1.01e-01/9.40e-13
+  Epoch 90: train=0.022 test=0.010 λ=1.665 σ=8.78e-02/9.05e-13
+  Epoch 100: train=0.022 test=0.010 λ=1.676 σ=7.62e-02/9.14e-13
+  Epoch 110: train=0.025 test=0.010 λ=1.660 σ=8.45e-02/1.40e-12
+  Epoch 120: train=0.024 test=0.010 λ=1.627 σ=8.26e-02/1.30e-12
+  Epoch 130: train=0.024 test=0.010 λ=1.663 σ=8.21e-02/7.96e-13
+  Epoch 140: train=0.028 test=0.010 λ=1.644 σ=9.22e-02/3.67e-12
+  Epoch 150: train=0.029 test=0.010 λ=1.647 σ=9.05e-02/2.90e-12
+  Best test acc: 0.014
+
+============================================================
+Depth = 16 conv layers (4 stages × 4 blocks)
+============================================================
+  Vanilla: depth=16, params=11,162,916
+  Epoch 10: train=0.091 test=0.011 σ=4.40e-01/1.32e-08
+  Epoch 20: train=0.135 test=0.014 σ=2.84e-01/1.06e-08
+  Epoch 30: train=0.157 test=0.017 σ=2.21e-01/9.39e-09
+  Epoch 40: train=0.174 test=0.021 σ=2.00e-01/9.09e-09
+  Epoch 50: train=0.190 test=0.021 σ=1.78e-01/8.83e-09
+  Epoch 60: train=0.201 test=0.023 σ=1.72e-01/8.80e-09
+  Epoch 70: train=0.214 test=0.026 σ=1.62e-01/8.89e-09
+  Epoch 80: train=0.228 test=0.025 σ=1.63e-01/8.94e-09
+  Epoch 90: train=0.238 test=0.027 σ=1.58e-01/9.07e-09
+  Epoch 100: train=0.249 test=0.025 σ=1.61e-01/9.11e-09
+  Epoch 110: train=0.256 test=0.029 σ=1.59e-01/9.10e-09
+  Epoch 120: train=0.261 test=0.027 σ=1.63e-01/9.11e-09
+  Epoch 130: train=0.270 test=0.027 σ=1.60e-01/9.22e-09
+  Epoch 140: train=0.270 test=0.027 σ=1.63e-01/9.32e-09
+  Epoch 150: train=0.272 test=0.027 σ=1.65e-01/9.27e-09
+  Best test acc: 0.033
+  Lyapunov: depth=16, params=11,162,916
+  Epoch 10: train=0.019 test=0.010 λ=1.891 σ=4.08e-01/7.96e-09
+  Epoch 20: train=0.018 test=0.010 λ=1.853 σ=1.49e-01/4.73e-11
+  Epoch 30: train=0.016 test=0.010 λ=2.038 σ=1.09e-01/1.08e-12
+  Epoch 40: train=0.016 test=0.007 λ=1.845 σ=9.66e-02/4.94e-14
+  Epoch 50: train=0.012 test=0.010 λ=1.807 σ=1.11e-01/3.35e-27
+  Epoch 60: train=0.013 test=0.009 λ=1.801 σ=1.01e-01/2.59e-28
+  Epoch 70: train=0.013 test=0.010 λ=2.064 σ=1.36e-01/9.48e-16
+  Epoch 80: train=0.020 test=0.010 λ=2.055 σ=1.11e-01/7.37e-14
+  Epoch 90: train=0.017 test=0.010 λ=1.959 σ=1.20e-01/1.56e-13
+  Epoch 100: train=0.022 test=0.010 λ=1.887 σ=1.01e-01/4.19e-13
+  Epoch 110: train=0.019 test=0.010 λ=1.881 σ=9.46e-02/4.77e-13
+  Epoch 120: train=0.018 test=0.010 λ=1.889 σ=8.10e-02/9.50e-14
+  Epoch 130: train=0.014 test=0.010 λ=1.892 σ=7.23e-02/1.42e-14
+  Epoch 140: train=0.015 test=0.010 λ=1.898 σ=7.02e-02/1.63e-14
+  Epoch 150: train=0.015 test=0.010 λ=1.899 σ=7.15e-02/1.18e-14
+  Best test acc: 0.012
+
+====================================================================================================
+DEPTH SCALING RESULTS: CIFAR100
+====================================================================================================
+Depth    Vanilla Acc    Lyapunov Acc    Δ Acc      Lyap λ     Van ∇norm    Lyap ∇norm    Van κ
+----------------------------------------------------------------------------------------------------
+4        0.615          0.475           -0.140     1.937      4.58e-01     5.38e-01      5.0e+08
+8        0.534          0.010           -0.524     1.583      3.88e-01     4.57e-01      3.6e+08
+12       0.116          0.010           -0.106     1.647      6.51e-01     2.14e-01      5.8e+08
+16       0.027          0.010           -0.017     1.899      5.07e-01     1.38e-01      3.8e+07
+====================================================================================================
+
+GRADIENT HEALTH ANALYSIS:
+  Depth 4: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+  Depth 8: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+  Depth 12: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+  Depth 16: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
+
+
+KEY OBSERVATIONS:
+  Vanilla 4→16 layers: -0.588 accuracy change
+  Lyapunov 4→16 layers: -0.465 accuracy change
+  ✓ Lyapunov regularization enables better depth scaling!
+
+Results saved to runs/depth_scaling_extreme/cifar100_20260102-133536
+============================================================
+Finished: Fri Jan 2 13:35:39 CST 2026
+============================================================
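
The regularizer itself is not shown in this log; only its settings are (λ_reg: 0.3, λ_target: -0.1, reg type "extreme" with a 2.0 threshold, 10 warmup epochs). A minimal sketch of one plausible reading, where the penalty is active only after warmup and only for exponent estimates above the threshold, pulling them toward λ_target, could look like the following. The function and argument names (extreme_only_penalty, lyap_est, lam_reg, lam_target) are hypothetical and not taken from this repository.

import torch

def extreme_only_penalty(lyap_est: torch.Tensor,
                         epoch: int,
                         lam_reg: float = 0.3,
                         lam_target: float = -0.1,
                         threshold: float = 2.0,
                         warmup_epochs: int = 10) -> torch.Tensor:
    # During warmup the regularizer contributes nothing.
    if epoch <= warmup_epochs:
        return lyap_est.new_zeros(())
    # Only "extreme" exponent estimates (above the threshold) are penalized,
    # and those are pulled toward lam_target.
    mask = (lyap_est > threshold).float()
    return lam_reg * (mask * (lyap_est - lam_target) ** 2).mean()

# Usage sketch inside the training loop:
#   loss = task_loss + extreme_only_penalty(lyap_est, epoch)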
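
The log also does not define how the reported κ is computed before flagging "ill-conditioned gradients (κ > 1e6)". Assuming, purely for illustration, that κ is a max/min ratio over per-parameter gradient norms, a sketch of such a health check might be (grad_condition_number is a hypothetical helper, not the repo's API):

import torch

def grad_condition_number(model: torch.nn.Module, eps: float = 1e-12) -> float:
    # Ratio of the largest to the smallest per-parameter gradient norm,
    # taken after loss.backward(); eps guards against a zero denominator.
    norms = [p.grad.norm().item() for p in model.parameters() if p.grad is not None]
    if not norms:
        return float("nan")
    return max(norms) / max(min(norms), eps)

# kappa = grad_condition_number(model)
# if kappa > 1e6:
#     print(f"ill-conditioned gradients (kappa = {kappa:.1e})")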
