============================================================
STABLE INITIALIZATION Experiment
Job ID: 15112873 | Node: gpub011
Start: Thu Jan 1 12:26:50 CST 2026
============================================================
NVIDIA A40, 46068 MiB
============================================================

================================================================================
DEPTH SCALING BENCHMARK
================================================================================
Dataset: cifar100
Depths: [4, 8, 12, 16]
Timesteps: 4
Epochs: 150
λ_reg: 0.1, λ_target: -0.1
Reg type: squared, Warmup epochs: 20
Device: cuda
================================================================================
Loading cifar100...
Classes: 100, Input: (3, 32, 32)
Train: 50000, Test: 10000
Depth configurations: [(4, '4×1'), (8, '4×2'), (12, '4×3'), (16, '4×4')]
Regularization type: squared
Warmup epochs: 20
Stable init: True

============================================================
Depth = 4 conv layers (4 stages × 1 blocks)
============================================================

Vanilla: depth=4, params=1,756,836
  Epoch  10: train=0.516 test=0.431 σ=9.10e-01/3.50e-08
  Epoch  20: train=0.640 test=0.517 σ=5.84e-01/2.47e-08
  Epoch  30: train=0.712 test=0.558 σ=4.82e-01/2.04e-08
  Epoch  40: train=0.761 test=0.558 σ=4.07e-01/1.72e-08
  Epoch  50: train=0.800 test=0.577 σ=3.76e-01/1.54e-08
  Epoch  60: train=0.837 test=0.581 σ=3.34e-01/1.38e-08
  Epoch  70: train=0.864 test=0.579 σ=3.25e-01/1.29e-08
  Epoch  80: train=0.888 test=0.592 σ=2.91e-01/1.17e-08
  Epoch  90: train=0.907 test=0.602 σ=2.89e-01/1.10e-08
  Epoch 100: train=0.921 test=0.604 σ=2.71e-01/1.05e-08
  Epoch 110: train=0.935 test=0.606 σ=2.64e-01/9.89e-09
  Epoch 120: train=0.943 test=0.617 σ=2.46e-01/9.41e-09
  Epoch 130: train=0.950 test=0.615 σ=2.45e-01/8.86e-09
  Epoch 140: train=0.951 test=0.615 σ=2.29e-01/8.67e-09
  Epoch 150: train=0.953 test=0.615 σ=2.36e-01/8.51e-09
  Best test acc: 0.620

Lyapunov: depth=4, params=1,756,836
  Epoch  10: train=0.195 test=0.014 λ=1.549 σ=7.07e-01/2.57e-08
  Epoch  20: train=0.135 test=0.012 λ=1.570 σ=4.14e-01/1.49e-08
  Epoch  30: train=0.057 test=0.010 λ=1.488 σ=2.46e-01/6.99e-09
  Epoch  40: train=0.067 test=0.010 λ=1.481 σ=2.03e-01/6.20e-09
  Epoch  50: train=0.048 test=0.010 λ=1.877 σ=1.80e-01/4.00e-09
  Epoch  60: train=0.009 test=0.010 λ=1.462 σ=4.58e-02/0.00e+00
  Epoch  70: train=0.010 test=0.010 λ=1.467 σ=3.57e-02/0.00e+00
  Epoch  80: train=0.010 test=0.010 λ=1.471 σ=1.33e-02/0.00e+00
  Epoch  90: train=0.009 test=0.010 λ=1.471 σ=4.82e-03/0.00e+00
  Epoch 100: train=0.009 test=0.010 λ=1.471 σ=1.18e-03/0.00e+00
  Epoch 110: train=0.009 test=0.010 λ=1.471 σ=4.32e-03/0.00e+00
  Epoch 120: train=0.009 test=0.010 λ=1.472
  Epoch 130: train=0.010 test=0.010 λ=1.472
  Epoch 140: train=0.010 test=0.010 λ=1.471
  Epoch 150: train=0.010 test=0.010 λ=1.473
  Best test acc: 0.106
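[Editor's note] The header reports λ_reg: 0.1, λ_target: -0.1, a "squared" regularization type, and 20 warmup epochs, but the training code itself is not part of this log. The snippet below is only a minimal sketch of what such a squared Lyapunov penalty with a linear warmup ramp could look like; the function name lyapunov_penalty, the argument names, and the ramp schedule are assumptions, not the run's actual implementation.

    import torch

    def lyapunov_penalty(lam_est: torch.Tensor,
                         lam_target: float = -0.1,
                         lam_reg: float = 0.1,
                         epoch: int = 0,
                         warmup_epochs: int = 20) -> torch.Tensor:
        """Squared penalty pulling an estimated Lyapunov exponent toward lam_target.

        The penalty weight is ramped linearly over the warmup epochs so the
        classification loss dominates early training.
        """
        warmup_scale = min(1.0, (epoch + 1) / warmup_epochs)  # linear ramp 0 -> 1
        return lam_reg * warmup_scale * (lam_est - lam_target) ** 2

    # hypothetical usage inside the training loop:
    #   loss = ce_loss + lyapunov_penalty(lam_est, epoch=epoch)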
============================================================
Depth = 8 conv layers (4 stages × 2 blocks)
============================================================

Vanilla: depth=8, params=4,892,196
  Epoch  10: train=0.451 test=0.402 σ=7.31e-01/2.99e-08
  Epoch  20: train=0.587 test=0.471 σ=4.69e-01/2.12e-08
  Epoch  30: train=0.666 test=0.493 σ=3.81e-01/1.75e-08
  Epoch  40: train=0.728 test=0.505 σ=3.27e-01/1.53e-08
  Epoch  50: train=0.774 test=0.533 σ=3.18e-01/1.40e-08
  Epoch  60: train=0.812 test=0.521 σ=2.93e-01/1.28e-08
  Epoch  70: train=0.852 test=0.547 σ=2.81e-01/1.17e-08
  Epoch  80: train=0.884 test=0.531 σ=2.48e-01/1.02e-08
  Epoch  90: train=0.906 test=0.537 σ=2.35e-01/9.46e-09
  Epoch 100: train=0.927 test=0.553 σ=2.24e-01/8.84e-09
  Epoch 110: train=0.941 test=0.552 σ=2.09e-01/8.04e-09
  Epoch 120: train=0.951 test=0.553 σ=2.09e-01/7.55e-09
  Epoch 130: train=0.959 test=0.553 σ=2.10e-01/7.39e-09
  Epoch 140: train=0.959 test=0.561 σ=1.95e-01/7.19e-09
  Epoch 150: train=0.961 test=0.551 σ=1.94e-01/6.97e-09
  Best test acc: 0.564

Lyapunov: depth=8, params=4,892,196
  Epoch  10: train=0.046 test=0.010 λ=1.543 σ=3.90e-01/9.92e-09
  Epoch  20: train=0.038 test=0.010 λ=1.533 σ=2.42e-01/4.88e-09
  Epoch  30: train=0.038 test=0.010 λ=1.623 σ=1.93e-01/3.39e-09
  Epoch  40: train=0.028 test=0.010 λ=1.706 σ=1.66e-01/2.06e-09
  Epoch  50: train=0.009 test=0.010 λ=1.532 σ=7.89e-02/1.54e-17
  Epoch  60: train=0.010 test=0.010 λ=1.540 σ=4.28e-02/5.11e-27
  Epoch  70: train=0.009 test=0.010 λ=1.544 σ=4.22e-02/0.00e+00
  Epoch  80: train=0.010 test=0.010 λ=1.548 σ=3.81e-02/0.00e+00
  Epoch  90: train=0.011 test=0.010 λ=1.554 σ=3.03e-02/0.00e+00
  Epoch 100: train=0.010 test=0.010 λ=1.549 σ=9.40e-03/0.00e+00
  Epoch 110: train=0.010 test=0.010 λ=1.549 σ=5.91e-03/0.00e+00
  Epoch 120: train=0.010 test=0.010 λ=1.548 σ=3.83e-03/0.00e+00
  Epoch 130: train=0.010 test=0.010 λ=1.549 σ=7.81e-03/0.00e+00
  Epoch 140: train=0.010 test=0.010 λ=1.549 σ=1.37e-02/0.00e+00
  Epoch 150: train=0.010 test=0.010 λ=1.546 σ=8.69e-03/0.00e+00
  Best test acc: 0.021

============================================================
Depth = 12 conv layers (4 stages × 3 blocks)
============================================================

Vanilla: depth=12, params=8,027,556
  Epoch  10: train=0.253 test=0.046 σ=4.96e-01/2.03e-08
  Epoch  20: train=0.322 test=0.044 σ=3.35e-01/1.58e-08
  Epoch  30: train=0.364 test=0.054 σ=2.77e-01/1.38e-08
  Epoch  40: train=0.404 test=0.046 σ=2.49e-01/1.30e-08
  Epoch  50: train=0.439 test=0.062 σ=2.30e-01/1.24e-08
  Epoch  60: train=0.469 test=0.040 σ=2.30e-01/1.24e-08
  Epoch  70: train=0.498 test=0.054 σ=2.35e-01/1.21e-08
  Epoch  80: train=0.532 test=0.058 σ=2.26e-01/1.20e-08
  Epoch  90: train=0.565 test=0.072 σ=2.26e-01/1.18e-08
  Epoch 100: train=0.276 test=0.099 σ=1.92e-01/1.10e-08
  Epoch 110: train=0.409 test=0.123 σ=2.13e-01/1.20e-08
  Epoch 120: train=0.470 test=0.124 σ=2.27e-01/1.20e-08
  Epoch 130: train=0.495 test=0.146 σ=2.19e-01/1.22e-08
  Epoch 140: train=0.510 test=0.138 σ=2.15e-01/1.17e-08
  Epoch 150: train=0.512 test=0.118 σ=2.18e-01/1.17e-08
  Best test acc: 0.146

Lyapunov: depth=12, params=8,027,556
  Epoch  10: train=0.011 test=0.010 λ=1.563 σ=5.46e-01/7.17e-09
  Epoch  20: train=0.010 test=0.010 λ=1.556 σ=8.74e-02/8.70e-15
  Epoch  30: train=0.010 test=0.010 λ=1.554 σ=9.58e-02/3.05e-15
  Epoch  40: train=0.009 test=0.010 λ=1.566 σ=6.06e-02/2.31e-34
  Epoch  50: train=0.010 test=0.010 λ=1.566 σ=3.46e-02/0.00e+00
  Epoch  60: train=0.009 test=0.010 λ=1.573 σ=4.50e-02/0.00e+00
  Epoch  70: train=0.010 test=0.010 λ=1.572 σ=1.34e-02/0.00e+00
  Epoch  80: train=0.009 test=0.010 λ=1.575 σ=6.32e-04/0.00e+00
  Epoch  90: train=0.009 test=0.010 λ=1.576 σ=5.51e-02/0.00e+00
  Epoch 100: train=0.010 test=0.010 λ=1.579 σ=2.74e-02/0.00e+00
  Epoch 110: train=0.009 test=0.010 λ=1.575 σ=2.56e-02/0.00e+00
  Epoch 120: train=0.010 test=0.010 λ=1.576 σ=3.61e-02/0.00e+00
  Epoch 130: train=0.010 test=0.010 λ=1.576
  Epoch 140: train=0.010 test=0.010 λ=1.574 σ=5.40e-03/0.00e+00
  Epoch 150: train=0.010 test=0.010 λ=1.569
  Best test acc: 0.011
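[Editor's note] The per-epoch λ values in the Lyapunov runs are printed without showing how they are estimated; with "Timesteps: 4" in the header, one common approach is a finite-difference estimate that propagates a small perturbation through the timestep map and averages the log expansion rate. The sketch below is a generic version of that idea under those assumptions; step_fn, eps, and the renormalization scheme are hypothetical, not the run's actual estimator.

    import torch

    @torch.no_grad()
    def estimate_lyapunov(step_fn, h0: torch.Tensor, timesteps: int = 4,
                          eps: float = 1e-3) -> float:
        """Finite-difference estimate of a largest Lyapunov-style exponent.

        Propagates a reference state and a perturbed copy through `step_fn`,
        renormalizing the perturbation each step and accumulating log growth.
        """
        delta = torch.randn_like(h0)
        delta = eps * delta / delta.norm()
        h, h_pert = h0.clone(), h0 + delta
        log_growth = 0.0
        for _ in range(timesteps):
            h, h_pert = step_fn(h), step_fn(h_pert)
            dist = (h_pert - h).norm()
            log_growth += torch.log(dist / eps).item()
            # keep the perturbation small so the estimate stays in the linear regime
            h_pert = h + eps * (h_pert - h) / (dist + 1e-12)
        return log_growth / timesteps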
============================================================
Depth = 16 conv layers (4 stages × 4 blocks)
============================================================

Vanilla: depth=16, params=11,162,916
  Epoch  10: train=0.120 test=0.020 σ=4.06e-01/1.45e-08
  Epoch  20: train=0.158 test=0.011 σ=2.71e-01/1.13e-08
  Epoch  30: train=0.182 test=0.016 σ=2.16e-01/1.00e-08
  Epoch  40: train=0.203 test=0.029 σ=2.01e-01/9.74e-09
  Epoch  50: train=0.220 test=0.025 σ=1.83e-01/9.59e-09
  Epoch  60: train=0.237 test=0.025 σ=1.78e-01/9.64e-09
  Epoch  70: train=0.250 test=0.029 σ=1.67e-01/9.64e-09
  Epoch  80: train=0.259 test=0.026 σ=1.65e-01/9.31e-09
  Epoch  90: train=0.273 test=0.022 σ=1.63e-01/9.65e-09
  Epoch 100: train=0.229 test=0.019 σ=1.52e-01/9.12e-09
  Epoch 110: train=0.256 test=0.024 σ=1.54e-01/9.41e-09
  Epoch 120: train=0.266 test=0.025 σ=1.60e-01/9.49e-09
  Epoch 130: train=0.277 test=0.025 σ=1.57e-01/9.48e-09
  Epoch 140: train=0.283 test=0.025 σ=1.61e-01/9.66e-09
  Epoch 150: train=0.283 test=0.024 σ=1.63e-01/9.63e-09
  Best test acc: 0.036

Lyapunov: depth=16, params=11,162,916
  Epoch  10: train=0.011 test=0.010 λ=1.695 σ=3.65e-01/1.28e-13
  Epoch  20: train=0.011 test=0.010 λ=1.668 σ=3.46e-01/1.58e-14
  Epoch  30: train=0.011 test=0.010 λ=1.632 σ=1.93e-01/2.02e-20
  Epoch  40: train=0.009 test=0.010 λ=1.610 σ=2.17e-01/1.62e-12
  Epoch  50: train=0.010 test=0.010 λ=1.620 σ=1.54e-01/1.56e-15
  Epoch  60: train=0.011 test=0.010 λ=1.621 σ=5.15e-02/0.00e+00
  Epoch  70: train=0.009 test=0.010 λ=1.606 σ=1.16e-02/0.00e+00
  Epoch  80: train=0.009 test=0.010 λ=1.605 σ=1.80e-02/0.00e+00
  Epoch  90: train=0.009 test=0.010 λ=1.609
  Epoch 100: train=0.009 test=0.010 λ=1.618 σ=5.85e-04/0.00e+00
  Epoch 110: train=0.009 test=0.010 λ=1.610 σ=5.90e-04/0.00e+00
  Epoch 120: train=0.009 test=0.010 λ=1.608
  Epoch 130: train=0.009 test=0.010 λ=1.603
  Epoch 140: train=0.010 test=0.010 λ=1.606
  Epoch 150: train=0.010 test=0.010 λ=1.596
  Best test acc: 0.016

====================================================================================================
DEPTH SCALING RESULTS: CIFAR100
====================================================================================================
Depth   Vanilla Acc   Lyapunov Acc    Δ Acc   Lyap λ   Van ∇norm   Lyap ∇norm     Van κ
----------------------------------------------------------------------------------------------------
    4         0.615          0.010   -0.605    1.473    4.63e-01     8.84e-02   3.7e+08
    8         0.551          0.010   -0.541    1.546    3.64e-01     1.64e-01   2.7e+08
   12         0.118          0.010   -0.108    1.569    6.43e-01     6.98e-01   4.1e+07
   16         0.024          0.010   -0.014    1.596    5.19e-01     3.22e-01   2.7e+07
====================================================================================================

GRADIENT HEALTH ANALYSIS:
  Depth  4: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
  Depth  8: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
  Depth 12: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
  Depth 16: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)

KEY OBSERVATIONS:
  Vanilla 4→16 layers: -0.591 accuracy change
  Lyapunov 4→16 layers: +0.000 accuracy change
  ✓ Lyapunov regularization enables better depth scaling!

Results saved to runs/depth_scaling_stable_init/cifar100_20260102-133755

============================================================
Finished: Fri Jan 2 13:37:56 CST 2026
============================================================
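[Editor's note] The log flags κ > 1e6 in the gradient health analysis but does not define κ or the two σ values printed per epoch. A plausible reading, consistent with the Van κ column being roughly the ratio of a large to a tiny gradient statistic, is the ratio of the largest to the smallest per-parameter-tensor gradient norm. The sketch below computes that quantity under this assumption; the function name and the 1e6 threshold comment are illustrative only.

    import torch

    def gradient_condition_number(model: torch.nn.Module) -> float:
        """Ratio of largest to smallest per-parameter-tensor gradient norm.

        A very large ratio (e.g. > 1e6) suggests some layers receive vanishing
        gradients relative to others, i.e. ill-conditioned gradient flow.
        """
        norms = [p.grad.norm().item() for p in model.parameters()
                 if p.grad is not None]
        if not norms:
            return float("nan")
        smallest = max(min(norms), 1e-30)  # guard against exact zeros
        return max(norms) / smallest

    # hypothetical usage, after loss.backward():
    #   kappa = gradient_condition_number(model)
    #   if kappa > 1e6:
    #       print(f"ill-conditioned gradients (κ = {kappa:.1e})")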