============================================================
CIFAR-10 Depth Scaling Benchmark
Job ID: 14363509 | Node: gpub039
Start: Mon Dec 29 00:06:15 CST 2025
============================================================
NVIDIA A40, 46068 MiB
============================================================
================================================================================
DEPTH SCALING BENCHMARK
================================================================================
Dataset: cifar10
Depths: [4, 8, 12, 16]
Timesteps: 4
Epochs: 100
λ_reg: 0.3, λ_target: -0.1
Device: cuda
================================================================================
Loading cifar10...
Classes: 10, Input: (3, 32, 32)
Train: 50000, Test: 10000
Depth configurations: [(4, '4×1'), (8, '4×2'), (12, '4×3'), (16, '4×4')]
============================================================
Depth = 4 conv layers (4 stages × 1 blocks)
============================================================
Vanilla: depth=4, params=1,572,426
Epoch 10: train=0.766 test=0.684 σ=1.77e+00/1.12e-07
Epoch 20: train=0.845 test=0.769 σ=8.13e-01/6.93e-08
Epoch 30: train=0.885 test=0.795 σ=5.04e-01/5.04e-08
Epoch 40: train=0.913 test=0.830 σ=3.73e-01/4.29e-08
Epoch 50: train=0.936 test=0.851 σ=3.15e-01/3.97e-08
Epoch 60: train=0.952 test=0.854 σ=2.80e-01/3.70e-08
Epoch 70: train=0.967 test=0.867 σ=2.29e-01/3.56e-08
Epoch 80: train=0.973 test=0.869 σ=2.24e-01/3.51e-08
Epoch 90: train=0.979 test=0.870 σ=2.04e-01/3.44e-08
Epoch 100: train=0.979 test=0.872 σ=1.99e-01/3.30e-08
Best test acc: 0.873
Lyapunov: depth=4, params=1,572,426
Epoch 10: train=0.098 test=0.100 λ=1.928 σ=7.83e-02/8.44e-10
Epoch 20: train=0.097 test=0.100 λ=1.919
Epoch 30: train=0.098 test=0.100 λ=1.918 σ=3.72e-03/2.28e-14
Epoch 40: train=0.098 test=0.100 λ=1.922
Epoch 50: train=0.098 test=0.100 λ=1.923
Epoch 60: train=0.097 test=0.100 λ=1.920
Epoch 70: train=0.097 test=0.100 λ=1.920
Epoch 80: train=0.098 test=0.100 λ=1.920
Epoch 90: train=0.098 test=0.100 λ=1.920
Epoch 100: train=0.100 test=0.100 λ=1.920
Best test acc: 0.275
============================================================
Depth = 8 conv layers (4 stages × 2 blocks)
============================================================
Vanilla: depth=8, params=4,707,786
Epoch 10: train=0.709 test=0.682 σ=1.97e+00/1.14e-07
Epoch 20: train=0.815 test=0.752 σ=5.91e-01/5.57e-08
Epoch 30: train=0.867 test=0.811 σ=3.68e-01/4.03e-08
Epoch 40: train=0.900 test=0.815 σ=2.64e-01/3.40e-08
Epoch 50: train=0.925 test=0.840 σ=2.22e-01/2.97e-08
Epoch 60: train=0.948 test=0.834 σ=2.00e-01/2.76e-08
Epoch 70: train=0.964 test=0.831 σ=1.76e-01/2.67e-08
Epoch 80: train=0.974 test=0.839 σ=1.47e-01/2.56e-08
Epoch 90: train=0.980 test=0.848 σ=1.40e-01/2.45e-08
Epoch 100: train=0.981 test=0.850 σ=1.47e-01/2.44e-08
Best test acc: 0.851
Lyapunov: depth=8, params=4,707,786
Epoch 10: train=0.128 test=0.100 λ=2.363 σ=1.04e-01/3.70e-09
Epoch 20: train=0.097 test=0.100 λ=2.262 σ=4.82e-03/7.30e-36
Epoch 30: train=0.099 test=0.100 λ=2.268
Epoch 40: train=0.099 test=0.100 λ=2.260
Epoch 50: train=0.096 test=0.100 λ=2.261
Epoch 60: train=0.097 test=0.100 λ=2.263
Epoch 70: train=0.099 test=0.100 λ=2.262
Epoch 80: train=0.096 test=0.100 λ=2.261
Epoch 90: train=0.097 test=0.100 λ=2.260
Epoch 100: train=0.100 test=0.100 λ=2.261
Best test acc: 0.212
============================================================
Depth = 12 conv layers (4 stages × 3 blocks)
============================================================
Vanilla: depth=12, params=7,843,146
Epoch 10: train=0.508 test=0.243 σ=9.35e-01/6.03e-08
Epoch 20: train=0.586 test=0.276 σ=4.45e-01/3.35e-08
Epoch 30: train=0.639 test=0.353 σ=3.15e-01/2.46e-08
Epoch 40: train=0.672 test=0.365 σ=2.83e-01/2.38e-08
Epoch 50: train=0.699 test=0.459 σ=2.73e-01/2.35e-08
Epoch 60: train=0.727 test=0.490 σ=2.61e-01/2.37e-08
Epoch 70: train=0.747 test=0.499 σ=2.44e-01/2.33e-08
Epoch 80: train=0.764 test=0.492 σ=2.47e-01/2.36e-08
Epoch 90: train=0.774 test=0.462 σ=2.50e-01/2.30e-08
Epoch 100: train=0.775 test=0.490 σ=2.31e-01/2.28e-08
Best test acc: 0.499
Lyapunov: depth=12, params=7,843,146
Epoch 10: train=0.110 test=0.100 λ=3.021 σ=2.68e-01/5.79e-09
Epoch 20: train=0.098 test=0.100 λ=2.464
Epoch 30: train=0.097 test=0.100 λ=2.484
Epoch 40: train=0.098 test=0.100 λ=2.470
Epoch 50: train=0.096 test=0.100 λ=2.463
Epoch 60: train=0.098 test=0.100 λ=2.480
Epoch 70: train=0.097 test=0.100 λ=2.468
Epoch 80: train=0.096 test=0.100 λ=2.463
Epoch 90: train=0.099 test=0.100 λ=2.467
Epoch 100: train=0.100 test=0.100 λ=2.463
Best test acc: 0.108
============================================================
Depth = 16 conv layers (4 stages × 4 blocks)
============================================================
Vanilla: depth=16, params=10,978,506
Epoch 10: train=0.308 test=0.100 σ=3.38e+00/1.23e-07
Epoch 20: train=0.367 test=0.107 σ=2.47e+00/9.15e-08
Epoch 30: train=0.402 test=0.105 σ=2.20e+00/8.56e-08
Epoch 40: train=0.427 test=0.103 σ=1.93e+00/7.65e-08
Epoch 50: train=0.448 test=0.108 σ=1.57e+00/6.91e-08
Epoch 60: train=0.461 test=0.105 σ=1.43e+00/6.00e-08
Epoch 70: train=0.473 test=0.105 σ=1.17e+00/5.32e-08
Epoch 80: train=0.482 test=0.106 σ=1.18e+00/5.38e-08
Epoch 90: train=0.487 test=0.109 σ=1.18e+00/5.38e-08
Epoch 100: train=0.487 test=0.106 σ=1.09e+00/5.24e-08
Best test acc: 0.120
Lyapunov: depth=16, params=10,978,506
Epoch 10: train=0.120 test=0.100 λ=2.810 σ=7.74e-01/1.07e-08
Epoch 20: train=0.104 test=0.100 λ=2.748 σ=5.73e-02/5.65e-12
Epoch 30: train=0.098 test=0.100 λ=2.608 σ=2.81e-03/0.00e+00
Epoch 40: train=0.098 test=0.100 λ=2.605
Epoch 50: train=0.098 test=0.100 λ=2.609
Epoch 60: train=0.097 test=0.100 λ=2.618
Epoch 70: train=0.096 test=0.100 λ=2.615
Epoch 80: train=0.099 test=0.100 λ=2.606
Epoch 90: train=0.096 test=0.100 λ=2.604
Epoch 100: train=0.100 test=0.100 λ=2.602
Best test acc: 0.113
====================================================================================================
DEPTH SCALING RESULTS: CIFAR10
====================================================================================================
Depth   Vanilla Acc   Lyapunov Acc   Δ Acc    Lyap λ   Van ∇norm   Lyap ∇norm   Van κ
----------------------------------------------------------------------------------------------------
4       0.872         0.100          -0.772   1.920    2.75e-01    8.22e-02     6.1e+06
8       0.850         0.100          -0.750   2.261    1.94e-01    8.25e-02     6.1e+06
12      0.490         0.100          -0.390   2.463    4.09e-01    8.02e-02     1.0e+07
16      0.106         0.100          -0.006   2.602    1.28e+00    8.22e-02     2.1e+07
====================================================================================================
GRADIENT HEALTH ANALYSIS:
Depth 4: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
Depth 8: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
Depth 12: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
Depth 16: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
KEY OBSERVATIONS:
Vanilla 4→16 layers: -0.767 accuracy change
Lyapunov 4→16 layers: +0.000 accuracy change
✓ Lyapunov regularization enables better depth scaling!
Results saved to runs/depth_scaling/cifar10_20251229-160504
============================================================
Finished: Mon Dec 29 16:05:07 CST 2025
============================================================