============================================================
SMARTER TARGET Experiment (lambda_target=1.0)
Job ID: 15112872 | Node: gpub074
Start: Thu Jan 1 12:26:50 CST 2026
============================================================
NVIDIA A40, 46068 MiB
============================================================
================================================================================
DEPTH SCALING BENCHMARK
================================================================================
Dataset: cifar100
Depths: [4, 8, 12, 16]
Timesteps: 4
Epochs: 150
λ_reg: 0.1, λ_target: 1.0
Reg type: squared, Warmup epochs: 20
Device: cuda
================================================================================
Loading cifar100...
Classes: 100, Input: (3, 32, 32)
Train: 50000, Test: 10000
Depth configurations: [(4, '4×1'), (8, '4×2'), (12, '4×3'), (16, '4×4')]
Regularization type: squared
Warmup epochs: 20
Stable init: False
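The regularizer itself is not shown in this log. Below is a minimal Python sketch of one plausible form consistent with the settings above (squared penalty toward λ_target=1.0, weight λ_reg=0.1, 20 warmup epochs); the function name, the `lyap_exponent` argument, and the linear warmup schedule are assumptions, not taken from the experiment code.

def lyapunov_regularized_loss(task_loss, lyap_exponent, epoch,
                              lambda_reg=0.1, lambda_target=1.0,
                              warmup_epochs=20, reg_type="squared"):
    """Sketch (hypothetical helper): task loss plus a warmed-up penalty
    pulling the estimated Lyapunov exponent toward lambda_target."""
    # Assumed linear warmup: the penalty weight ramps from 0 to lambda_reg
    # over the first `warmup_epochs` epochs.
    warmup = min(1.0, epoch / warmup_epochs)
    if reg_type == "squared":
        # "squared" reg type: quadratic deviation from the target exponent.
        penalty = (lyap_exponent - lambda_target) ** 2
    else:
        # e.g. an absolute-deviation variant.
        penalty = abs(lyap_exponent - lambda_target)
    return task_loss + warmup * lambda_reg * penalty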
============================================================
Depth = 4 conv layers (4 stages × 1 blocks)
============================================================
Vanilla: depth=4, params=1,756,836
Epoch 10: train=0.498 test=0.436 σ=9.42e-01/3.51e-08
Epoch 20: train=0.629 test=0.499 σ=5.87e-01/2.46e-08
Epoch 30: train=0.701 test=0.550 σ=4.77e-01/2.01e-08
Epoch 40: train=0.753 test=0.545 σ=4.17e-01/1.75e-08
Epoch 50: train=0.798 test=0.577 σ=3.69e-01/1.53e-08
Epoch 60: train=0.830 test=0.582 σ=3.33e-01/1.39e-08
Epoch 70: train=0.862 test=0.580 σ=3.29e-01/1.26e-08
Epoch 80: train=0.884 test=0.578 σ=3.07e-01/1.21e-08
Epoch 90: train=0.907 test=0.598 σ=2.79e-01/1.08e-08
Epoch 100: train=0.921 test=0.608 σ=2.72e-01/1.06e-08
Epoch 110: train=0.935 test=0.608 σ=2.54e-01/9.36e-09
Epoch 120: train=0.943 test=0.608 σ=2.46e-01/9.12e-09
Epoch 130: train=0.949 test=0.615 σ=2.34e-01/8.83e-09
Epoch 140: train=0.951 test=0.610 σ=2.32e-01/8.64e-09
Epoch 150: train=0.954 test=0.614 σ=2.32e-01/8.63e-09
Best test acc: 0.615
Lyapunov: depth=4, params=1,756,836
Epoch 10: train=0.285 test=0.024 λ=1.483 σ=7.62e-01/2.92e-08
Epoch 20: train=0.344 test=0.014 λ=1.560 σ=4.80e-01/2.01e-08
Epoch 30: train=0.348 test=0.012 λ=1.679 σ=3.85e-01/1.69e-08
Epoch 40: train=0.389 test=0.013 λ=1.635 σ=3.33e-01/1.54e-08
Epoch 50: train=0.431 test=0.011 λ=1.635 σ=3.11e-01/1.45e-08
Epoch 60: train=0.461 test=0.016 λ=1.622 σ=2.95e-01/1.44e-08
Epoch 70: train=0.478 test=0.014 λ=1.660 σ=2.90e-01/1.40e-08
Epoch 80: train=0.499 test=0.013 λ=1.657 σ=2.82e-01/1.40e-08
Epoch 90: train=0.522 test=0.013 λ=1.663 σ=2.78e-01/1.36e-08
Epoch 100: train=0.537 test=0.015 λ=1.678 σ=2.78e-01/1.36e-08
Epoch 110: train=0.550 test=0.014 λ=1.684 σ=2.92e-01/1.38e-08
Epoch 120: train=0.559 test=0.016 λ=1.704 σ=2.90e-01/1.42e-08
Epoch 130: train=0.570 test=0.018 λ=1.709 σ=2.79e-01/1.36e-08
Epoch 140: train=0.571 test=0.017 λ=1.865 σ=2.83e-01/1.37e-08
Epoch 150: train=0.576 test=0.017 λ=1.816 σ=2.82e-01/1.37e-08
Best test acc: 0.212
============================================================
Depth = 8 conv layers (4 stages × 2 blocks)
============================================================
Vanilla: depth=8, params=4,892,196
Epoch 10: train=0.388 test=0.351 σ=8.72e-01/3.16e-08
Epoch 20: train=0.544 test=0.427 σ=4.73e-01/2.16e-08
Epoch 30: train=0.630 test=0.466 σ=3.81e-01/1.79e-08
Epoch 40: train=0.698 test=0.502 σ=3.22e-01/1.53e-08
Epoch 50: train=0.747 test=0.519 σ=3.08e-01/1.42e-08
Epoch 60: train=0.799 test=0.515 σ=2.87e-01/1.29e-08
Epoch 70: train=0.836 test=0.524 σ=2.76e-01/1.18e-08
Epoch 80: train=0.869 test=0.534 σ=2.44e-01/1.05e-08
Epoch 90: train=0.898 test=0.528 σ=2.39e-01/9.52e-09
Epoch 100: train=0.918 test=0.527 σ=2.29e-01/8.96e-09
Epoch 110: train=0.933 test=0.542 σ=2.26e-01/8.58e-09
Epoch 120: train=0.943 test=0.542 σ=2.09e-01/7.88e-09
Epoch 130: train=0.951 test=0.545 σ=1.97e-01/7.80e-09
Epoch 140: train=0.955 test=0.542 σ=2.06e-01/7.61e-09
Epoch 150: train=0.954 test=0.535 σ=1.94e-01/7.46e-09
Best test acc: 0.550
Lyapunov: depth=8, params=4,892,196
Epoch 10: train=0.035 test=0.010 λ=1.583 σ=3.25e-01/9.17e-09
Epoch 20: train=0.049 test=0.010 λ=1.574 σ=2.46e-01/6.77e-09
Epoch 30: train=0.061 test=0.010 λ=1.571 σ=2.03e-01/5.87e-09
Epoch 40: train=0.033 test=0.010 λ=1.544 σ=1.80e-01/2.89e-09
Epoch 50: train=0.030 test=0.010 λ=1.550 σ=1.59e-01/9.89e-10
Epoch 60: train=0.030 test=0.010 λ=1.567 σ=1.39e-01/5.26e-10
Epoch 70: train=0.029 test=0.010 λ=1.571 σ=1.16e-01/1.53e-10
Epoch 80: train=0.041 test=0.010 λ=1.646 σ=1.41e-01/3.16e-09
Epoch 90: train=0.036 test=0.010 λ=1.808 σ=1.37e-01/2.76e-09
Epoch 100: train=0.031 test=0.010 λ=1.940 σ=1.71e-01/2.97e-09
Epoch 110: train=0.047 test=0.010 λ=1.976 σ=1.42e-01/3.22e-09
Epoch 120: train=0.047 test=0.008 λ=1.993 σ=1.26e-01/3.43e-09
Epoch 130: train=0.046 test=0.010 λ=2.057 σ=1.50e-01/3.50e-09
Epoch 140: train=0.026 test=0.010 λ=2.014 σ=2.43e-01/3.40e-09
Epoch 150: train=0.031 test=0.010 λ=2.334 σ=1.30e-01/4.11e-09
Best test acc: 0.024
============================================================
Depth = 12 conv layers (4 stages × 3 blocks)
============================================================
Vanilla: depth=12, params=8,027,556
Epoch 10: train=0.214 test=0.070 σ=5.69e-01/2.19e-08
Epoch 20: train=0.289 test=0.057 σ=3.32e-01/1.63e-08
Epoch 30: train=0.340 test=0.107 σ=2.62e-01/1.36e-08
Epoch 40: train=0.374 test=0.083 σ=2.35e-01/1.28e-08
Epoch 50: train=0.410 test=0.073 σ=2.28e-01/1.25e-08
Epoch 60: train=0.436 test=0.101 σ=2.23e-01/1.23e-08
Epoch 70: train=0.473 test=0.087 σ=2.33e-01/1.22e-08
Epoch 80: train=0.505 test=0.083 σ=2.21e-01/1.22e-08
Epoch 90: train=0.534 test=0.090 σ=2.24e-01/1.21e-08
Epoch 100: train=0.561 test=0.096 σ=2.29e-01/1.23e-08
Epoch 110: train=0.584 test=0.074 σ=2.30e-01/1.21e-08
Epoch 120: train=0.602 test=0.088 σ=2.35e-01/1.22e-08
Epoch 130: train=0.609 test=0.093 σ=2.30e-01/1.20e-08
Epoch 140: train=0.620 test=0.094 σ=2.27e-01/1.19e-08
Epoch 150: train=0.624 test=0.086 σ=2.31e-01/1.22e-08
Best test acc: 0.109
Lyapunov: depth=12, params=8,027,556
Epoch 10: train=0.013 test=0.010 λ=1.639 σ=3.64e-01/1.06e-12
Epoch 20: train=0.015 test=0.010 λ=1.598 σ=2.98e-01/1.14e-12
Epoch 30: train=0.019 test=0.010 λ=1.630 σ=3.30e-01/3.22e-12
Epoch 40: train=0.021 test=0.010 λ=1.592 σ=1.82e-01/2.19e-12
Epoch 50: train=0.020 test=0.010 λ=1.658 σ=1.51e-01/2.96e-12
Epoch 60: train=0.015 test=0.010 λ=1.616 σ=1.03e-01/2.55e-13
Epoch 70: train=0.018 test=0.010 λ=1.617 σ=1.18e-01/4.53e-13
Epoch 80: train=0.020 test=0.010 λ=1.636 σ=1.22e-01/4.69e-12
Epoch 90: train=0.021 test=0.010 λ=1.593 σ=1.05e-01/7.58e-12
Epoch 100: train=0.026 test=0.010 λ=1.593 σ=1.16e-01/7.00e-10
Epoch 110: train=0.021 test=0.010 λ=1.590 σ=9.46e-02/4.97e-12
Epoch 120: train=0.024 test=0.010 λ=1.740 σ=9.83e-02/3.89e-11
Epoch 130: train=0.020 test=0.010 λ=1.901 σ=1.09e-01/9.81e-11
Epoch 140: train=0.027 test=0.010 λ=1.972 σ=1.21e-01/1.96e-09
Epoch 150: train=0.019 test=0.010 λ=2.112 σ=6.82e-02/1.40e-11
Best test acc: 0.019
============================================================
Depth = 16 conv layers (4 stages × 4 blocks)
============================================================
Vanilla: depth=16, params=11,162,916
Epoch 10: train=0.091 test=0.011 σ=4.40e-01/1.32e-08
Epoch 20: train=0.135 test=0.014 σ=2.84e-01/1.06e-08
Epoch 30: train=0.157 test=0.017 σ=2.21e-01/9.39e-09
Epoch 40: train=0.177 test=0.019 σ=1.93e-01/8.97e-09
Epoch 50: train=0.191 test=0.024 σ=1.81e-01/9.00e-09
Epoch 60: train=0.202 test=0.024 σ=1.74e-01/8.89e-09
Epoch 70: train=0.215 test=0.026 σ=1.66e-01/8.97e-09
Epoch 80: train=0.227 test=0.030 σ=1.65e-01/8.89e-09
Epoch 90: train=0.240 test=0.026 σ=1.55e-01/8.93e-09
Epoch 100: train=0.248 test=0.028 σ=1.61e-01/9.09e-09
Epoch 110: train=0.254 test=0.031 σ=1.59e-01/9.18e-09
Epoch 120: train=0.262 test=0.033 σ=1.64e-01/9.30e-09
Epoch 130: train=0.267 test=0.033 σ=1.58e-01/9.22e-09
Epoch 140: train=0.269 test=0.030 σ=1.61e-01/9.32e-09
Epoch 150: train=0.269 test=0.030 σ=1.64e-01/9.32e-09
Best test acc: 0.036
Lyapunov: depth=16, params=11,162,916
Epoch 10: train=0.014 test=0.010 λ=1.688 σ=3.64e-01/6.52e-13
Epoch 20: train=0.010 test=0.010 λ=1.803 σ=3.56e-01/1.53e-13
Epoch 30: train=0.012 test=0.002 λ=1.664 σ=4.44e-01/4.25e-14
Epoch 40: train=0.018 test=0.010 λ=1.733 σ=2.04e-01/9.43e-13
Epoch 50: train=0.010 test=0.010 λ=1.598 σ=1.55e-01/0.00e+00
Epoch 60: train=0.010 test=0.010 λ=1.594 σ=4.41e-02/0.00e+00
Epoch 70: train=0.009 test=0.010 λ=1.601 σ=7.50e-02/1.52e-14
Epoch 80: train=0.012 test=0.010 λ=2.219 σ=5.78e-02/1.54e-42
Epoch 90: train=0.016 test=0.010 λ=2.122 σ=9.91e-02/9.49e-15
Epoch 100: train=0.017 test=0.010 λ=2.163 σ=1.05e-01/7.08e-13
Epoch 110: train=0.020 test=0.010 λ=2.124 σ=1.10e-01/1.17e-12
Epoch 120: train=0.010 test=0.010 λ=2.181 σ=6.50e-02/6.09e-15
Epoch 130: train=0.010 test=0.010 λ=2.755 σ=6.77e-09/1.45e-20
Epoch 140: train=0.016 test=0.010 λ=2.217 σ=1.13e-01/5.35e-14
Epoch 150: train=0.018 test=0.010 λ=2.219 σ=1.21e-01/7.28e-14
Best test acc: 0.012
====================================================================================================
DEPTH SCALING RESULTS: CIFAR100
====================================================================================================
Depth  Vanilla Acc   Lyapunov Acc    Δ Acc   Lyap λ   Van ∇norm   Lyap ∇norm     Van κ
----------------------------------------------------------------------------------------------------
    4        0.614          0.017   -0.597    1.816    4.52e-01     7.31e-01   1.1e+09
    8        0.535          0.010   -0.525    2.334    3.87e-01     3.44e-01   8.0e+08
   12        0.086          0.010   -0.076    2.112    6.47e-01     2.28e+00   4.9e+07
   16        0.030          0.010   -0.020    2.219    5.07e-01     9.41e-01   2.1e+07
====================================================================================================
GRADIENT HEALTH ANALYSIS:
Depth 4: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
Depth 8: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
Depth 12: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
Depth 16: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
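The log does not show how κ is computed. The sketch below assumes one common proxy, the ratio of the largest to the smallest per-parameter gradient norm after a backward pass; under that assumption, values above 1e6 would be flagged exactly as in the warnings above. The helper name `gradient_condition_proxy` is hypothetical.

import torch

def gradient_condition_proxy(model: torch.nn.Module, eps: float = 1e-30) -> float:
    # Collect the L2 norm of each parameter's gradient (skip parameters
    # that received no gradient in this backward pass).
    norms = [p.grad.norm().item() for p in model.parameters()
             if p.grad is not None]
    if not norms:
        return float("nan")
    # Rough condition proxy: largest gradient norm over smallest.
    return max(norms) / max(min(norms), eps)

# Usage after loss.backward():
#   kappa = gradient_condition_proxy(model)
#   if kappa > 1e6:
#       print(f"ill-conditioned gradients (kappa = {kappa:.1e})")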
KEY OBSERVATIONS:
Vanilla 4→16 layers: -0.584 accuracy change
Lyapunov 4→16 layers: -0.007 accuracy change
✓ Lyapunov regularization enables better depth scaling!
Results saved to runs/depth_scaling_target1/cifar100_20260102-133339
============================================================
Finished: Fri Jan 2 13:33:43 CST 2026
============================================================