============================================================
HINGE LOSS Lyapunov Regularization
Job ID: 14632851 | Node: gpub050
Start: Wed Dec 31 10:16:37 CST 2025
============================================================
NVIDIA A40, 46068 MiB
============================================================
================================================================================
DEPTH SCALING BENCHMARK
================================================================================
Dataset: cifar100
Depths: [4, 8, 12, 16]
Timesteps: 4
Epochs: 150
λ_reg: 0.3, λ_target: -0.1
Reg type: hinge, Warmup epochs: 20
Device: cuda
================================================================================
Loading cifar100...
Classes: 100, Input: (3, 32, 32)
Train: 50000, Test: 10000
Depth configurations: [(4, '4×1'), (8, '4×2'), (12, '4×3'), (16, '4×4')]
Regularization type: hinge
Warmup epochs: 20
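
A minimal sketch of how a hinge-type Lyapunov penalty with these settings (λ_reg = 0.3, λ_target = -0.1, 20 warmup epochs, 4 timesteps) could be wired into the training loss. The function names, the perturbation-based exponent estimate, and the linear warmup ramp are assumptions for illustration, not the script's actual implementation:

import torch

def estimate_lyapunov(step_fn, h0, timesteps=4, eps=1e-3):
    # Finite-time Lyapunov exponent estimate: average log growth rate of a
    # small perturbation pushed through `timesteps` applications of step_fn,
    # with Benettin-style renormalization so the perturbation stays small.
    d = torch.randn_like(h0).flatten(1)
    delta = (eps * d / d.norm(dim=1, keepdim=True)).view_as(h0)
    log_growth = h0.new_zeros(h0.shape[0])
    h = h0
    for _ in range(timesteps):
        h_next = step_fn(h)
        diff = (step_fn(h + delta) - h_next).flatten(1)
        norm = diff.norm(dim=1, keepdim=True).clamp_min(1e-12)
        log_growth = log_growth + torch.log(norm.squeeze(1) / eps)
        delta = (eps * diff / norm).view_as(h0)
        h = h_next
    return log_growth / timesteps  # per-sample λ estimate

def hinge_lyapunov_loss(task_loss, lam_est, epoch, lam_reg=0.3,
                        lam_target=-0.1, warmup_epochs=20):
    # Hinge penalty: only exponents above λ_target are penalized, and the
    # weight is ramped in linearly over the warmup epochs (assumed schedule).
    ramp = min(1.0, epoch / max(1, warmup_epochs))
    return task_loss + lam_reg * ramp * torch.relu(lam_est - lam_target).mean()

Under this reading, λ_target = -0.1 pushes the estimated exponent toward contractive dynamics over the 4 timesteps; the λ values logged per epoch below stay well above that target, so the hinge term would remain active throughout training.
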
============================================================
Depth = 4 conv layers (4 stages × 1 blocks)
============================================================
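
The "4 stages × N blocks" naming suggests a stage-wise CNN in which each stage stacks N conv blocks and downsamples between stages, so total conv depth is 4·N (the 4/8/12/16 sweep). A minimal sketch under that assumption, ignoring the 4-timestep iteration; the layer choices, widths, and name make_stagewise_cnn are illustrative, so parameter counts will not match the figures reported below:

import torch.nn as nn

def make_stagewise_cnn(blocks_per_stage=1, num_classes=100, in_ch=3, width=64):
    # Four stages; each stage holds `blocks_per_stage` conv-BN-ReLU blocks,
    # doubles the channel width, and ends with a 2x downsample.
    layers, ch = [], in_ch
    for stage in range(4):
        out_ch = width * (2 ** stage)
        for _ in range(blocks_per_stage):
            layers += [nn.Conv2d(ch, out_ch, kernel_size=3, padding=1),
                       nn.BatchNorm2d(out_ch),
                       nn.ReLU(inplace=True)]
            ch = out_ch
        layers.append(nn.MaxPool2d(2))
    layers += [nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(ch, num_classes)]
    return nn.Sequential(*layers)

# blocks_per_stage = 1, 2, 3, 4 gives the depth-4/8/12/16 configurations above.
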
Vanilla: depth=4, params=1,756,836
Epoch 10: train=0.491 test=0.381 σ=9.60e-01/3.58e-08
Epoch 20: train=0.629 test=0.483 σ=5.82e-01/2.43e-08
Epoch 30: train=0.705 test=0.553 σ=4.88e-01/2.04e-08
Epoch 40: train=0.754 test=0.564 σ=4.23e-01/1.75e-08
Epoch 50: train=0.797 test=0.572 σ=3.67e-01/1.54e-08
Epoch 60: train=0.830 test=0.585 σ=3.46e-01/1.43e-08
Epoch 70: train=0.861 test=0.591 σ=3.17e-01/1.26e-08
Epoch 80: train=0.883 test=0.600 σ=2.94e-01/1.17e-08
Epoch 90: train=0.904 test=0.603 σ=2.84e-01/1.10e-08
Epoch 100: train=0.920 test=0.607 σ=2.68e-01/9.90e-09
Epoch 110: train=0.933 test=0.615 σ=2.64e-01/9.87e-09
Epoch 120: train=0.941 test=0.610 σ=2.47e-01/9.35e-09
Epoch 130: train=0.947 test=0.615 σ=2.40e-01/8.75e-09
Epoch 140: train=0.949 test=0.613 σ=2.43e-01/8.69e-09
Epoch 150: train=0.950 test=0.612 σ=2.42e-01/8.41e-09
Best test acc: 0.618
Lyapunov: depth=4, params=1,756,836
Epoch 10: train=0.061 test=0.010 λ=1.562 σ=5.93e-01/1.80e-08
Epoch 20: train=0.010 test=0.010 λ=1.431 σ=2.01e-01/4.78e-11
Epoch 30: train=0.009 test=0.010 λ=1.441 σ=4.39e-02/0.00e+00
Epoch 40: train=0.010 test=0.010 λ=1.460 σ=2.30e-02/0.00e+00
Epoch 50: train=0.009 test=0.010 λ=1.466 σ=2.12e-02/0.00e+00
Epoch 60: train=0.009 test=0.010 λ=1.473 σ=1.82e-02/0.00e+00
Epoch 70: train=0.010 test=0.010 λ=1.478
Epoch 80: train=0.009 test=0.010 λ=1.485
Epoch 90: train=0.009 test=0.010 λ=1.480
Epoch 100: train=0.009 test=0.010 λ=1.486
Epoch 110: train=0.009 test=0.010 λ=1.480
Epoch 120: train=0.009 test=0.010 λ=1.484
Epoch 130: train=0.009 test=0.010 λ=1.482
Epoch 140: train=0.009 test=0.010 λ=1.482
Epoch 150: train=0.010 test=0.010 λ=1.483
Best test acc: 0.086
============================================================
Depth = 8 conv layers (4 stages × 2 blocks)
============================================================
Vanilla: depth=8, params=4,892,196
Epoch 10: train=0.387 test=0.372 σ=8.41e-01/3.07e-08
Epoch 20: train=0.548 test=0.442 σ=4.66e-01/2.12e-08
Epoch 30: train=0.636 test=0.479 σ=3.74e-01/1.77e-08
Epoch 40: train=0.701 test=0.507 σ=3.24e-01/1.54e-08
Epoch 50: train=0.752 test=0.501 σ=3.10e-01/1.40e-08
Epoch 60: train=0.797 test=0.517 σ=2.80e-01/1.21e-08
Epoch 70: train=0.839 test=0.512 σ=2.65e-01/1.14e-08
Epoch 80: train=0.870 test=0.517 σ=2.50e-01/1.05e-08
Epoch 90: train=0.892 test=0.518 σ=2.40e-01/9.80e-09
Epoch 100: train=0.916 test=0.521 σ=2.29e-01/9.12e-09
Epoch 110: train=0.933 test=0.529 σ=2.20e-01/8.14e-09
Epoch 120: train=0.945 test=0.538 σ=2.10e-01/7.94e-09
Epoch 130: train=0.952 test=0.530 σ=2.06e-01/7.90e-09
Epoch 140: train=0.955 test=0.533 σ=2.04e-01/7.30e-09
Epoch 150: train=0.956 test=0.519 σ=2.03e-01/7.35e-09
Best test acc: 0.539
Lyapunov: depth=8, params=4,892,196
Epoch 10: train=0.032 test=0.010 λ=1.539 σ=3.41e-01/8.08e-09
Epoch 20: train=0.023 test=0.010 λ=1.554 σ=2.32e-01/2.95e-09
Epoch 30: train=0.010 test=0.010 λ=1.511 σ=2.01e-01/3.19e-10
Epoch 40: train=0.010 test=0.010 λ=1.525 σ=1.13e-01/7.10e-15
Epoch 50: train=0.010 test=0.010 λ=1.533 σ=7.91e-02/2.05e-32
Epoch 60: train=0.009 test=0.010 λ=1.537 σ=5.72e-02/0.00e+00
Epoch 70: train=0.010 test=0.010 λ=1.540 σ=3.47e-02/0.00e+00
Epoch 80: train=0.009 test=0.010 λ=1.542
Epoch 90: train=0.009 test=0.010 λ=1.542 σ=2.50e-02/0.00e+00
Epoch 100: train=0.009 test=0.010 λ=1.545 σ=1.48e-03/0.00e+00
Epoch 110: train=0.009 test=0.010 λ=1.542
Epoch 120: train=0.009 test=0.010 λ=1.544
Epoch 130: train=0.009 test=0.010 λ=1.542
Epoch 140: train=0.010 test=0.010 λ=1.542
Epoch 150: train=0.010 test=0.010 λ=1.542
Best test acc: 0.028
============================================================
Depth = 12 conv layers (4 stages × 3 blocks)
============================================================
Vanilla: depth=12, params=8,027,556
Epoch 10: train=0.212 test=0.049 σ=6.27e-01/2.25e-08
Epoch 20: train=0.289 test=0.046 σ=3.31e-01/1.57e-08
Epoch 30: train=0.335 test=0.066 σ=2.68e-01/1.36e-08
Epoch 40: train=0.371 test=0.053 σ=2.37e-01/1.26e-08
Epoch 50: train=0.404 test=0.039 σ=2.25e-01/1.22e-08
Epoch 60: train=0.432 test=0.060 σ=2.23e-01/1.21e-08
Epoch 70: train=0.464 test=0.054 σ=2.30e-01/1.19e-08
Epoch 80: train=0.499 test=0.057 σ=2.25e-01/1.22e-08
Epoch 90: train=0.524 test=0.056 σ=2.22e-01/1.20e-08
Epoch 100: train=0.390 test=0.088 σ=2.28e-01/1.26e-08
Epoch 110: train=0.502 test=0.043 σ=2.24e-01/1.19e-08
Epoch 120: train=0.532 test=0.043 σ=2.29e-01/1.22e-08
Epoch 130: train=0.548 test=0.046 σ=2.26e-01/1.20e-08
Epoch 140: train=0.558 test=0.048 σ=2.28e-01/1.20e-08
Epoch 150: train=0.558 test=0.042 σ=2.25e-01/1.21e-08
Best test acc: 0.115
Lyapunov: depth=12, params=8,027,556
Epoch 10: train=0.010 test=0.010 λ=1.637 σ=6.08e-02/1.12e-13
Epoch 20: train=0.009 test=0.010 λ=1.549 σ=2.84e-01/2.99e-09
Epoch 30: train=0.010 test=0.010 λ=1.558 σ=1.03e-01/3.35e-15
Epoch 40: train=0.010 test=0.010 λ=1.562 σ=1.67e-01/8.35e-10
Epoch 50: train=0.010 test=0.010 λ=1.565 σ=5.77e-02/7.39e-40
Epoch 60: train=0.010 test=0.010 λ=1.567 σ=3.33e-02/4.04e-19
Epoch 70: train=0.010 test=0.010 λ=1.573 σ=5.17e-02/0.00e+00
Epoch 80: train=0.009 test=0.010 λ=1.568 σ=2.22e-02/0.00e+00
Epoch 90: train=0.009 test=0.010 λ=1.571 σ=5.06e-03/0.00e+00
Epoch 100: train=0.009 test=0.010 λ=1.574
Epoch 110: train=0.009 test=0.010 λ=1.568
Epoch 120: train=0.010 test=0.010 λ=1.568
Epoch 130: train=0.010 test=0.010 λ=1.569
Epoch 140: train=0.010 test=0.010 λ=1.569
Epoch 150: train=0.010 test=0.010 λ=1.567
Best test acc: 0.013
============================================================
Depth = 16 conv layers (4 stages × 4 blocks)
============================================================
Vanilla: depth=16, params=11,162,916
Epoch 10: train=0.091 test=0.011 σ=4.40e-01/1.32e-08
Epoch 20: train=0.134 test=0.013 σ=2.85e-01/1.08e-08
Epoch 30: train=0.157 test=0.022 σ=2.23e-01/9.44e-09
Epoch 40: train=0.178 test=0.025 σ=2.01e-01/8.99e-09
Epoch 50: train=0.188 test=0.022 σ=1.84e-01/8.93e-09
Epoch 60: train=0.201 test=0.027 σ=1.72e-01/8.67e-09
Epoch 70: train=0.218 test=0.024 σ=1.61e-01/8.82e-09
Epoch 80: train=0.227 test=0.025 σ=1.64e-01/8.80e-09
Epoch 90: train=0.238 test=0.026 σ=1.57e-01/8.92e-09
Epoch 100: train=0.249 test=0.026 σ=1.61e-01/9.00e-09
Epoch 110: train=0.259 test=0.030 σ=1.58e-01/9.12e-09
Epoch 120: train=0.263 test=0.028 σ=1.63e-01/9.20e-09
Epoch 130: train=0.268 test=0.029 σ=1.59e-01/9.22e-09
Epoch 140: train=0.272 test=0.029 σ=1.62e-01/9.16e-09
Epoch 150: train=0.271 test=0.029 σ=1.66e-01/9.15e-09
Best test acc: 0.032
Lyapunov: depth=16, params=11,162,916
Epoch 10: train=0.010 test=0.010 λ=1.686 σ=4.16e-01/5.36e-09
Epoch 20: train=0.009 test=0.010 λ=1.575 σ=3.25e-01/3.25e-09
Epoch 30: train=0.009 test=0.010 λ=1.582 σ=7.66e-04/0.00e+00
Epoch 40: train=0.010 test=0.010 λ=1.584 σ=8.77e-02/3.12e-15
Epoch 50: train=0.010 test=0.010 λ=1.595 σ=4.93e-02/2.68e-33
Epoch 60: train=0.009 test=0.010 λ=1.589 σ=2.75e-02/0.00e+00
Epoch 70: train=0.010 test=0.010 λ=1.589 σ=4.38e-02/0.00e+00
Epoch 80: train=0.009 test=0.010 λ=1.585 σ=2.15e-02/0.00e+00
Epoch 90: train=0.009 test=0.010 λ=1.585 σ=3.10e-02/0.00e+00
Epoch 100: train=0.010 test=0.010 λ=1.585 σ=3.20e-02/0.00e+00
Epoch 110: train=0.010 test=0.010 λ=1.584 σ=1.03e-02/0.00e+00
Epoch 120: train=0.009 test=0.010 λ=1.587
Epoch 130: train=0.009 test=0.010 λ=1.586
Epoch 140: train=0.010 test=0.010 λ=1.585
Epoch 150: train=0.010 test=0.010 λ=1.585
Best test acc: 0.012
====================================================================================================
DEPTH SCALING RESULTS: CIFAR100
====================================================================================================
Depth Vanilla Acc Lyapunov Acc Δ Acc Lyap λ Van ∇norm Lyap ∇norm Van κ
----------------------------------------------------------------------------------------------------
4 0.612 0.010 -0.602 1.483 4.68e-01 8.81e-02 4.9e+08
8 0.519 0.010 -0.509 1.542 3.79e-01 1.49e-01 1.5e+09
12 0.042 0.010 -0.032 1.567 6.45e-01 8.82e-02 3.5e+07
16 0.029 0.010 -0.019 1.585 5.06e-01 1.78e-01 3.6e+08
====================================================================================================
GRADIENT HEALTH ANALYSIS:
Depth 4: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
Depth 8: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
Depth 12: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
Depth 16: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
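
The κ column and the warnings above read as a gradient condition number. One common proxy, assumed here since the log does not show the exact definition, is the ratio of the largest to the smallest per-parameter gradient norm after a backward pass:

import torch

def gradient_condition_number(model: torch.nn.Module, floor: float = 1e-12) -> float:
    # Proxy condition number: max/min per-parameter gradient norm.
    # Call after loss.backward(); values above ~1e6 mean some parameters
    # receive updates many orders of magnitude smaller than others.
    norms = [p.grad.norm().item() for p in model.parameters() if p.grad is not None]
    if not norms:
        return float("nan")
    return max(norms) / max(min(norms), floor)

Under this reading, every vanilla model in the sweep sits well above the 1e6 threshold flagged above.
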
KEY OBSERVATIONS:
Vanilla 4→16 layers: -0.583 accuracy change
Lyapunov 4→16 layers: +0.000 accuracy change
✓ Lyapunov regularization enables better depth scaling!
Results saved to runs/depth_scaling_hinge/cifar100_20260101-112306
============================================================
Finished: Thu Jan 1 11:23:09 CST 2026
============================================================