============================================================
ASYMMETRIC Lyapunov Regularization
Job ID: 14632852 | Node: gpub079
Start: Wed Dec 31 10:16:37 CST 2025
============================================================
NVIDIA A40, 46068 MiB
============================================================
================================================================================
DEPTH SCALING BENCHMARK
================================================================================
Dataset: cifar100
Depths: [4, 8, 12, 16]
Timesteps: 4
Epochs: 150
λ_reg: 0.3, λ_target: -0.1
Reg type: asymmetric, Warmup epochs: 20
Device: cuda
================================================================================
Loading cifar100...
Classes: 100, Input: (3, 32, 32)
Train: 50000, Test: 10000
Depth configurations: [(4, '4×1'), (8, '4×2'), (12, '4×3'), (16, '4×4')]
Regularization type: asymmetric
Warmup epochs: 20
============================================================
Depth = 4 conv layers (4 stages × 1 blocks)
============================================================
Vanilla: depth=4, params=1,756,836
Epoch 10: train=0.498 test=0.438 σ=9.52e-01/3.57e-08
Epoch 20: train=0.631 test=0.506 σ=5.87e-01/2.45e-08
Epoch 30: train=0.706 test=0.562 σ=4.78e-01/1.99e-08
Epoch 40: train=0.756 test=0.554 σ=4.24e-01/1.76e-08
Epoch 50: train=0.800 test=0.583 σ=3.77e-01/1.55e-08
Epoch 60: train=0.833 test=0.593 σ=3.51e-01/1.39e-08
Epoch 70: train=0.863 test=0.591 σ=3.18e-01/1.26e-08
Epoch 80: train=0.883 test=0.596 σ=3.04e-01/1.21e-08
Epoch 90: train=0.907 test=0.601 σ=2.91e-01/1.12e-08
Epoch 100: train=0.922 test=0.615 σ=2.69e-01/1.04e-08
Epoch 110: train=0.933 test=0.614 σ=2.62e-01/9.69e-09
Epoch 120: train=0.942 test=0.615 σ=2.47e-01/9.10e-09
Epoch 130: train=0.950 test=0.620 σ=2.44e-01/8.88e-09
Epoch 140: train=0.951 test=0.623 σ=2.46e-01/8.88e-09
Epoch 150: train=0.952 test=0.618 σ=2.22e-01/8.47e-09
Best test acc: 0.624
Lyapunov: depth=4, params=1,756,836
Epoch 10: train=0.036 test=0.010 λ=1.488 σ=6.35e-01/1.45e-08
Epoch 20: train=0.009 test=0.010 λ=1.425 σ=1.08e-01/3.34e-15
Epoch 30: train=0.009 test=0.010 λ=1.451 σ=4.78e-02/9.92e-21
Epoch 40: train=0.010 test=0.010 λ=1.455 σ=1.57e-02/0.00e+00
Epoch 50: train=0.009 test=0.010 λ=1.476
Epoch 60: train=0.009 test=0.010 λ=1.473
Epoch 70: train=0.010 test=0.010 λ=1.476 σ=1.34e-03/0.00e+00
Epoch 80: train=0.009 test=0.010 λ=1.479
Epoch 90: train=0.009 test=0.010 λ=1.480
Epoch 100: train=0.009 test=0.010 λ=1.480
Epoch 110: train=0.009 test=0.010 λ=1.482
Epoch 120: train=0.009 test=0.010 λ=1.481
Epoch 130: train=0.009 test=0.010 λ=1.481
Epoch 140: train=0.009 test=0.010 λ=1.480
Epoch 150: train=0.010 test=0.010 λ=1.481
Best test acc: 0.089
============================================================
Depth = 8 conv layers (4 stages × 2 blocks)
============================================================
Vanilla: depth=8, params=4,892,196
Epoch 10: train=0.393 test=0.335 σ=7.83e-01/3.02e-08
Epoch 20: train=0.547 test=0.453 σ=4.71e-01/2.14e-08
Epoch 30: train=0.629 test=0.462 σ=3.73e-01/1.78e-08
Epoch 40: train=0.701 test=0.500 σ=3.33e-01/1.55e-08
Epoch 50: train=0.752 test=0.517 σ=3.06e-01/1.40e-08
Epoch 60: train=0.798 test=0.528 σ=2.76e-01/1.25e-08
Epoch 70: train=0.835 test=0.530 σ=2.70e-01/1.18e-08
Epoch 80: train=0.872 test=0.534 σ=2.51e-01/1.07e-08
Epoch 90: train=0.897 test=0.538 σ=2.38e-01/9.81e-09
Epoch 100: train=0.920 test=0.536 σ=2.26e-01/9.08e-09
Epoch 110: train=0.935 test=0.544 σ=2.06e-01/8.19e-09
Epoch 120: train=0.947 test=0.551 σ=2.11e-01/7.87e-09
Epoch 130: train=0.954 test=0.551 σ=2.02e-01/7.42e-09
Epoch 140: train=0.956 test=0.553 σ=1.92e-01/7.38e-09
Epoch 150: train=0.956 test=0.546 σ=2.05e-01/7.55e-09
Best test acc: 0.554
Lyapunov: depth=8, params=4,892,196
Epoch 10: train=0.031 test=0.010 λ=1.573 σ=4.25e-01/1.02e-08
Epoch 20: train=0.019 test=0.010 λ=1.580 σ=2.80e-01/1.31e-09
Epoch 30: train=0.010 test=0.010 λ=1.519 σ=1.87e-01/8.73e-13
Epoch 40: train=0.009 test=0.010 λ=1.532 σ=7.89e-02/1.22e-26
Epoch 50: train=0.010 test=0.010 λ=1.536 σ=6.60e-02/4.34e-19
Epoch 60: train=0.009 test=0.010 λ=1.542 σ=5.19e-02/0.00e+00
Epoch 70: train=0.010 test=0.010 λ=1.540 σ=2.67e-02/0.00e+00
Epoch 80: train=0.009 test=0.010 λ=1.544 σ=6.28e-03/0.00e+00
Epoch 90: train=0.009 test=0.010 λ=1.542 σ=4.08e-02/0.00e+00
Epoch 100: train=0.010 test=0.010 λ=1.543
Epoch 110: train=0.009 test=0.010 λ=1.543
Epoch 120: train=0.010 test=0.010 λ=1.543 σ=3.14e-03/0.00e+00
Epoch 130: train=0.009 test=0.010 λ=1.543
Epoch 140: train=0.010 test=0.010 λ=1.545
Epoch 150: train=0.010 test=0.010 λ=1.542
Best test acc: 0.020
============================================================
Depth = 12 conv layers (4 stages × 3 blocks)
============================================================
Vanilla: depth=12, params=8,027,556
Epoch 10: train=0.214 test=0.062 σ=6.26e-01/2.24e-08
Epoch 20: train=0.293 test=0.061 σ=3.27e-01/1.57e-08
Epoch 30: train=0.335 test=0.075 σ=2.66e-01/1.35e-08
Epoch 40: train=0.374 test=0.066 σ=2.32e-01/1.23e-08
Epoch 50: train=0.407 test=0.063 σ=2.29e-01/1.23e-08
Epoch 60: train=0.442 test=0.078 σ=2.18e-01/1.20e-08
Epoch 70: train=0.472 test=0.098 σ=2.30e-01/1.20e-08
Epoch 80: train=0.501 test=0.108 σ=2.21e-01/1.21e-08
Epoch 90: train=0.532 test=0.110 σ=2.24e-01/1.18e-08
Epoch 100: train=0.557 test=0.098 σ=2.25e-01/1.20e-08
Epoch 110: train=0.579 test=0.105 σ=2.26e-01/1.18e-08
Epoch 120: train=0.599 test=0.104 σ=2.31e-01/1.20e-08
Epoch 130: train=0.609 test=0.117 σ=2.28e-01/1.19e-08
Epoch 140: train=0.619 test=0.118 σ=2.23e-01/1.17e-08
Epoch 150: train=0.620 test=0.112 σ=2.30e-01/1.19e-08
Best test acc: 0.122
Lyapunov: depth=12, params=8,027,556
Epoch 10: train=0.017 test=0.010 λ=1.619 σ=3.10e-01/2.44e-12
Epoch 20: train=0.014 test=0.010 λ=1.620 σ=3.48e-01/1.80e-12
Epoch 30: train=0.010 test=0.010 λ=1.551 σ=1.95e-02/1.07e-16
Epoch 40: train=0.010 test=0.010 λ=1.556 σ=2.86e-02/0.00e+00
Epoch 50: train=0.010 test=0.010 λ=1.552 σ=1.02e-01/3.07e-15
Epoch 60: train=0.009 test=0.010 λ=1.560 σ=5.22e-02/0.00e+00
Epoch 70: train=0.009 test=0.010 λ=1.567 σ=3.97e-02/0.00e+00
Epoch 80: train=0.009 test=0.010 λ=1.564
Epoch 90: train=0.009 test=0.010 λ=1.570 σ=1.20e-02/0.00e+00
Epoch 100: train=0.009 test=0.010 λ=1.566
Epoch 110: train=0.009 test=0.010 λ=1.568 σ=1.35e-02/0.00e+00
Epoch 120: train=0.009 test=0.010 λ=1.566
Epoch 130: train=0.009 test=0.010 λ=1.568
Epoch 140: train=0.009 test=0.010 λ=1.566
Epoch 150: train=0.010 test=0.010 λ=1.565
Best test acc: 0.016
============================================================
Depth = 16 conv layers (4 stages × 4 blocks)
============================================================
Vanilla: depth=16, params=11,162,916
Epoch 10: train=0.091 test=0.011 σ=4.40e-01/1.32e-08
Epoch 20: train=0.133 test=0.015 σ=2.83e-01/1.07e-08
Epoch 30: train=0.156 test=0.018 σ=2.23e-01/9.48e-09
Epoch 40: train=0.176 test=0.018 σ=1.95e-01/9.15e-09
Epoch 50: train=0.191 test=0.021 σ=1.80e-01/8.93e-09
Epoch 60: train=0.204 test=0.022 σ=1.71e-01/8.85e-09
Epoch 70: train=0.218 test=0.028 σ=1.62e-01/9.03e-09
Epoch 80: train=0.227 test=0.025 σ=1.64e-01/8.92e-09
Epoch 90: train=0.238 test=0.028 σ=1.60e-01/9.11e-09
Epoch 100: train=0.249 test=0.028 σ=1.63e-01/9.28e-09
Epoch 110: train=0.257 test=0.031 σ=1.60e-01/9.29e-09
Epoch 120: train=0.265 test=0.027 σ=1.65e-01/9.32e-09
Epoch 130: train=0.270 test=0.028 σ=1.62e-01/9.29e-09
Epoch 140: train=0.273 test=0.026 σ=1.66e-01/9.44e-09
Epoch 150: train=0.274 test=0.026 σ=1.70e-01/9.40e-09
Best test acc: 0.033
Lyapunov: depth=16, params=11,162,916
Epoch 10: train=0.011 test=0.010 λ=1.693 σ=2.40e-01/9.58e-25
Epoch 20: train=0.009 test=0.010 λ=1.575 σ=1.05e-01/7.85e-12
Epoch 30: train=0.010 test=0.010 λ=1.590 σ=2.47e-01/2.27e-12
Epoch 40: train=0.009 test=0.010 λ=1.600 σ=4.94e-02/6.93e-15
Epoch 50: train=0.009 test=0.010 λ=1.594 σ=7.66e-02/2.70e-17
Epoch 60: train=0.010 test=0.010 λ=1.592 σ=5.57e-02/0.00e+00
Epoch 70: train=0.010 test=0.010 λ=1.595 σ=5.96e-02/0.00e+00
Epoch 80: train=0.010 test=0.010 λ=1.592 σ=4.24e-02/0.00e+00
Epoch 90: train=0.010 test=0.010 λ=1.594 σ=3.31e-02/0.00e+00
Epoch 100: train=0.011 test=0.010 λ=1.597 σ=3.36e-02/0.00e+00
Epoch 110: train=0.010 test=0.010 λ=1.599 σ=2.33e-02/0.00e+00
Epoch 120: train=0.010 test=0.010 λ=1.593
Epoch 130: train=0.010 test=0.010 λ=1.595 σ=1.07e-02/0.00e+00
Epoch 140: train=0.010 test=0.010 λ=1.593 σ=6.22e-03/0.00e+00
Epoch 150: train=0.010 test=0.010 λ=1.590
Best test acc: 0.011
====================================================================================================
DEPTH SCALING RESULTS: CIFAR100
====================================================================================================
Depth   Vanilla Acc   Lyapunov Acc   Δ Acc    Lyap λ   Van ∇norm   Lyap ∇norm   Van κ
----------------------------------------------------------------------------------------------------
4       0.618         0.010          -0.608   1.481    4.60e-01    8.81e-02     1.2e+09
8       0.546         0.010          -0.536   1.542    3.79e-01    1.58e-01     4.7e+09
12      0.112         0.010          -0.102   1.565    6.41e-01    9.04e-02     3.8e+07
16      0.026         0.010          -0.016   1.590    5.09e-01    2.32e+00     3.0e+07
====================================================================================================
GRADIENT HEALTH ANALYSIS:
Depth 4: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
Depth 8: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
Depth 12: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
Depth 16: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
KEY OBSERVATIONS:
Vanilla 4→16 layers: -0.592 accuracy change
Lyapunov 4→16 layers: +0.000 accuracy change
✓ Lyapunov regularization enables better depth scaling!
Results saved to runs/depth_scaling_asymm/cifar100_20260101-112330
============================================================
Finished: Thu Jan 1 11:23:32 CST 2026
============================================================