path: root/runs/slurm_logs/15112873_stable_init.out
============================================================
STABLE INITIALIZATION Experiment
Job ID: 15112873 | Node: gpub011
Start: Thu Jan  1 12:26:50 CST 2026
============================================================
NVIDIA A40, 46068 MiB
============================================================
================================================================================
DEPTH SCALING BENCHMARK
================================================================================
Dataset: cifar100
Depths: [4, 8, 12, 16]
Timesteps: 4
Epochs: 150
λ_reg: 0.1, λ_target: -0.1
Reg type: squared, Warmup epochs: 20
Device: cuda
================================================================================
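The header above lists a squared regularizer with λ_reg = 0.1, λ_target = -0.1 and 20 warmup epochs, but the log does not show how these combine. A minimal sketch of one plausible form follows. The function name `lyapunov_penalty`, the linear warmup ramp, and the exact formula are assumptions for illustration, not taken from the experiment's code.

```python
def lyapunov_penalty(lyap_est, epoch, lam_reg=0.1, lam_target=-0.1, warmup_epochs=20):
    """Squared penalty pulling an estimated Lyapunov exponent toward lam_target.

    Assumption: the regularization weight ramps linearly from 0 to lam_reg
    over the first `warmup_epochs` epochs. The real schedule may differ.
    """
    ramp = min(1.0, epoch / warmup_epochs) if warmup_epochs > 0 else 1.0
    return lam_reg * ramp * (lyap_est - lam_target) ** 2


# Penalty for the depth-4 Lyapunov run's final logged exponent (λ=1.473).
print(lyapunov_penalty(1.473, epoch=150))
```

Under this assumed form, an exponent that stays near 1.5 while the target is -0.1 keeps the penalty term large for the whole run.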

Loading cifar100...
Classes: 100, Input: (3, 32, 32)
Train: 50000, Test: 10000

Depth configurations: [(4, '4×1'), (8, '4×2'), (12, '4×3'), (16, '4×4')]
Regularization type: squared
Warmup epochs: 20
Stable init: True
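The depth configurations above pair each total depth with a "stages × blocks" label. A short sketch of how such a list could be derived, assuming a fixed count of 4 stages and a depth that divides evenly among them (the helper name `depth_configs` is hypothetical):

```python
def depth_configs(depths, num_stages=4):
    """Map each total conv depth to a (depth, 'stages×blocks') pair."""
    configs = []
    for depth in depths:
        if depth % num_stages != 0:
            raise ValueError(f"depth {depth} is not divisible by {num_stages} stages")
        blocks_per_stage = depth // num_stages
        configs.append((depth, f"{num_stages}×{blocks_per_stage}"))
    return configs


print(depth_configs([4, 8, 12, 16]))
# [(4, '4×1'), (8, '4×2'), (12, '4×3'), (16, '4×4')]
```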

============================================================
Depth = 4 conv layers (4 stages × 1 blocks)
============================================================
    Vanilla: depth=4, params=1,756,836
      Epoch  10: train=0.516 test=0.431  σ=9.10e-01/3.50e-08
      Epoch  20: train=0.640 test=0.517  σ=5.84e-01/2.47e-08
      Epoch  30: train=0.712 test=0.558  σ=4.82e-01/2.04e-08
      Epoch  40: train=0.761 test=0.558  σ=4.07e-01/1.72e-08
      Epoch  50: train=0.800 test=0.577  σ=3.76e-01/1.54e-08
      Epoch  60: train=0.837 test=0.581  σ=3.34e-01/1.38e-08
      Epoch  70: train=0.864 test=0.579  σ=3.25e-01/1.29e-08
      Epoch  80: train=0.888 test=0.592  σ=2.91e-01/1.17e-08
      Epoch  90: train=0.907 test=0.602  σ=2.89e-01/1.10e-08
      Epoch 100: train=0.921 test=0.604  σ=2.71e-01/1.05e-08
      Epoch 110: train=0.935 test=0.606  σ=2.64e-01/9.89e-09
      Epoch 120: train=0.943 test=0.617  σ=2.46e-01/9.41e-09
      Epoch 130: train=0.950 test=0.615  σ=2.45e-01/8.86e-09
      Epoch 140: train=0.951 test=0.615  σ=2.29e-01/8.67e-09
      Epoch 150: train=0.953 test=0.615  σ=2.36e-01/8.51e-09
      Best test acc: 0.620
    Lyapunov: depth=4, params=1,756,836
      Epoch  10: train=0.195 test=0.014 λ=1.549 σ=7.07e-01/2.57e-08
      Epoch  20: train=0.135 test=0.012 λ=1.570 σ=4.14e-01/1.49e-08
      Epoch  30: train=0.057 test=0.010 λ=1.488 σ=2.46e-01/6.99e-09
      Epoch  40: train=0.067 test=0.010 λ=1.481 σ=2.03e-01/6.20e-09
      Epoch  50: train=0.048 test=0.010 λ=1.877 σ=1.80e-01/4.00e-09
      Epoch  60: train=0.009 test=0.010 λ=1.462 σ=4.58e-02/0.00e+00
      Epoch  70: train=0.010 test=0.010 λ=1.467 σ=3.57e-02/0.00e+00
      Epoch  80: train=0.010 test=0.010 λ=1.471 σ=1.33e-02/0.00e+00
      Epoch  90: train=0.009 test=0.010 λ=1.471 σ=4.82e-03/0.00e+00
      Epoch 100: train=0.009 test=0.010 λ=1.471 σ=1.18e-03/0.00e+00
      Epoch 110: train=0.009 test=0.010 λ=1.471 σ=4.32e-03/0.00e+00
      Epoch 120: train=0.009 test=0.010 λ=1.472 
      Epoch 130: train=0.010 test=0.010 λ=1.472 
      Epoch 140: train=0.010 test=0.010 λ=1.471 
      Epoch 150: train=0.010 test=0.010 λ=1.473 
      Best test acc: 0.106
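The per-epoch lines follow a regular layout, with λ present only for Lyapunov runs and σ missing on some lines. A hedged sketch of a parser for post-processing this log is shown below. The regex and the names `parse_epoch_line`, `sigma_a`, and `sigma_b` are illustrative assumptions, and the log itself does not define what the two σ values mean.

```python
import re

# λ and σ are optional: Vanilla lines have no λ, and some lines omit σ entirely.
EPOCH_RE = re.compile(
    r"Epoch\s+(?P<epoch>\d+):\s+train=(?P<train>[\d.]+)\s+test=(?P<test>[\d.]+)"
    r"(?:\s+λ=(?P<lam>-?[\d.]+))?"
    r"(?:\s+σ=(?P<sigma_a>[\d.eE+-]+)/(?P<sigma_b>[\d.eE+-]+))?"
)


def parse_epoch_line(line):
    """Return a dict of the logged fields, or None if the line is not an epoch line."""
    m = EPOCH_RE.search(line)
    if m is None:
        return None
    g = m.groupdict()
    return {
        "epoch": int(g["epoch"]),
        "train": float(g["train"]),
        "test": float(g["test"]),
        "lam": float(g["lam"]) if g["lam"] is not None else None,
        "sigma": (float(g["sigma_a"]), float(g["sigma_b"])) if g["sigma_a"] is not None else None,
    }


print(parse_epoch_line("      Epoch  10: train=0.516 test=0.431  σ=9.10e-01/3.50e-08"))
print(parse_epoch_line("      Epoch 120: train=0.009 test=0.010 λ=1.472 "))
```

Returning `None` for a missing σ keeps those epochs distinguishable from epochs where a value of 0.00e+00 was actually logged.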

============================================================
Depth = 8 conv layers (4 stages × 2 blocks)
============================================================
    Vanilla: depth=8, params=4,892,196
      Epoch  10: train=0.451 test=0.402  σ=7.31e-01/2.99e-08
      Epoch  20: train=0.587 test=0.471  σ=4.69e-01/2.12e-08
      Epoch  30: train=0.666 test=0.493  σ=3.81e-01/1.75e-08
      Epoch  40: train=0.728 test=0.505  σ=3.27e-01/1.53e-08
      Epoch  50: train=0.774 test=0.533  σ=3.18e-01/1.40e-08
      Epoch  60: train=0.812 test=0.521  σ=2.93e-01/1.28e-08
      Epoch  70: train=0.852 test=0.547  σ=2.81e-01/1.17e-08
      Epoch  80: train=0.884 test=0.531  σ=2.48e-01/1.02e-08
      Epoch  90: train=0.906 test=0.537  σ=2.35e-01/9.46e-09
      Epoch 100: train=0.927 test=0.553  σ=2.24e-01/8.84e-09
      Epoch 110: train=0.941 test=0.552  σ=2.09e-01/8.04e-09
      Epoch 120: train=0.951 test=0.553  σ=2.09e-01/7.55e-09
      Epoch 130: train=0.959 test=0.553  σ=2.10e-01/7.39e-09
      Epoch 140: train=0.959 test=0.561  σ=1.95e-01/7.19e-09
      Epoch 150: train=0.961 test=0.551  σ=1.94e-01/6.97e-09
      Best test acc: 0.564
    Lyapunov: depth=8, params=4,892,196
      Epoch  10: train=0.046 test=0.010 λ=1.543 σ=3.90e-01/9.92e-09
      Epoch  20: train=0.038 test=0.010 λ=1.533 σ=2.42e-01/4.88e-09
      Epoch  30: train=0.038 test=0.010 λ=1.623 σ=1.93e-01/3.39e-09
      Epoch  40: train=0.028 test=0.010 λ=1.706 σ=1.66e-01/2.06e-09
      Epoch  50: train=0.009 test=0.010 λ=1.532 σ=7.89e-02/1.54e-17
      Epoch  60: train=0.010 test=0.010 λ=1.540 σ=4.28e-02/5.11e-27
      Epoch  70: train=0.009 test=0.010 λ=1.544 σ=4.22e-02/0.00e+00
      Epoch  80: train=0.010 test=0.010 λ=1.548 σ=3.81e-02/0.00e+00
      Epoch  90: train=0.011 test=0.010 λ=1.554 σ=3.03e-02/0.00e+00
      Epoch 100: train=0.010 test=0.010 λ=1.549 σ=9.40e-03/0.00e+00
      Epoch 110: train=0.010 test=0.010 λ=1.549 σ=5.91e-03/0.00e+00
      Epoch 120: train=0.010 test=0.010 λ=1.548 σ=3.83e-03/0.00e+00
      Epoch 130: train=0.010 test=0.010 λ=1.549 σ=7.81e-03/0.00e+00
      Epoch 140: train=0.010 test=0.010 λ=1.549 σ=1.37e-02/0.00e+00
      Epoch 150: train=0.010 test=0.010 λ=1.546 σ=8.69e-03/0.00e+00
      Best test acc: 0.021

============================================================
Depth = 12 conv layers (4 stages × 3 blocks)
============================================================
    Vanilla: depth=12, params=8,027,556
      Epoch  10: train=0.253 test=0.046  σ=4.96e-01/2.03e-08
      Epoch  20: train=0.322 test=0.044  σ=3.35e-01/1.58e-08
      Epoch  30: train=0.364 test=0.054  σ=2.77e-01/1.38e-08
      Epoch  40: train=0.404 test=0.046  σ=2.49e-01/1.30e-08
      Epoch  50: train=0.439 test=0.062  σ=2.30e-01/1.24e-08
      Epoch  60: train=0.469 test=0.040  σ=2.30e-01/1.24e-08
      Epoch  70: train=0.498 test=0.054  σ=2.35e-01/1.21e-08
      Epoch  80: train=0.532 test=0.058  σ=2.26e-01/1.20e-08
      Epoch  90: train=0.565 test=0.072  σ=2.26e-01/1.18e-08
      Epoch 100: train=0.276 test=0.099  σ=1.92e-01/1.10e-08
      Epoch 110: train=0.409 test=0.123  σ=2.13e-01/1.20e-08
      Epoch 120: train=0.470 test=0.124  σ=2.27e-01/1.20e-08
      Epoch 130: train=0.495 test=0.146  σ=2.19e-01/1.22e-08
      Epoch 140: train=0.510 test=0.138  σ=2.15e-01/1.17e-08
      Epoch 150: train=0.512 test=0.118  σ=2.18e-01/1.17e-08
      Best test acc: 0.146
    Lyapunov: depth=12, params=8,027,556
      Epoch  10: train=0.011 test=0.010 λ=1.563 σ=5.46e-01/7.17e-09
      Epoch  20: train=0.010 test=0.010 λ=1.556 σ=8.74e-02/8.70e-15
      Epoch  30: train=0.010 test=0.010 λ=1.554 σ=9.58e-02/3.05e-15
      Epoch  40: train=0.009 test=0.010 λ=1.566 σ=6.06e-02/2.31e-34
      Epoch  50: train=0.010 test=0.010 λ=1.566 σ=3.46e-02/0.00e+00
      Epoch  60: train=0.009 test=0.010 λ=1.573 σ=4.50e-02/0.00e+00
      Epoch  70: train=0.010 test=0.010 λ=1.572 σ=1.34e-02/0.00e+00
      Epoch  80: train=0.009 test=0.010 λ=1.575 σ=6.32e-04/0.00e+00
      Epoch  90: train=0.009 test=0.010 λ=1.576 σ=5.51e-02/0.00e+00
      Epoch 100: train=0.010 test=0.010 λ=1.579 σ=2.74e-02/0.00e+00
      Epoch 110: train=0.009 test=0.010 λ=1.575 σ=2.56e-02/0.00e+00
      Epoch 120: train=0.010 test=0.010 λ=1.576 σ=3.61e-02/0.00e+00
      Epoch 130: train=0.010 test=0.010 λ=1.576 
      Epoch 140: train=0.010 test=0.010 λ=1.574 σ=5.40e-03/0.00e+00
      Epoch 150: train=0.010 test=0.010 λ=1.569 
      Best test acc: 0.011

============================================================
Depth = 16 conv layers (4 stages × 4 blocks)
============================================================
    Vanilla: depth=16, params=11,162,916
      Epoch  10: train=0.120 test=0.020  σ=4.06e-01/1.45e-08
      Epoch  20: train=0.158 test=0.011  σ=2.71e-01/1.13e-08
      Epoch  30: train=0.182 test=0.016  σ=2.16e-01/1.00e-08
      Epoch  40: train=0.203 test=0.029  σ=2.01e-01/9.74e-09
      Epoch  50: train=0.220 test=0.025  σ=1.83e-01/9.59e-09
      Epoch  60: train=0.237 test=0.025  σ=1.78e-01/9.64e-09
      Epoch  70: train=0.250 test=0.029  σ=1.67e-01/9.64e-09
      Epoch  80: train=0.259 test=0.026  σ=1.65e-01/9.31e-09
      Epoch  90: train=0.273 test=0.022  σ=1.63e-01/9.65e-09
      Epoch 100: train=0.229 test=0.019  σ=1.52e-01/9.12e-09
      Epoch 110: train=0.256 test=0.024  σ=1.54e-01/9.41e-09
      Epoch 120: train=0.266 test=0.025  σ=1.60e-01/9.49e-09
      Epoch 130: train=0.277 test=0.025  σ=1.57e-01/9.48e-09
      Epoch 140: train=0.283 test=0.025  σ=1.61e-01/9.66e-09
      Epoch 150: train=0.283 test=0.024  σ=1.63e-01/9.63e-09
      Best test acc: 0.036
    Lyapunov: depth=16, params=11,162,916
      Epoch  10: train=0.011 test=0.010 λ=1.695 σ=3.65e-01/1.28e-13
      Epoch  20: train=0.011 test=0.010 λ=1.668 σ=3.46e-01/1.58e-14
      Epoch  30: train=0.011 test=0.010 λ=1.632 σ=1.93e-01/2.02e-20
      Epoch  40: train=0.009 test=0.010 λ=1.610 σ=2.17e-01/1.62e-12
      Epoch  50: train=0.010 test=0.010 λ=1.620 σ=1.54e-01/1.56e-15
      Epoch  60: train=0.011 test=0.010 λ=1.621 σ=5.15e-02/0.00e+00
      Epoch  70: train=0.009 test=0.010 λ=1.606 σ=1.16e-02/0.00e+00
      Epoch  80: train=0.009 test=0.010 λ=1.605 σ=1.80e-02/0.00e+00
      Epoch  90: train=0.009 test=0.010 λ=1.609 
      Epoch 100: train=0.009 test=0.010 λ=1.618 σ=5.85e-04/0.00e+00
      Epoch 110: train=0.009 test=0.010 λ=1.610 σ=5.90e-04/0.00e+00
      Epoch 120: train=0.009 test=0.010 λ=1.608 
      Epoch 130: train=0.009 test=0.010 λ=1.603 
      Epoch 140: train=0.010 test=0.010 λ=1.606 
      Epoch 150: train=0.010 test=0.010 λ=1.596 
      Best test acc: 0.016

====================================================================================================
DEPTH SCALING RESULTS: CIFAR100
====================================================================================================
Depth    Vanilla Acc  Lyapunov Acc Δ Acc    Lyap λ     Van ∇norm    Lyap ∇norm   Van κ     
----------------------------------------------------------------------------------------------------
4        0.615        0.010        -0.605   1.473      4.63e-01     8.84e-02     3.7e+08   
8        0.551        0.010        -0.541   1.546      3.64e-01     1.64e-01     2.7e+08   
12       0.118        0.010        -0.108   1.569      6.43e-01     6.98e-01     4.1e+07   
16       0.024        0.010        -0.014   1.596      5.19e-01     3.22e-01     2.7e+07   
====================================================================================================
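The Δ Acc column in the table above is consistent with Lyapunov accuracy minus Vanilla accuracy, using the final-epoch test accuracy rather than the best one. A small sketch that recomputes it from the tabulated values (the variable names are illustrative):

```python
# (depth, vanilla_acc, lyapunov_acc) copied from the results table.
rows = [
    (4, 0.615, 0.010),
    (8, 0.551, 0.010),
    (12, 0.118, 0.010),
    (16, 0.024, 0.010),
]

# Δ Acc = Lyapunov Acc - Vanilla Acc, rounded to the table's three decimals.
deltas = {depth: round(lyap - van, 3) for depth, van, lyap in rows}

for depth, delta in deltas.items():
    print(f"{depth:<8} {delta:+.3f}")
```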

GRADIENT HEALTH ANALYSIS:
  Depth 4: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
  Depth 8: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
  Depth 12: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
  Depth 16: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
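The warnings above apply a fixed threshold of 1e6 to the Van κ column. The check can be sketched as follows, assuming only that κ is the value reported in the results table. How κ is computed is not shown in the log, and `flag_ill_conditioned` is a hypothetical helper.

```python
def flag_ill_conditioned(kappas, threshold=1e6):
    """Return the depths whose reported κ exceeds the threshold."""
    return [depth for depth, kappa in kappas.items() if kappa > threshold]


# Van κ values copied from the results table.
van_kappa = {4: 3.7e8, 8: 2.7e8, 12: 4.1e7, 16: 2.7e7}
print(flag_ill_conditioned(van_kappa))
# [4, 8, 12, 16]
```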


KEY OBSERVATIONS:
  Vanilla  4→16 layers: -0.591 accuracy change
  Lyapunov 4→16 layers: +0.000 accuracy change
  ⚠️ Lyapunov accuracy is at chance level (0.010 for 100 classes) at every depth, so the +0.000 change reflects no learning rather than better depth scaling
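The accuracy changes above can be recomputed from the results table, and a flat change is easier to interpret next to the chance level for 100 classes (1/100 = 0.010). The sketch below does both. The helper names and the 0.005 tolerance are assumptions for illustration, not part of the benchmark script.

```python
NUM_CLASSES = 100
CHANCE = 1.0 / NUM_CLASSES  # 0.010 for CIFAR-100

# Final-epoch test accuracy per depth, copied from the results table.
vanilla = {4: 0.615, 8: 0.551, 12: 0.118, 16: 0.024}
lyapunov = {4: 0.010, 8: 0.010, 12: 0.010, 16: 0.010}


def depth_change(acc, shallow=4, deep=16):
    """Accuracy change going from the shallow to the deep model."""
    return round(acc[deep] - acc[shallow], 3)


def at_chance(acc, tol=0.005):
    """True if every depth is within `tol` of chance (tolerance is an assumption)."""
    return all(abs(a - CHANCE) <= tol for a in acc.values())


for name, acc in [("Vanilla", vanilla), ("Lyapunov", lyapunov)]:
    note = " (at chance at every depth)" if at_chance(acc) else ""
    print(f"{name:<8} 4→16 layers: {depth_change(acc):+.3f}{note}")
```

A zero change only indicates good depth scaling when the shallow model is itself well above chance, which is why the two checks are reported together.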

Results saved to runs/depth_scaling_stable_init/cifar100_20260102-133755
============================================================
Finished: Fri Jan  2 13:37:56 CST 2026
============================================================