path: root/runs/slurm_logs/15112872_target1.out
============================================================
SMARTER TARGET Experiment (lambda_target=1.0)
Job ID: 15112872 | Node: gpub074
Start: Thu Jan  1 12:26:50 CST 2026
============================================================
NVIDIA A40, 46068 MiB
============================================================
================================================================================
DEPTH SCALING BENCHMARK
================================================================================
Dataset: cifar100
Depths: [4, 8, 12, 16]
Timesteps: 4
Epochs: 150
λ_reg: 0.1, λ_target: 1.0
Reg type: squared, Warmup epochs: 20
Device: cuda
================================================================================
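The config above reports a squared regularizer with λ_reg=0.1 pulling a Lyapunov-exponent estimate toward λ_target=1.0, ramped over 20 warmup epochs. A minimal sketch of what such a penalty term could look like, assuming a scalar Lyapunov estimate produced elsewhere and a linear warmup ramp (the log does not show the actual implementation, so the function name, signature, and ramp shape are all assumptions):

```python
def lyapunov_penalty(lyap_estimate, lam_target=1.0, lam_reg=0.1,
                     epoch=0, warmup_epochs=20):
    """Squared penalty pulling an estimated Lyapunov exponent toward
    lam_target, scaled by lam_reg and linearly ramped over the warmup
    epochs.  `lyap_estimate` is assumed to be a scalar computed
    elsewhere (e.g. from per-layer Jacobian log-norms); the estimator
    itself is not shown in this log.
    """
    warmup = min(1.0, (epoch + 1) / warmup_epochs)  # linear ramp 0 -> 1
    return warmup * lam_reg * (lyap_estimate - lam_target) ** 2
```

With these defaults, an estimate of λ=2.0 after warmup yields a penalty of 0.1·(2.0−1.0)² = 0.1, added to the task loss each step.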

Loading cifar100...
Classes: 100, Input: (3, 32, 32)
Train: 50000, Test: 10000

Depth configurations: [(4, '4×1'), (8, '4×2'), (12, '4×3'), (16, '4×4')]
Regularization type: squared
Warmup epochs: 20
Stable init: False

============================================================
Depth = 4 conv layers (4 stages × 1 block)
============================================================
    Vanilla: depth=4, params=1,756,836
      Epoch  10: train=0.498 test=0.436  σ=9.42e-01/3.51e-08
      Epoch  20: train=0.629 test=0.499  σ=5.87e-01/2.46e-08
      Epoch  30: train=0.701 test=0.550  σ=4.77e-01/2.01e-08
      Epoch  40: train=0.753 test=0.545  σ=4.17e-01/1.75e-08
      Epoch  50: train=0.798 test=0.577  σ=3.69e-01/1.53e-08
      Epoch  60: train=0.830 test=0.582  σ=3.33e-01/1.39e-08
      Epoch  70: train=0.862 test=0.580  σ=3.29e-01/1.26e-08
      Epoch  80: train=0.884 test=0.578  σ=3.07e-01/1.21e-08
      Epoch  90: train=0.907 test=0.598  σ=2.79e-01/1.08e-08
      Epoch 100: train=0.921 test=0.608  σ=2.72e-01/1.06e-08
      Epoch 110: train=0.935 test=0.608  σ=2.54e-01/9.36e-09
      Epoch 120: train=0.943 test=0.608  σ=2.46e-01/9.12e-09
      Epoch 130: train=0.949 test=0.615  σ=2.34e-01/8.83e-09
      Epoch 140: train=0.951 test=0.610  σ=2.32e-01/8.64e-09
      Epoch 150: train=0.954 test=0.614  σ=2.32e-01/8.63e-09
      Best test acc: 0.615
    Lyapunov: depth=4, params=1,756,836
      Epoch  10: train=0.285 test=0.024 λ=1.483 σ=7.62e-01/2.92e-08
      Epoch  20: train=0.344 test=0.014 λ=1.560 σ=4.80e-01/2.01e-08
      Epoch  30: train=0.348 test=0.012 λ=1.679 σ=3.85e-01/1.69e-08
      Epoch  40: train=0.389 test=0.013 λ=1.635 σ=3.33e-01/1.54e-08
      Epoch  50: train=0.431 test=0.011 λ=1.635 σ=3.11e-01/1.45e-08
      Epoch  60: train=0.461 test=0.016 λ=1.622 σ=2.95e-01/1.44e-08
      Epoch  70: train=0.478 test=0.014 λ=1.660 σ=2.90e-01/1.40e-08
      Epoch  80: train=0.499 test=0.013 λ=1.657 σ=2.82e-01/1.40e-08
      Epoch  90: train=0.522 test=0.013 λ=1.663 σ=2.78e-01/1.36e-08
      Epoch 100: train=0.537 test=0.015 λ=1.678 σ=2.78e-01/1.36e-08
      Epoch 110: train=0.550 test=0.014 λ=1.684 σ=2.92e-01/1.38e-08
      Epoch 120: train=0.559 test=0.016 λ=1.704 σ=2.90e-01/1.42e-08
      Epoch 130: train=0.570 test=0.018 λ=1.709 σ=2.79e-01/1.36e-08
      Epoch 140: train=0.571 test=0.017 λ=1.865 σ=2.83e-01/1.37e-08
      Epoch 150: train=0.576 test=0.017 λ=1.816 σ=2.82e-01/1.37e-08
      Best test acc: 0.212

============================================================
Depth = 8 conv layers (4 stages × 2 blocks)
============================================================
    Vanilla: depth=8, params=4,892,196
      Epoch  10: train=0.388 test=0.351  σ=8.72e-01/3.16e-08
      Epoch  20: train=0.544 test=0.427  σ=4.73e-01/2.16e-08
      Epoch  30: train=0.630 test=0.466  σ=3.81e-01/1.79e-08
      Epoch  40: train=0.698 test=0.502  σ=3.22e-01/1.53e-08
      Epoch  50: train=0.747 test=0.519  σ=3.08e-01/1.42e-08
      Epoch  60: train=0.799 test=0.515  σ=2.87e-01/1.29e-08
      Epoch  70: train=0.836 test=0.524  σ=2.76e-01/1.18e-08
      Epoch  80: train=0.869 test=0.534  σ=2.44e-01/1.05e-08
      Epoch  90: train=0.898 test=0.528  σ=2.39e-01/9.52e-09
      Epoch 100: train=0.918 test=0.527  σ=2.29e-01/8.96e-09
      Epoch 110: train=0.933 test=0.542  σ=2.26e-01/8.58e-09
      Epoch 120: train=0.943 test=0.542  σ=2.09e-01/7.88e-09
      Epoch 130: train=0.951 test=0.545  σ=1.97e-01/7.80e-09
      Epoch 140: train=0.955 test=0.542  σ=2.06e-01/7.61e-09
      Epoch 150: train=0.954 test=0.535  σ=1.94e-01/7.46e-09
      Best test acc: 0.550
    Lyapunov: depth=8, params=4,892,196
      Epoch  10: train=0.035 test=0.010 λ=1.583 σ=3.25e-01/9.17e-09
      Epoch  20: train=0.049 test=0.010 λ=1.574 σ=2.46e-01/6.77e-09
      Epoch  30: train=0.061 test=0.010 λ=1.571 σ=2.03e-01/5.87e-09
      Epoch  40: train=0.033 test=0.010 λ=1.544 σ=1.80e-01/2.89e-09
      Epoch  50: train=0.030 test=0.010 λ=1.550 σ=1.59e-01/9.89e-10
      Epoch  60: train=0.030 test=0.010 λ=1.567 σ=1.39e-01/5.26e-10
      Epoch  70: train=0.029 test=0.010 λ=1.571 σ=1.16e-01/1.53e-10
      Epoch  80: train=0.041 test=0.010 λ=1.646 σ=1.41e-01/3.16e-09
      Epoch  90: train=0.036 test=0.010 λ=1.808 σ=1.37e-01/2.76e-09
      Epoch 100: train=0.031 test=0.010 λ=1.940 σ=1.71e-01/2.97e-09
      Epoch 110: train=0.047 test=0.010 λ=1.976 σ=1.42e-01/3.22e-09
      Epoch 120: train=0.047 test=0.008 λ=1.993 σ=1.26e-01/3.43e-09
      Epoch 130: train=0.046 test=0.010 λ=2.057 σ=1.50e-01/3.50e-09
      Epoch 140: train=0.026 test=0.010 λ=2.014 σ=2.43e-01/3.40e-09
      Epoch 150: train=0.031 test=0.010 λ=2.334 σ=1.30e-01/4.11e-09
      Best test acc: 0.024

============================================================
Depth = 12 conv layers (4 stages × 3 blocks)
============================================================
    Vanilla: depth=12, params=8,027,556
      Epoch  10: train=0.214 test=0.070  σ=5.69e-01/2.19e-08
      Epoch  20: train=0.289 test=0.057  σ=3.32e-01/1.63e-08
      Epoch  30: train=0.340 test=0.107  σ=2.62e-01/1.36e-08
      Epoch  40: train=0.374 test=0.083  σ=2.35e-01/1.28e-08
      Epoch  50: train=0.410 test=0.073  σ=2.28e-01/1.25e-08
      Epoch  60: train=0.436 test=0.101  σ=2.23e-01/1.23e-08
      Epoch  70: train=0.473 test=0.087  σ=2.33e-01/1.22e-08
      Epoch  80: train=0.505 test=0.083  σ=2.21e-01/1.22e-08
      Epoch  90: train=0.534 test=0.090  σ=2.24e-01/1.21e-08
      Epoch 100: train=0.561 test=0.096  σ=2.29e-01/1.23e-08
      Epoch 110: train=0.584 test=0.074  σ=2.30e-01/1.21e-08
      Epoch 120: train=0.602 test=0.088  σ=2.35e-01/1.22e-08
      Epoch 130: train=0.609 test=0.093  σ=2.30e-01/1.20e-08
      Epoch 140: train=0.620 test=0.094  σ=2.27e-01/1.19e-08
      Epoch 150: train=0.624 test=0.086  σ=2.31e-01/1.22e-08
      Best test acc: 0.109
    Lyapunov: depth=12, params=8,027,556
      Epoch  10: train=0.013 test=0.010 λ=1.639 σ=3.64e-01/1.06e-12
      Epoch  20: train=0.015 test=0.010 λ=1.598 σ=2.98e-01/1.14e-12
      Epoch  30: train=0.019 test=0.010 λ=1.630 σ=3.30e-01/3.22e-12
      Epoch  40: train=0.021 test=0.010 λ=1.592 σ=1.82e-01/2.19e-12
      Epoch  50: train=0.020 test=0.010 λ=1.658 σ=1.51e-01/2.96e-12
      Epoch  60: train=0.015 test=0.010 λ=1.616 σ=1.03e-01/2.55e-13
      Epoch  70: train=0.018 test=0.010 λ=1.617 σ=1.18e-01/4.53e-13
      Epoch  80: train=0.020 test=0.010 λ=1.636 σ=1.22e-01/4.69e-12
      Epoch  90: train=0.021 test=0.010 λ=1.593 σ=1.05e-01/7.58e-12
      Epoch 100: train=0.026 test=0.010 λ=1.593 σ=1.16e-01/7.00e-10
      Epoch 110: train=0.021 test=0.010 λ=1.590 σ=9.46e-02/4.97e-12
      Epoch 120: train=0.024 test=0.010 λ=1.740 σ=9.83e-02/3.89e-11
      Epoch 130: train=0.020 test=0.010 λ=1.901 σ=1.09e-01/9.81e-11
      Epoch 140: train=0.027 test=0.010 λ=1.972 σ=1.21e-01/1.96e-09
      Epoch 150: train=0.019 test=0.010 λ=2.112 σ=6.82e-02/1.40e-11
      Best test acc: 0.019

============================================================
Depth = 16 conv layers (4 stages × 4 blocks)
============================================================
    Vanilla: depth=16, params=11,162,916
      Epoch  10: train=0.091 test=0.011  σ=4.40e-01/1.32e-08
      Epoch  20: train=0.135 test=0.014  σ=2.84e-01/1.06e-08
      Epoch  30: train=0.157 test=0.017  σ=2.21e-01/9.39e-09
      Epoch  40: train=0.177 test=0.019  σ=1.93e-01/8.97e-09
      Epoch  50: train=0.191 test=0.024  σ=1.81e-01/9.00e-09
      Epoch  60: train=0.202 test=0.024  σ=1.74e-01/8.89e-09
      Epoch  70: train=0.215 test=0.026  σ=1.66e-01/8.97e-09
      Epoch  80: train=0.227 test=0.030  σ=1.65e-01/8.89e-09
      Epoch  90: train=0.240 test=0.026  σ=1.55e-01/8.93e-09
      Epoch 100: train=0.248 test=0.028  σ=1.61e-01/9.09e-09
      Epoch 110: train=0.254 test=0.031  σ=1.59e-01/9.18e-09
      Epoch 120: train=0.262 test=0.033  σ=1.64e-01/9.30e-09
      Epoch 130: train=0.267 test=0.033  σ=1.58e-01/9.22e-09
      Epoch 140: train=0.269 test=0.030  σ=1.61e-01/9.32e-09
      Epoch 150: train=0.269 test=0.030  σ=1.64e-01/9.32e-09
      Best test acc: 0.036
    Lyapunov: depth=16, params=11,162,916
      Epoch  10: train=0.014 test=0.010 λ=1.688 σ=3.64e-01/6.52e-13
      Epoch  20: train=0.010 test=0.010 λ=1.803 σ=3.56e-01/1.53e-13
      Epoch  30: train=0.012 test=0.002 λ=1.664 σ=4.44e-01/4.25e-14
      Epoch  40: train=0.018 test=0.010 λ=1.733 σ=2.04e-01/9.43e-13
      Epoch  50: train=0.010 test=0.010 λ=1.598 σ=1.55e-01/0.00e+00
      Epoch  60: train=0.010 test=0.010 λ=1.594 σ=4.41e-02/0.00e+00
      Epoch  70: train=0.009 test=0.010 λ=1.601 σ=7.50e-02/1.52e-14
      Epoch  80: train=0.012 test=0.010 λ=2.219 σ=5.78e-02/1.54e-42
      Epoch  90: train=0.016 test=0.010 λ=2.122 σ=9.91e-02/9.49e-15
      Epoch 100: train=0.017 test=0.010 λ=2.163 σ=1.05e-01/7.08e-13
      Epoch 110: train=0.020 test=0.010 λ=2.124 σ=1.10e-01/1.17e-12
      Epoch 120: train=0.010 test=0.010 λ=2.181 σ=6.50e-02/6.09e-15
      Epoch 130: train=0.010 test=0.010 λ=2.755 σ=6.77e-09/1.45e-20
      Epoch 140: train=0.016 test=0.010 λ=2.217 σ=1.13e-01/5.35e-14
      Epoch 150: train=0.018 test=0.010 λ=2.219 σ=1.21e-01/7.28e-14
      Best test acc: 0.012

====================================================================================================
DEPTH SCALING RESULTS: CIFAR100
====================================================================================================
Depth    Vanilla Acc  Lyapunov Acc Δ Acc    Lyap λ     Van ∇norm    Lyap ∇norm   Van κ     
----------------------------------------------------------------------------------------------------
4        0.614        0.017        -0.597   1.816      4.52e-01     7.31e-01     1.1e+09   
8        0.535        0.010        -0.525   2.334      3.87e-01     3.44e-01     8.0e+08   
12       0.086        0.010        -0.076   2.112      6.47e-01     2.28e+00     4.9e+07   
16       0.030        0.010        -0.020   2.219      5.07e-01     9.41e-01     2.1e+07   
====================================================================================================

GRADIENT HEALTH ANALYSIS:
  Depth 4: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
  Depth 8: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
  Depth 12: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
  Depth 16: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)


KEY OBSERVATIONS:
  Vanilla  4→16 layers: -0.584 accuracy change
  Lyapunov 4→16 layers: -0.007 accuracy change
  ⚠ Caveat: the smaller Lyapunov drop does not indicate better depth scaling here — Lyapunov test accuracy sits at or near chance (0.010 ≈ 1/100 classes) at every depth, so there is almost no accuracy left to lose. The script's automated "✓ Lyapunov regularization enables better depth scaling!" verdict is an artifact of this floor effect.

Results saved to runs/depth_scaling_target1/cifar100_20260102-133339
============================================================
Finished: Fri Jan  2 13:33:43 CST 2026
============================================================