============================================================
WEAK REGULARIZATION Experiment (lambda_reg=0.01)
Job ID: 15112871 | Node: gpub023
Start: Thu Jan  1 12:26:50 CST 2026
============================================================
NVIDIA A40, 46068 MiB
============================================================
================================================================================
DEPTH SCALING BENCHMARK
================================================================================
Dataset: cifar100
Depths: [4, 8, 12, 16]
Timesteps: 4
Epochs: 150
λ_reg: 0.01, λ_target: -0.1
Reg type: squared, Warmup epochs: 20
Device: cuda
================================================================================

Loading cifar100...
Classes: 100, Input: (3, 32, 32)
Train: 50000, Test: 10000

Depth configurations: [(4, '4×1'), (8, '4×2'), (12, '4×3'), (16, '4×4')]
Regularization type: squared
Warmup epochs: 20
Stable init: False
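
Note: the headers above fix λ_reg=0.01, λ_target=-0.1, a squared penalty, and a
20-epoch warmup, but the log does not show the regularizer itself. A minimal
sketch of what such a Lyapunov penalty could look like in PyTorch; the exponent
estimator and every name below are assumptions, not the script's actual code:

    import torch

    def estimate_lyapunov(blocks, x, eps=1e-3):
        # Crude maximal-Lyapunov estimate: average log growth rate of a small
        # perturbation pushed through the residual blocks (an assumption; the
        # estimator behind the per-epoch λ printed below is not in this log).
        h, h_pert = x, x + eps * torch.randn_like(x)
        log_growth = []
        for block in blocks:
            h, h_pert = block(h), block(h_pert)
            # per-sample divergence between the two trajectories
            diff = (h_pert - h).flatten(1).norm(dim=1).clamp_min(1e-12)
            log_growth.append(torch.log(diff / eps).mean())
            # renormalize so the perturbation stays infinitesimal
            h_pert = h + eps * (h_pert - h) / diff.view(-1, *([1] * (h.dim() - 1)))
        return torch.stack(log_growth).mean()

    def lyapunov_penalty(lam, epoch, lam_target=-0.1, lam_reg=0.01, warmup=20):
        # Squared penalty pulling the estimated exponent toward lam_target,
        # linearly ramped over the warmup epochs, matching the header above.
        ramp = min(epoch / warmup, 1.0)
        return lam_reg * ramp * (lam - lam_target) ** 2

In a training step this would enter the total loss as something like
loss = task_loss + lyapunov_penalty(estimate_lyapunov(blocks, x), epoch),
where blocks is a hypothetical list of the model's residual blocks.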

============================================================
Depth = 4 conv layers (4 stages × 1 block)
============================================================
    Vanilla: depth=4, params=1,756,836
      Epoch  10: train=0.498 test=0.419  σ=9.41e-01/3.52e-08
      Epoch  20: train=0.628 test=0.476  σ=5.85e-01/2.43e-08
      Epoch  30: train=0.704 test=0.536  σ=4.86e-01/2.02e-08
      Epoch  40: train=0.756 test=0.544  σ=4.13e-01/1.73e-08
      Epoch  50: train=0.800 test=0.569  σ=3.81e-01/1.57e-08
      Epoch  60: train=0.833 test=0.560  σ=3.37e-01/1.37e-08
      Epoch  70: train=0.863 test=0.585  σ=3.17e-01/1.29e-08
      Epoch  80: train=0.885 test=0.595  σ=3.04e-01/1.22e-08
      Epoch  90: train=0.904 test=0.601  σ=2.80e-01/1.08e-08
      Epoch 100: train=0.923 test=0.599  σ=2.68e-01/1.02e-08
      Epoch 110: train=0.935 test=0.613  σ=2.64e-01/9.79e-09
      Epoch 120: train=0.945 test=0.606  σ=2.43e-01/8.88e-09
      Epoch 130: train=0.948 test=0.612  σ=2.48e-01/9.01e-09
      Epoch 140: train=0.952 test=0.616  σ=2.24e-01/8.47e-09
      Epoch 150: train=0.952 test=0.616  σ=2.31e-01/8.63e-09
      Best test acc: 0.618
    Lyapunov: depth=4, params=1,756,836
      Epoch  10: train=0.461 test=0.286 λ=1.949 σ=9.11e-01/3.46e-08
      Epoch  20: train=0.458 test=0.010 λ=1.465 σ=5.22e-01/2.10e-08
      Epoch  30: train=0.513 test=0.017 λ=1.736 σ=4.33e-01/1.78e-08
      Epoch  40: train=0.558 test=0.010 λ=1.767 σ=3.64e-01/1.59e-08
      Epoch  50: train=0.592 test=0.010 λ=1.791 σ=3.31e-01/1.49e-08
      Epoch  60: train=0.627 test=0.016 λ=1.766 σ=3.16e-01/1.43e-08
      Epoch  70: train=0.658 test=0.011 λ=1.765 σ=3.10e-01/1.37e-08
      Epoch  80: train=0.681 test=0.015 λ=1.770 σ=2.97e-01/1.33e-08
      Epoch  90: train=0.705 test=0.012 λ=1.784 σ=2.85e-01/1.28e-08
      Epoch 100: train=0.730 test=0.012 λ=1.784 σ=2.86e-01/1.27e-08
      Epoch 110: train=0.747 test=0.013 λ=1.797 σ=2.87e-01/1.25e-08
      Epoch 120: train=0.757 test=0.014 λ=1.823 σ=2.73e-01/1.21e-08
      Epoch 130: train=0.771 test=0.013 λ=1.854 σ=2.70e-01/1.19e-08
      Epoch 140: train=0.772 test=0.013 λ=1.873 σ=2.67e-01/1.19e-08
      Epoch 150: train=0.777 test=0.012 λ=1.882 σ=2.76e-01/1.20e-08
      Best test acc: 0.333

============================================================
Depth = 8 conv layers (4 stages × 2 blocks)
============================================================
    Vanilla: depth=8, params=4,892,196
      Epoch  10: train=0.382 test=0.338  σ=9.40e-01/3.24e-08
      Epoch  20: train=0.545 test=0.436  σ=4.81e-01/2.17e-08
      Epoch  30: train=0.636 test=0.464  σ=3.88e-01/1.80e-08
      Epoch  40: train=0.695 test=0.507  σ=3.33e-01/1.58e-08
      Epoch  50: train=0.752 test=0.506  σ=3.07e-01/1.39e-08
      Epoch  60: train=0.793 test=0.520  σ=2.96e-01/1.29e-08
      Epoch  70: train=0.834 test=0.517  σ=2.68e-01/1.16e-08
      Epoch  80: train=0.870 test=0.524  σ=2.49e-01/1.06e-08
      Epoch  90: train=0.899 test=0.526  σ=2.41e-01/9.69e-09
      Epoch 100: train=0.917 test=0.527  σ=2.36e-01/9.43e-09
      Epoch 110: train=0.931 test=0.534  σ=2.25e-01/8.64e-09
      Epoch 120: train=0.945 test=0.535  σ=2.08e-01/7.82e-09
      Epoch 130: train=0.951 test=0.530  σ=2.02e-01/7.38e-09
      Epoch 140: train=0.954 test=0.535  σ=2.02e-01/7.62e-09
      Epoch 150: train=0.957 test=0.520  σ=2.01e-01/7.60e-09
      Best test acc: 0.543
    Lyapunov: depth=8, params=4,892,196
      Epoch  10: train=0.046 test=0.010 λ=1.570 σ=4.09e-01/1.23e-08
      Epoch  20: train=0.062 test=0.010 λ=1.569 σ=2.46e-01/7.84e-09
      Epoch  30: train=0.069 test=0.010 λ=1.534 σ=1.81e-01/6.62e-09
      Epoch  40: train=0.046 test=0.010 λ=1.562 σ=1.49e-01/4.37e-09
      Epoch  50: train=0.057 test=0.010 λ=1.531 σ=1.53e-01/4.61e-09
      Epoch  60: train=0.040 test=0.010 λ=1.538 σ=1.53e-01/3.35e-09
      Epoch  70: train=0.046 test=0.010 λ=1.536 σ=1.19e-01/1.75e-09
      Epoch  80: train=0.050 test=0.010 λ=1.534 σ=1.19e-01/2.22e-09
      Epoch  90: train=0.062 test=0.010 λ=1.556 σ=1.18e-01/3.98e-09
      Epoch 100: train=0.048 test=0.010 λ=1.530 σ=1.14e-01/1.46e-09
      Epoch 110: train=0.055 test=0.010 λ=1.534 σ=1.11e-01/3.03e-09
      Epoch 120: train=0.075 test=0.010 λ=1.539 σ=1.12e-01/4.79e-09
      Epoch 130: train=0.079 test=0.010 λ=1.593 σ=1.20e-01/4.96e-09
      Epoch 140: train=0.076 test=0.010 λ=1.584 σ=1.13e-01/4.96e-09
      Epoch 150: train=0.077 test=0.010 λ=1.583 σ=1.15e-01/4.98e-09
      Best test acc: 0.014

============================================================
Depth = 12 conv layers (4 stages × 3 blocks)
============================================================
    Vanilla: depth=12, params=8,027,556
      Epoch  10: train=0.216 test=0.059  σ=7.22e-01/2.38e-08
      Epoch  20: train=0.291 test=0.044  σ=3.35e-01/1.60e-08
      Epoch  30: train=0.339 test=0.048  σ=2.71e-01/1.39e-08
      Epoch  40: train=0.377 test=0.055  σ=2.37e-01/1.27e-08
      Epoch  50: train=0.412 test=0.040  σ=2.25e-01/1.23e-08
      Epoch  60: train=0.440 test=0.044  σ=2.24e-01/1.23e-08
      Epoch  70: train=0.471 test=0.048  σ=2.28e-01/1.19e-08
      Epoch  80: train=0.497 test=0.060  σ=2.25e-01/1.23e-08
      Epoch  90: train=0.533 test=0.069  σ=2.24e-01/1.19e-08
      Epoch 100: train=0.563 test=0.079  σ=2.24e-01/1.20e-08
      Epoch 110: train=0.580 test=0.058  σ=2.28e-01/1.19e-08
      Epoch 120: train=0.602 test=0.056  σ=2.30e-01/1.19e-08
      Epoch 130: train=0.608 test=0.070  σ=2.29e-01/1.18e-08
      Epoch 140: train=0.616 test=0.068  σ=2.27e-01/1.18e-08
      Epoch 150: train=0.620 test=0.064  σ=2.28e-01/1.22e-08
      Best test acc: 0.079
    Lyapunov: depth=12, params=8,027,556
      Epoch  10: train=0.017 test=0.010 λ=1.584 σ=2.89e-01/5.97e-12
      Epoch  20: train=0.012 test=0.010 λ=1.566 σ=2.21e-01/1.75e-20
      Epoch  30: train=0.012 test=0.010 λ=1.567 σ=3.65e-01/7.23e-20
      Epoch  40: train=0.021 test=0.010 λ=1.623 σ=2.45e-01/8.70e-13
      Epoch  50: train=0.022 test=0.010 λ=1.660 σ=1.84e-01/9.38e-13
      Epoch  60: train=0.020 test=0.010 λ=1.695 σ=1.61e-01/5.37e-13
      Epoch  70: train=0.019 test=0.010 λ=1.635 σ=1.40e-01/1.78e-12
      Epoch  80: train=0.018 test=0.010 λ=1.641 σ=1.37e-01/2.32e-12
      Epoch  90: train=0.025 test=0.010 λ=1.637 σ=1.37e-01/1.13e-09
      Epoch 100: train=0.027 test=0.010 λ=1.684 σ=1.29e-01/1.39e-09
      Epoch 110: train=0.022 test=0.010 λ=1.779 σ=1.13e-01/1.11e-10
      Epoch 120: train=0.022 test=0.010 λ=1.769 σ=1.08e-01/1.12e-11
      Epoch 130: train=0.021 test=0.010 λ=1.888 σ=9.60e-02/3.75e-12
      Epoch 140: train=0.021 test=0.010 λ=1.788 σ=1.00e-01/9.24e-12
      Epoch 150: train=0.022 test=0.010 λ=1.799 σ=9.76e-02/4.48e-12
      Best test acc: 0.010

============================================================
Depth = 16 conv layers (4 stages × 4 blocks)
============================================================
    Vanilla: depth=16, params=11,162,916
      Epoch  10: train=0.091 test=0.011  σ=4.40e-01/1.32e-08
      Epoch  20: train=0.133 test=0.015  σ=2.83e-01/1.07e-08
      Epoch  30: train=0.156 test=0.018  σ=2.23e-01/9.48e-09
      Epoch  40: train=0.177 test=0.022  σ=2.04e-01/9.14e-09
      Epoch  50: train=0.191 test=0.024  σ=1.78e-01/8.86e-09
      Epoch  60: train=0.203 test=0.031  σ=1.74e-01/9.04e-09
      Epoch  70: train=0.219 test=0.026  σ=1.62e-01/8.97e-09
      Epoch  80: train=0.229 test=0.032  σ=1.63e-01/8.94e-09
      Epoch  90: train=0.242 test=0.031  σ=1.60e-01/9.16e-09
      Epoch 100: train=0.251 test=0.027  σ=1.62e-01/9.14e-09
      Epoch 110: train=0.259 test=0.032  σ=1.58e-01/9.11e-09
      Epoch 120: train=0.264 test=0.028  σ=1.64e-01/9.10e-09
      Epoch 130: train=0.271 test=0.029  σ=1.61e-01/9.33e-09
      Epoch 140: train=0.272 test=0.031  σ=1.64e-01/9.34e-09
      Epoch 150: train=0.272 test=0.028  σ=1.66e-01/9.31e-09
      Best test acc: 0.035
    Lyapunov: depth=16, params=11,162,916
      Epoch  10: train=0.014 test=0.010 λ=1.722 σ=2.76e-01/4.41e-13
      Epoch  20: train=0.010 test=0.010 λ=1.723 σ=3.64e-01/5.20e-17
      Epoch  30: train=0.011 test=0.010 λ=1.721 σ=8.95e-02/2.45e-17
      Epoch  40: train=0.012 test=0.010 λ=1.787 σ=1.74e-01/5.48e-14
      Epoch  50: train=0.014 test=0.010 λ=1.672 σ=1.88e-01/1.05e-14
      Epoch  60: train=0.011 test=0.010 λ=1.976 σ=9.53e-02/1.33e-14
      Epoch  70: train=0.011 test=0.010 λ=1.787 σ=9.06e-02/1.54e-14
      Epoch  80: train=0.012 test=0.011 λ=1.825 σ=1.01e-01/4.31e-14
      Epoch  90: train=0.010 test=0.010 λ=1.829 σ=1.48e-01/4.61e-13
      Epoch 100: train=0.010 test=0.010 λ=1.605 σ=1.04e-01/1.42e-13
      Epoch 110: train=0.010 test=0.010 λ=1.615 σ=1.21e-01/1.69e-14
      Epoch 120: train=0.009 test=0.010 λ=1.613 σ=1.09e-01/1.04e-14
      Epoch 130: train=0.010 test=0.010 λ=1.604 σ=5.06e-02/2.83e-24
      Epoch 140: train=0.010 test=0.010 λ=1.622 σ=5.64e-02/0.00e+00
      Epoch 150: train=0.010 test=0.010 λ=1.584 σ=2.54e-02/0.00e+00
      Best test acc: 0.014

====================================================================================================
DEPTH SCALING RESULTS: CIFAR100
====================================================================================================
Depth    Vanilla Acc  Lyapunov Acc Δ Acc    Lyap λ     Van ∇norm    Lyap ∇norm   Van κ      (values at epoch 150)
----------------------------------------------------------------------------------------------------
4        0.616        0.012        -0.603   1.882      4.59e-01     6.55e-01     2.2e+08   
8        0.520        0.010        -0.510   1.583      3.83e-01     3.29e-01     2.8e+08   
12       0.064        0.010        -0.054   1.799      6.38e-01     2.04e-01     2.3e+07   
16       0.028        0.010        -0.018   1.584      5.05e-01     3.21e-01     2.1e+07   
====================================================================================================

GRADIENT HEALTH ANALYSIS:
  Depth 4: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
  Depth 8: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
  Depth 12: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
  Depth 16: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
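
The κ column is consistent with a max/min ratio over per-tensor gradient norms
(the σ=a/b pairs printed each epoch look like exactly such extremes). A sketch
of the kind of check that could emit the warnings above; this is an assumption,
the actual implementation is not shown in this log:

    import torch

    def gradient_health(model, threshold=1e6):
        # Ratio of largest to smallest per-tensor gradient norm, compared
        # against the 1e6 threshold used in the warnings above.
        norms = [p.grad.norm().item() for p in model.parameters()
                 if p.grad is not None]
        sigma_max, sigma_min = max(norms), min(norms)
        kappa = sigma_max / max(sigma_min, 1e-30)  # guard divide-by-zero
        if kappa > threshold:
            print(f"⚠️ ill-conditioned gradients (κ = {kappa:.1e} > {threshold:.0e})")
        return kappa

    # Called after loss.backward(). Note the depth-16 Lyapunov run ends with
    # σ = 2.54e-02 / 0.00e+00, i.e. a fully vanished gradient direction.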


KEY OBSERVATIONS:
  Vanilla  4→16 layers: -0.588 accuracy change
  Lyapunov 4→16 layers: -0.002 accuracy change
  Caveat: the near-flat Lyapunov curve reflects collapse, not scaling. From
  depth 8 onward every Lyapunov run sits at chance accuracy (0.010 on 100
  classes), so the small 4→16 change is not evidence of better depth scaling
  in this weak-regularization setting.
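
The 4→16 deltas are endpoint differences over the final-epoch accuracy column
of the results table, reproduced here from the table values:

    # Final-epoch test accuracies from the results table above (depths 4 and 16).
    final_acc = {"Vanilla":  {4: 0.616, 16: 0.028},
                 "Lyapunov": {4: 0.012, 16: 0.010}}
    for name, acc in final_acc.items():
        print(f"{name:8s} 4→16 layers: {acc[16] - acc[4]:+.3f} accuracy change")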

Results saved to runs/depth_scaling_weak_reg/cifar100_20260102-133933
============================================================
Finished: Fri Jan  2 13:39:37 CST 2026
============================================================