path: root/runs/slurm_logs/14632851_hinge.out
blob: 74e3d48b284f915266324dd1a48d1b2f4c6e6200
============================================================
HINGE LOSS Lyapunov Regularization
Job ID: 14632851 | Node: gpub050
Start: Wed Dec 31 10:16:37 CST 2025
============================================================
NVIDIA A40, 46068 MiB
============================================================
================================================================================
DEPTH SCALING BENCHMARK
================================================================================
Dataset: cifar100
Depths: [4, 8, 12, 16]
Timesteps: 4
Epochs: 150
λ_reg: 0.3, λ_target: -0.1
Reg type: hinge, Warmup epochs: 20
Device: cuda
================================================================================

Loading cifar100...
Classes: 100, Input: (3, 32, 32)
Train: 50000, Test: 10000

Depth configurations: [(4, '4×1'), (8, '4×2'), (12, '4×3'), (16, '4×4')]
Regularization type: hinge
Warmup epochs: 20

============================================================
Depth = 4 conv layers (4 stages × 1 blocks)
============================================================
    Vanilla: depth=4, params=1,756,836
      Epoch  10: train=0.491 test=0.381  σ=9.60e-01/3.58e-08
      Epoch  20: train=0.629 test=0.483  σ=5.82e-01/2.43e-08
      Epoch  30: train=0.705 test=0.553  σ=4.88e-01/2.04e-08
      Epoch  40: train=0.754 test=0.564  σ=4.23e-01/1.75e-08
      Epoch  50: train=0.797 test=0.572  σ=3.67e-01/1.54e-08
      Epoch  60: train=0.830 test=0.585  σ=3.46e-01/1.43e-08
      Epoch  70: train=0.861 test=0.591  σ=3.17e-01/1.26e-08
      Epoch  80: train=0.883 test=0.600  σ=2.94e-01/1.17e-08
      Epoch  90: train=0.904 test=0.603  σ=2.84e-01/1.10e-08
      Epoch 100: train=0.920 test=0.607  σ=2.68e-01/9.90e-09
      Epoch 110: train=0.933 test=0.615  σ=2.64e-01/9.87e-09
      Epoch 120: train=0.941 test=0.610  σ=2.47e-01/9.35e-09
      Epoch 130: train=0.947 test=0.615  σ=2.40e-01/8.75e-09
      Epoch 140: train=0.949 test=0.613  σ=2.43e-01/8.69e-09
      Epoch 150: train=0.950 test=0.612  σ=2.42e-01/8.41e-09
      Best test acc: 0.618
    Lyapunov: depth=4, params=1,756,836
      Epoch  10: train=0.061 test=0.010 λ=1.562 σ=5.93e-01/1.80e-08
      Epoch  20: train=0.010 test=0.010 λ=1.431 σ=2.01e-01/4.78e-11
      Epoch  30: train=0.009 test=0.010 λ=1.441 σ=4.39e-02/0.00e+00
      Epoch  40: train=0.010 test=0.010 λ=1.460 σ=2.30e-02/0.00e+00
      Epoch  50: train=0.009 test=0.010 λ=1.466 σ=2.12e-02/0.00e+00
      Epoch  60: train=0.009 test=0.010 λ=1.473 σ=1.82e-02/0.00e+00
      Epoch  70: train=0.010 test=0.010 λ=1.478 
      Epoch  80: train=0.009 test=0.010 λ=1.485 
      Epoch  90: train=0.009 test=0.010 λ=1.480 
      Epoch 100: train=0.009 test=0.010 λ=1.486 
      Epoch 110: train=0.009 test=0.010 λ=1.480 
      Epoch 120: train=0.009 test=0.010 λ=1.484 
      Epoch 130: train=0.009 test=0.010 λ=1.482 
      Epoch 140: train=0.009 test=0.010 λ=1.482 
      Epoch 150: train=0.010 test=0.010 λ=1.483 
      Best test acc: 0.086

============================================================
Depth = 8 conv layers (4 stages × 2 blocks)
============================================================
    Vanilla: depth=8, params=4,892,196
      Epoch  10: train=0.387 test=0.372  σ=8.41e-01/3.07e-08
      Epoch  20: train=0.548 test=0.442  σ=4.66e-01/2.12e-08
      Epoch  30: train=0.636 test=0.479  σ=3.74e-01/1.77e-08
      Epoch  40: train=0.701 test=0.507  σ=3.24e-01/1.54e-08
      Epoch  50: train=0.752 test=0.501  σ=3.10e-01/1.40e-08
      Epoch  60: train=0.797 test=0.517  σ=2.80e-01/1.21e-08
      Epoch  70: train=0.839 test=0.512  σ=2.65e-01/1.14e-08
      Epoch  80: train=0.870 test=0.517  σ=2.50e-01/1.05e-08
      Epoch  90: train=0.892 test=0.518  σ=2.40e-01/9.80e-09
      Epoch 100: train=0.916 test=0.521  σ=2.29e-01/9.12e-09
      Epoch 110: train=0.933 test=0.529  σ=2.20e-01/8.14e-09
      Epoch 120: train=0.945 test=0.538  σ=2.10e-01/7.94e-09
      Epoch 130: train=0.952 test=0.530  σ=2.06e-01/7.90e-09
      Epoch 140: train=0.955 test=0.533  σ=2.04e-01/7.30e-09
      Epoch 150: train=0.956 test=0.519  σ=2.03e-01/7.35e-09
      Best test acc: 0.539
    Lyapunov: depth=8, params=4,892,196
      Epoch  10: train=0.032 test=0.010 λ=1.539 σ=3.41e-01/8.08e-09
      Epoch  20: train=0.023 test=0.010 λ=1.554 σ=2.32e-01/2.95e-09
      Epoch  30: train=0.010 test=0.010 λ=1.511 σ=2.01e-01/3.19e-10
      Epoch  40: train=0.010 test=0.010 λ=1.525 σ=1.13e-01/7.10e-15
      Epoch  50: train=0.010 test=0.010 λ=1.533 σ=7.91e-02/2.05e-32
      Epoch  60: train=0.009 test=0.010 λ=1.537 σ=5.72e-02/0.00e+00
      Epoch  70: train=0.010 test=0.010 λ=1.540 σ=3.47e-02/0.00e+00
      Epoch  80: train=0.009 test=0.010 λ=1.542 
      Epoch  90: train=0.009 test=0.010 λ=1.542 σ=2.50e-02/0.00e+00
      Epoch 100: train=0.009 test=0.010 λ=1.545 σ=1.48e-03/0.00e+00
      Epoch 110: train=0.009 test=0.010 λ=1.542 
      Epoch 120: train=0.009 test=0.010 λ=1.544 
      Epoch 130: train=0.009 test=0.010 λ=1.542 
      Epoch 140: train=0.010 test=0.010 λ=1.542 
      Epoch 150: train=0.010 test=0.010 λ=1.542 
      Best test acc: 0.028

============================================================
Depth = 12 conv layers (4 stages × 3 blocks)
============================================================
    Vanilla: depth=12, params=8,027,556
      Epoch  10: train=0.212 test=0.049  σ=6.27e-01/2.25e-08
      Epoch  20: train=0.289 test=0.046  σ=3.31e-01/1.57e-08
      Epoch  30: train=0.335 test=0.066  σ=2.68e-01/1.36e-08
      Epoch  40: train=0.371 test=0.053  σ=2.37e-01/1.26e-08
      Epoch  50: train=0.404 test=0.039  σ=2.25e-01/1.22e-08
      Epoch  60: train=0.432 test=0.060  σ=2.23e-01/1.21e-08
      Epoch  70: train=0.464 test=0.054  σ=2.30e-01/1.19e-08
      Epoch  80: train=0.499 test=0.057  σ=2.25e-01/1.22e-08
      Epoch  90: train=0.524 test=0.056  σ=2.22e-01/1.20e-08
      Epoch 100: train=0.390 test=0.088  σ=2.28e-01/1.26e-08
      Epoch 110: train=0.502 test=0.043  σ=2.24e-01/1.19e-08
      Epoch 120: train=0.532 test=0.043  σ=2.29e-01/1.22e-08
      Epoch 130: train=0.548 test=0.046  σ=2.26e-01/1.20e-08
      Epoch 140: train=0.558 test=0.048  σ=2.28e-01/1.20e-08
      Epoch 150: train=0.558 test=0.042  σ=2.25e-01/1.21e-08
      Best test acc: 0.115
    Lyapunov: depth=12, params=8,027,556
      Epoch  10: train=0.010 test=0.010 λ=1.637 σ=6.08e-02/1.12e-13
      Epoch  20: train=0.009 test=0.010 λ=1.549 σ=2.84e-01/2.99e-09
      Epoch  30: train=0.010 test=0.010 λ=1.558 σ=1.03e-01/3.35e-15
      Epoch  40: train=0.010 test=0.010 λ=1.562 σ=1.67e-01/8.35e-10
      Epoch  50: train=0.010 test=0.010 λ=1.565 σ=5.77e-02/7.39e-40
      Epoch  60: train=0.010 test=0.010 λ=1.567 σ=3.33e-02/4.04e-19
      Epoch  70: train=0.010 test=0.010 λ=1.573 σ=5.17e-02/0.00e+00
      Epoch  80: train=0.009 test=0.010 λ=1.568 σ=2.22e-02/0.00e+00
      Epoch  90: train=0.009 test=0.010 λ=1.571 σ=5.06e-03/0.00e+00
      Epoch 100: train=0.009 test=0.010 λ=1.574 
      Epoch 110: train=0.009 test=0.010 λ=1.568 
      Epoch 120: train=0.010 test=0.010 λ=1.568 
      Epoch 130: train=0.010 test=0.010 λ=1.569 
      Epoch 140: train=0.010 test=0.010 λ=1.569 
      Epoch 150: train=0.010 test=0.010 λ=1.567 
      Best test acc: 0.013

============================================================
Depth = 16 conv layers (4 stages × 4 blocks)
============================================================
    Vanilla: depth=16, params=11,162,916
      Epoch  10: train=0.091 test=0.011  σ=4.40e-01/1.32e-08
      Epoch  20: train=0.134 test=0.013  σ=2.85e-01/1.08e-08
      Epoch  30: train=0.157 test=0.022  σ=2.23e-01/9.44e-09
      Epoch  40: train=0.178 test=0.025  σ=2.01e-01/8.99e-09
      Epoch  50: train=0.188 test=0.022  σ=1.84e-01/8.93e-09
      Epoch  60: train=0.201 test=0.027  σ=1.72e-01/8.67e-09
      Epoch  70: train=0.218 test=0.024  σ=1.61e-01/8.82e-09
      Epoch  80: train=0.227 test=0.025  σ=1.64e-01/8.80e-09
      Epoch  90: train=0.238 test=0.026  σ=1.57e-01/8.92e-09
      Epoch 100: train=0.249 test=0.026  σ=1.61e-01/9.00e-09
      Epoch 110: train=0.259 test=0.030  σ=1.58e-01/9.12e-09
      Epoch 120: train=0.263 test=0.028  σ=1.63e-01/9.20e-09
      Epoch 130: train=0.268 test=0.029  σ=1.59e-01/9.22e-09
      Epoch 140: train=0.272 test=0.029  σ=1.62e-01/9.16e-09
      Epoch 150: train=0.271 test=0.029  σ=1.66e-01/9.15e-09
      Best test acc: 0.032
    Lyapunov: depth=16, params=11,162,916
      Epoch  10: train=0.010 test=0.010 λ=1.686 σ=4.16e-01/5.36e-09
      Epoch  20: train=0.009 test=0.010 λ=1.575 σ=3.25e-01/3.25e-09
      Epoch  30: train=0.009 test=0.010 λ=1.582 σ=7.66e-04/0.00e+00
      Epoch  40: train=0.010 test=0.010 λ=1.584 σ=8.77e-02/3.12e-15
      Epoch  50: train=0.010 test=0.010 λ=1.595 σ=4.93e-02/2.68e-33
      Epoch  60: train=0.009 test=0.010 λ=1.589 σ=2.75e-02/0.00e+00
      Epoch  70: train=0.010 test=0.010 λ=1.589 σ=4.38e-02/0.00e+00
      Epoch  80: train=0.009 test=0.010 λ=1.585 σ=2.15e-02/0.00e+00
      Epoch  90: train=0.009 test=0.010 λ=1.585 σ=3.10e-02/0.00e+00
      Epoch 100: train=0.010 test=0.010 λ=1.585 σ=3.20e-02/0.00e+00
      Epoch 110: train=0.010 test=0.010 λ=1.584 σ=1.03e-02/0.00e+00
      Epoch 120: train=0.009 test=0.010 λ=1.587 
      Epoch 130: train=0.009 test=0.010 λ=1.586 
      Epoch 140: train=0.010 test=0.010 λ=1.585 
      Epoch 150: train=0.010 test=0.010 λ=1.585 
      Best test acc: 0.012

====================================================================================================
DEPTH SCALING RESULTS: CIFAR100
====================================================================================================
Depth    Vanilla Acc  Lyapunov Acc Δ Acc    Lyap λ     Van ∇norm    Lyap ∇norm   Van κ     
----------------------------------------------------------------------------------------------------
4        0.612        0.010        -0.602   1.483      4.68e-01     8.81e-02     4.9e+08   
8        0.519        0.010        -0.509   1.542      3.79e-01     1.49e-01     1.5e+09   
12       0.042        0.010        -0.032   1.567      6.45e-01     8.82e-02     3.5e+07   
16       0.029        0.010        -0.019   1.585      5.06e-01     1.78e-01     3.6e+08   
====================================================================================================

GRADIENT HEALTH ANALYSIS:
  Depth 4: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
  Depth 8: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
  Depth 12: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)
  Depth 16: ⚠️ Vanilla has ill-conditioned gradients (κ > 1e6)


KEY OBSERVATIONS:
  Vanilla  4→16 layers: -0.583 accuracy change
  Lyapunov 4→16 layers: +0.000 accuracy change
  ✓ Lyapunov regularization enables better depth scaling!

Results saved to runs/depth_scaling_hinge/cifar100_20260101-112306
============================================================
Finished: Thu Jan  1 11:23:09 CST 2026
============================================================