=== CIFAR-100 PROTOCOL VALIDATION ===
Start: Wed Apr 29 09:24:37 PM CDT 2026

--- BP + FA + DFA on CIFAR-100 ---
Using device: cuda:0

============================================================
Seed 42
============================================================

--- BP ---
  [BP] Epoch 1: loss=3.9752, train=0.0983, test=0.1432
  [BP] Epoch 10: loss=3.0848, train=0.2424, test=0.2492
  [BP] Epoch 20: loss=2.8041, train=0.2935, test=0.2883
  [BP] Epoch 30: loss=2.6019, train=0.3352, test=0.3078
  [BP] Epoch 40: loss=2.4193, train=0.3727, test=0.3158
  [BP] Epoch 50: loss=2.2631, train=0.4053, test=0.3160
  [BP] Epoch 60: loss=2.1134, train=0.4371, test=0.3223
  [BP] Epoch 70: loss=1.9686, train=0.4729, test=0.3207
  [BP] Epoch 80: loss=1.8724, train=0.4941, test=0.3197
  [BP] Epoch 90: loss=1.8161, train=0.5069, test=0.3197
  [BP] Epoch 100: loss=1.7897, train=0.5126, test=0.3192
  Final test acc: 0.3192
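
Note: the per-epoch lines above follow a regular `[METHOD] Epoch N: loss=..., train=..., test=...` layout, so they can be read back into structured records. The following is a minimal sketch, assuming that layout holds for every epoch line; the names `EPOCH_RE` and `parse_epoch_lines` are hypothetical and are not part of the script that produced this log.

```python
import re

# Matches lines such as "  [BP] Epoch 10: loss=3.0848, train=0.2424, test=0.2492".
EPOCH_RE = re.compile(
    r"\[(?P<method>[\w-]+)\] Epoch (?P<epoch>\d+): "
    r"loss=(?P<loss>[\d.]+), train=(?P<train>[\d.]+), test=(?P<test>[\d.]+)"
)

def parse_epoch_lines(text):
    """Return one dict per matching epoch line, with numeric fields converted."""
    rows = []
    for m in EPOCH_RE.finditer(text):
        rows.append({
            "method": m["method"],
            "epoch": int(m["epoch"]),
            "loss": float(m["loss"]),
            "train": float(m["train"]),
            "test": float(m["test"]),
        })
    return rows

# Two lines copied from the BP run for seed 42.
sample = """
  [BP] Epoch 1: loss=3.9752, train=0.0983, test=0.1432
  [BP] Epoch 100: loss=1.7897, train=0.5126, test=0.3192
"""
rows = parse_epoch_lines(sample)

# Train/test gap at the last parsed epoch, a rough indicator of overfitting.
gap = rows[-1]["train"] - rows[-1]["test"]
print(rows[-1]["epoch"], round(gap, 4))
```

The same pattern does not match the `ep N: test_acc=...` lines of the frozen-baseline runs further down; those would need a separate expression.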

--- DFA ---
  [DFA] Epoch 1: loss=4.1736, train=0.0679, test=0.0775
  [DFA] Epoch 10: loss=4.0844, train=0.0798, test=0.0819
  [DFA] Epoch 20: loss=4.0627, train=0.0840, test=0.0759
  [DFA] Epoch 30: loss=4.0466, train=0.0876, test=0.0840
  [DFA] Epoch 40: loss=4.0357, train=0.0896, test=0.0862
  [DFA] Epoch 50: loss=4.0347, train=0.0909, test=0.0879
  [DFA] Epoch 60: loss=4.0298, train=0.0933, test=0.0879
  [DFA] Epoch 70: loss=4.0244, train=0.0958, test=0.0883
  [DFA] Epoch 80: loss=4.0232, train=0.0939, test=0.0871
  [DFA] Epoch 90: loss=4.0219, train=0.0962, test=0.0870
  [DFA] Epoch 100: loss=4.0244, train=0.0949, test=0.0875
  Final test acc: 0.0875

--- FA ---
  [FA] Epoch 1: loss=4.1842, train=0.0639, test=0.0598
  [FA] Epoch 10: loss=3.9551, train=0.0978, test=0.0949
  [FA] Epoch 20: loss=3.8745, train=0.1103, test=0.1101
  [FA] Epoch 30: loss=3.8457, train=0.1160, test=0.1212
  [FA] Epoch 40: loss=3.7975, train=0.1235, test=0.1247
  [FA] Epoch 50: loss=3.7623, train=0.1290, test=0.1332
  [FA] Epoch 60: loss=3.7338, train=0.1341, test=0.1397
  [FA] Epoch 70: loss=3.7109, train=0.1404, test=0.1400
  [FA] Epoch 80: loss=3.6910, train=0.1426, test=0.1457
  [FA] Epoch 90: loss=3.6844, train=0.1436, test=0.1455
  [FA] Epoch 100: loss=3.6859, train=0.1440, test=0.1464
  Final test acc: 0.1464

============================================================
Seed 123
============================================================

--- BP ---
  [BP] Epoch 1: loss=3.9679, train=0.0986, test=0.1439
  [BP] Epoch 10: loss=3.0754, train=0.2440, test=0.2501
  [BP] Epoch 20: loss=2.8025, train=0.2938, test=0.2812
  [BP] Epoch 30: loss=2.5874, train=0.3376, test=0.3021
  [BP] Epoch 40: loss=2.4113, train=0.3699, test=0.3104
  [BP] Epoch 50: loss=2.2468, train=0.4084, test=0.3160
  [BP] Epoch 60: loss=2.1034, train=0.4373, test=0.3209
  [BP] Epoch 70: loss=1.9664, train=0.4711, test=0.3212
  [BP] Epoch 80: loss=1.8659, train=0.4913, test=0.3208
  [BP] Epoch 90: loss=1.8143, train=0.5098, test=0.3201
  [BP] Epoch 100: loss=1.7758, train=0.5158, test=0.3218
  Final test acc: 0.3218

--- DFA ---
  [DFA] Epoch 1: loss=4.1790, train=0.0644, test=0.0808
  [DFA] Epoch 10: loss=4.1013, train=0.0738, test=0.0764
  [DFA] Epoch 20: loss=4.0720, train=0.0808, test=0.0803
  [DFA] Epoch 30: loss=4.0493, train=0.0865, test=0.0845
  [DFA] Epoch 40: loss=4.0403, train=0.0866, test=0.0855
  [DFA] Epoch 50: loss=4.0321, train=0.0897, test=0.0852
  [DFA] Epoch 60: loss=4.0243, train=0.0921, test=0.0856
  [DFA] Epoch 70: loss=4.0213, train=0.0924, test=0.0868
  [DFA] Epoch 80: loss=4.0207, train=0.0933, test=0.0867
  [DFA] Epoch 90: loss=4.0178, train=0.0948, test=0.0875
  [DFA] Epoch 100: loss=4.0181, train=0.0932, test=0.0872
  Final test acc: 0.0872

--- FA ---
  [FA] Epoch 1: loss=4.1971, train=0.0632, test=0.0708
  [FA] Epoch 10: loss=4.0477, train=0.0854, test=0.0847
  [FA] Epoch 20: loss=3.9867, train=0.0968, test=0.0997
  [FA] Epoch 30: loss=3.9504, train=0.1036, test=0.1037
  [FA] Epoch 40: loss=3.9204, train=0.1070, test=0.1068
  [FA] Epoch 50: loss=3.8915, train=0.1107, test=0.1091
  [FA] Epoch 60: loss=3.8680, train=0.1147, test=0.1135
  [FA] Epoch 70: loss=3.8517, train=0.1166, test=0.1156
  [FA] Epoch 80: loss=3.8433, train=0.1188, test=0.1182
  [FA] Epoch 90: loss=3.8342, train=0.1202, test=0.1215
  [FA] Epoch 100: loss=3.8330, train=0.1228, test=0.1208
  Final test acc: 0.1208

============================================================
Seed 456
============================================================

--- BP ---
  [BP] Epoch 1: loss=3.9722, train=0.0978, test=0.1436
  [BP] Epoch 10: loss=3.0679, train=0.2433, test=0.2496
  [BP] Epoch 20: loss=2.7902, train=0.2983, test=0.2857
  [BP] Epoch 30: loss=2.5920, train=0.3374, test=0.3018
  [BP] Epoch 40: loss=2.4046, train=0.3747, test=0.3166
  [BP] Epoch 50: loss=2.2421, train=0.4090, test=0.3165
  [BP] Epoch 60: loss=2.0908, train=0.4420, test=0.3204
  [BP] Epoch 70: loss=1.9548, train=0.4750, test=0.3202
  [BP] Epoch 80: loss=1.8580, train=0.4973, test=0.3177
  [BP] Epoch 90: loss=1.8029, train=0.5128, test=0.3217
  [BP] Epoch 100: loss=1.7769, train=0.5179, test=0.3219
  Final test acc: 0.3219

--- DFA ---
  [DFA] Epoch 1: loss=4.1619, train=0.0684, test=0.0832
  [DFA] Epoch 10: loss=4.0780, train=0.0790, test=0.0777
  [DFA] Epoch 20: loss=4.0602, train=0.0848, test=0.0813
  [DFA] Epoch 30: loss=4.0430, train=0.0885, test=0.0878
  [DFA] Epoch 40: loss=4.0391, train=0.0893, test=0.0872
  [DFA] Epoch 50: loss=4.0372, train=0.0914, test=0.0834
  [DFA] Epoch 60: loss=4.0358, train=0.0919, test=0.0884
  [DFA] Epoch 70: loss=4.0340, train=0.0928, test=0.0906
  [DFA] Epoch 80: loss=4.0334, train=0.0926, test=0.0879
  [DFA] Epoch 90: loss=4.0325, train=0.0935, test=0.0898
  [DFA] Epoch 100: loss=4.0329, train=0.0929, test=0.0894
  Final test acc: 0.0894

--- FA ---
  [FA] Epoch 1: loss=4.2178, train=0.0611, test=0.0534
  [FA] Epoch 10: loss=3.9339, train=0.1008, test=0.0999
  [FA] Epoch 20: loss=3.8903, train=0.1079, test=0.1125
  [FA] Epoch 30: loss=3.8439, train=0.1169, test=0.1138
  [FA] Epoch 40: loss=3.8094, train=0.1220, test=0.1228
  [FA] Epoch 50: loss=3.7933, train=0.1252, test=0.1240
  [FA] Epoch 60: loss=3.7808, train=0.1273, test=0.1275
  [FA] Epoch 70: loss=3.7675, train=0.1281, test=0.1252
  [FA] Epoch 80: loss=3.7592, train=0.1312, test=0.1307
  [FA] Epoch 90: loss=3.7554, train=0.1333, test=0.1311
  [FA] Epoch 100: loss=3.7508, train=0.1319, test=0.1310
  Final test acc: 0.1310

All results saved to results/cifar100_protocol_validation/results_cifar100.json
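
Note: the three seeds can be condensed into a mean and a sample standard deviation per method. This is a small sketch that uses only the "Final test acc" values printed above; the dictionary layout is an assumption for illustration and may not match the structure of `results_cifar100.json`.

```python
from statistics import mean, stdev

# Final test accuracies copied from the "Final test acc" lines above,
# ordered by seed (42, 123, 456).
final_acc = {
    "BP":  [0.3192, 0.3218, 0.3219],
    "DFA": [0.0875, 0.0872, 0.0894],
    "FA":  [0.1464, 0.1208, 0.1310],
}

# Sample standard deviation (n - 1 in the denominator), since n is only 3.
summary = {m: (mean(v), stdev(v)) for m, v in final_acc.items()}
for m, (mu, sd) in summary.items():
    print(f"{m:>3}: mean={mu:.4f}  std={sd:.4f}  (n={len(final_acc[m])})")
```

With three seeds the standard deviation is only a rough spread estimate; FA shows the widest seed-to-seed variation of the three methods.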

--- Frozen baseline on CIFAR-100 ---
  Frozen baseline seed=42 (Wed Apr 29 10:19:56 PM CDT 2026)
Device: cuda:0, seed=42, epochs=100, dataset=cifar100

=== BP shallow (ResMLP num_blocks=0), seed=42 ===
  n_params: 812900 (812900 trainable)
  [BP-shallow] ep 1: test_acc=0.1119
  [BP-shallow] ep 10: test_acc=0.1320
  [BP-shallow] ep 20: test_acc=0.1286
  [BP-shallow] ep 30: test_acc=0.1324
  [BP-shallow] ep 40: test_acc=0.1407
  [BP-shallow] ep 50: test_acc=0.1555
  [BP-shallow] ep 60: test_acc=0.1599
  [BP-shallow] ep 70: test_acc=0.1706
  [BP-shallow] ep 80: test_acc=0.1738
  [BP-shallow] ep 90: test_acc=0.1780
  [BP-shallow] ep 100: test_acc=0.1787
FINAL BP-shallow: 0.1787

=== BP frozen-blocks (ResMLP num_blocks=4, blocks frozen), seed=42 ===
  n_params: 1341284 (812900 trainable)
  [BP-frozen] ep 1: test_acc=0.1109
  [BP-frozen] ep 10: test_acc=0.1313
  [BP-frozen] ep 20: test_acc=0.1252
  [BP-frozen] ep 30: test_acc=0.1271
  [BP-frozen] ep 40: test_acc=0.1338
  [BP-frozen] ep 50: test_acc=0.1557
  [BP-frozen] ep 60: test_acc=0.1613
  [BP-frozen] ep 70: test_acc=0.1713
  [BP-frozen] ep 80: test_acc=0.1751
  [BP-frozen] ep 90: test_acc=0.1764
  [BP-frozen] ep 100: test_acc=0.1770
FINAL BP-frozen-blocks: 0.1770
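
Note: the `n_params` lines allow a quick consistency check on the frozen configuration. The sketch below only does arithmetic on the two printed counts; that the four blocks are equally sized is an assumption, which the zero remainder supports but does not prove.

```python
# Parameter counts copied from the "n_params" lines above.
total_params = 1_341_284      # ResMLP num_blocks=4, blocks frozen
trainable_params = 812_900    # identical to the num_blocks=0 model's count
num_blocks = 4

# Everything that is not trainable sits in the frozen blocks.
frozen_params = total_params - trainable_params
per_block, remainder = divmod(frozen_params, num_blocks)

print(f"frozen: {frozen_params}, per block: {per_block}, remainder: {remainder}")
print(f"trainable fraction: {trainable_params / total_params:.3f}")
```

Because the trainable count equals the shallow model's total, the frozen runs update the same number of parameters as the shallow runs; the frozen blocks add 528384 fixed parameters on top.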

=== DFA shallow (ResMLP num_blocks=0), seed=42 ===
  n_params: 812900 (812900 trainable)
  [DFA-shallow] ep 1: test_acc=0.0914
  [DFA-shallow] ep 10: test_acc=0.1120
  [DFA-shallow] ep 20: test_acc=0.1130
  [DFA-shallow] ep 30: test_acc=0.1198
  [DFA-shallow] ep 40: test_acc=0.1170
  [DFA-shallow] ep 50: test_acc=0.1211
  [DFA-shallow] ep 60: test_acc=0.1248
  [DFA-shallow] ep 70: test_acc=0.1203
  [DFA-shallow] ep 80: test_acc=0.1248
  [DFA-shallow] ep 90: test_acc=0.1254
  [DFA-shallow] ep 100: test_acc=0.1255
FINAL DFA-shallow: 0.1255

=== DFA frozen-blocks (ResMLP num_blocks=4, blocks frozen), seed=42 ===
  n_params: 1341284 (812900 trainable)
  [DFA-frozen] ep 1: test_acc=0.0920
  [DFA-frozen] ep 10: test_acc=0.1004
  [DFA-frozen] ep 20: test_acc=0.1171
  [DFA-frozen] ep 30: test_acc=0.1141
  [DFA-frozen] ep 40: test_acc=0.1207
  [DFA-frozen] ep 50: test_acc=0.1208
  [DFA-frozen] ep 60: test_acc=0.1204
  [DFA-frozen] ep 70: test_acc=0.1235
  [DFA-frozen] ep 80: test_acc=0.1243
  [DFA-frozen] ep 90: test_acc=0.1262
  [DFA-frozen] ep 100: test_acc=0.1256
FINAL DFA-frozen-blocks: 0.1256

=== ResMLP frozen/shallow baseline summary, seed=42 ===
  BP-shallow:    0.1787
  BP-frozen:     0.1770
  DFA-shallow:   0.1255
  DFA-frozen:    0.1256

Compare to trainable 4-block ResMLP (3-seed): BP=0.6147 100ep / 0.585 30ep, DFA=0.306 100ep / 0.301 30ep

Interpretation:
  If DFA-frozen ≈ DFA-trainable: blocks are passengers, walk-back parallels ViT
  If DFA-frozen << DFA-trainable: ResMLP DFA actually trains the blocks (interesting contrast with ViT)
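
Note: the two-way rule above can be written as a small decision function. This is a sketch only: the function name `interpret` and the 10% relative tolerance standing in for "≈" are hypothetical choices, and the trainable reference is taken as printed in the "Compare to" line, whose dataset the log does not state.

```python
def interpret(dfa_frozen, dfa_trainable, rel_tol=0.10):
    """Apply the two-way rule from the Interpretation lines.

    rel_tol is a hypothetical threshold: a frozen accuracy within 10% of
    the trainable accuracy (or above it) counts as "approximately equal".
    """
    if dfa_frozen >= dfa_trainable * (1.0 - rel_tol):
        return "blocks are passengers"
    return "DFA trains the blocks"

# DFA-frozen from the seed=42 summary; trainable reference is the 100-epoch
# figure as printed in the "Compare to" line.
verdict = interpret(dfa_frozen=0.1256, dfa_trainable=0.306)
print(verdict)
```

The verdict depends entirely on which trainable reference is used, so the reference should come from runs on the same dataset and architecture as the frozen baseline.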

  Frozen baseline seed=123 (Wed Apr 29 10:59:17 PM CDT 2026)
Device: cuda:0, seed=123, epochs=100, dataset=cifar100

=== BP shallow (ResMLP num_blocks=0), seed=123 ===
  n_params: 812900 (812900 trainable)
  [BP-shallow] ep 1: test_acc=0.1098
  [BP-shallow] ep 10: test_acc=0.1309
  [BP-shallow] ep 20: test_acc=0.1203
  [BP-shallow] ep 30: test_acc=0.1262
  [BP-shallow] ep 40: test_acc=0.1415
  [BP-shallow] ep 50: test_acc=0.1532
  [BP-shallow] ep 60: test_acc=0.1622
  [BP-shallow] ep 70: test_acc=0.1725
  [BP-shallow] ep 80: test_acc=0.1751
  [BP-shallow] ep 90: test_acc=0.1745
  [BP-shallow] ep 100: test_acc=0.1756
FINAL BP-shallow: 0.1756

=== BP frozen-blocks (ResMLP num_blocks=4, blocks frozen), seed=123 ===
  n_params: 1341284 (812900 trainable)
  [BP-frozen] ep 1: test_acc=0.1100
  [BP-frozen] ep 10: test_acc=0.1328
  [BP-frozen] ep 20: test_acc=0.1256
  [BP-frozen] ep 30: test_acc=0.1333
  [BP-frozen] ep 40: test_acc=0.1411
  [BP-frozen] ep 50: test_acc=0.1596
  [BP-frozen] ep 60: test_acc=0.1638
  [BP-frozen] ep 70: test_acc=0.1720
  [BP-frozen] ep 80: test_acc=0.1737
  [BP-frozen] ep 90: test_acc=0.1769
  [BP-frozen] ep 100: test_acc=0.1777
FINAL BP-frozen-blocks: 0.1777

=== DFA shallow (ResMLP num_blocks=0), seed=123 ===
  n_params: 812900 (812900 trainable)
  [DFA-shallow] ep 1: test_acc=0.0928
  [DFA-shallow] ep 10: test_acc=0.1025
  [DFA-shallow] ep 20: test_acc=0.1146
  [DFA-shallow] ep 30: test_acc=0.1180
  [DFA-shallow] ep 40: test_acc=0.1239
  [DFA-shallow] ep 50: test_acc=0.1283
  [DFA-shallow] ep 60: test_acc=0.1204
  [DFA-shallow] ep 70: test_acc=0.1237
  [DFA-shallow] ep 80: test_acc=0.1261
  [DFA-shallow] ep 90: test_acc=0.1261
  [DFA-shallow] ep 100: test_acc=0.1269
FINAL DFA-shallow: 0.1269

=== DFA frozen-blocks (ResMLP num_blocks=4, blocks frozen), seed=123 ===
  n_params: 1341284 (812900 trainable)
  [DFA-frozen] ep 1: test_acc=0.0916
  [DFA-frozen] ep 10: test_acc=0.1060
  [DFA-frozen] ep 20: test_acc=0.1167
  [DFA-frozen] ep 30: test_acc=0.1125
  [DFA-frozen] ep 40: test_acc=0.1153
  [DFA-frozen] ep 50: test_acc=0.1237
  [DFA-frozen] ep 60: test_acc=0.1217
  [DFA-frozen] ep 70: test_acc=0.1254
  [DFA-frozen] ep 80: test_acc=0.1239
  [DFA-frozen] ep 90: test_acc=0.1254
  [DFA-frozen] ep 100: test_acc=0.1257
FINAL DFA-frozen-blocks: 0.1257

=== ResMLP frozen/shallow baseline summary, seed=123 ===
  BP-shallow:    0.1756
  BP-frozen:     0.1777
  DFA-shallow:   0.1269
  DFA-frozen:    0.1257

  Frozen baseline seed=456 (Wed Apr 29 11:38:34 PM CDT 2026)
Device: cuda:0, seed=456, epochs=100, dataset=cifar100

=== BP shallow (ResMLP num_blocks=0), seed=456 ===
  n_params: 812900 (812900 trainable)
  [BP-shallow] ep 1: test_acc=0.1073
  [BP-shallow] ep 10: test_acc=0.1327
  [BP-shallow] ep 20: test_acc=0.1250
  [BP-shallow] ep 30: test_acc=0.1303
  [BP-shallow] ep 40: test_acc=0.1411
  [BP-shallow] ep 50: test_acc=0.1529
  [BP-shallow] ep 60: test_acc=0.1651
  [BP-shallow] ep 70: test_acc=0.1724
  [BP-shallow] ep 80: test_acc=0.1743
  [BP-shallow] ep 90: test_acc=0.1757
  [BP-shallow] ep 100: test_acc=0.1776
FINAL BP-shallow: 0.1776

=== BP frozen-blocks (ResMLP num_blocks=4, blocks frozen), seed=456 ===
  n_params: 1341284 (812900 trainable)
  [BP-frozen] ep 1: test_acc=0.1073
  [BP-frozen] ep 10: test_acc=0.1326
  [BP-frozen] ep 20: test_acc=0.1226
  [BP-frozen] ep 30: test_acc=0.1276
  [BP-frozen] ep 40: test_acc=0.1495
  [BP-frozen] ep 50: test_acc=0.1535
  [BP-frozen] ep 60: test_acc=0.1645
  [BP-frozen] ep 70: test_acc=0.1685
  [BP-frozen] ep 80: test_acc=0.1773
  [BP-frozen] ep 90: test_acc=0.1777
  [BP-frozen] ep 100: test_acc=0.1794
FINAL BP-frozen-blocks: 0.1794

=== DFA shallow (ResMLP num_blocks=0), seed=456 ===
  n_params: 812900 (812900 trainable)
  [DFA-shallow] ep 1: test_acc=0.0913
  [DFA-shallow] ep 10: test_acc=0.1150
  [DFA-shallow] ep 20: test_acc=0.1153
  [DFA-shallow] ep 30: test_acc=0.1161
  [DFA-shallow] ep 40: test_acc=0.1140
  [DFA-shallow] ep 50: test_acc=0.1202
  [DFA-shallow] ep 60: test_acc=0.1229
  [DFA-shallow] ep 70: test_acc=0.1243
  [DFA-shallow] ep 80: test_acc=0.1240
  [DFA-shallow] ep 90: test_acc=0.1237
  [DFA-shallow] ep 100: test_acc=0.1235
FINAL DFA-shallow: 0.1235

=== DFA frozen-blocks (ResMLP num_blocks=4, blocks frozen), seed=456 ===
  n_params: 1341284 (812900 trainable)
  [DFA-frozen] ep 1: test_acc=0.0862
  [DFA-frozen] ep 10: test_acc=0.1059
  [DFA-frozen] ep 20: test_acc=0.1130
  [DFA-frozen] ep 30: test_acc=0.1200
  [DFA-frozen] ep 40: test_acc=0.1227
  [DFA-frozen] ep 50: test_acc=0.1177
  [DFA-frozen] ep 60: test_acc=0.1229
  [DFA-frozen] ep 70: test_acc=0.1232
  [DFA-frozen] ep 80: test_acc=0.1240
  [DFA-frozen] ep 90: test_acc=0.1237
  [DFA-frozen] ep 100: test_acc=0.1236
FINAL DFA-frozen-blocks: 0.1236

=== ResMLP frozen/shallow baseline summary, seed=456 ===
  BP-shallow:    0.1776
  BP-frozen:     0.1794
  DFA-shallow:   0.1235
  DFA-frozen:    0.1236
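
Note: with all three seeds finished, the frozen and shallow configurations can be compared directly. The sketch below averages the per-seed values from the three summary blocks and reports the frozen-minus-shallow gap for each learning rule; the variable names are illustrative only.

```python
from statistics import mean

# Per-seed finals copied from the three summary blocks (seeds 42, 123, 456).
finals = {
    "BP-shallow":  [0.1787, 0.1756, 0.1776],
    "BP-frozen":   [0.1770, 0.1777, 0.1794],
    "DFA-shallow": [0.1255, 0.1269, 0.1235],
    "DFA-frozen":  [0.1256, 0.1257, 0.1236],
}
means = {name: mean(vals) for name, vals in finals.items()}

# Gap between frozen 4-block and 0-block models under each learning rule.
bp_gap = means["BP-frozen"] - means["BP-shallow"]
dfa_gap = means["DFA-frozen"] - means["DFA-shallow"]

for name, mu in means.items():
    print(f"{name:<12} {mu:.4f}")
print(f"BP  frozen - shallow: {bp_gap:+.4f}")
print(f"DFA frozen - shallow: {dfa_gap:+.4f}")
```

Both gaps are well under 0.002 in absolute value, so in these runs the frozen random blocks neither help nor hurt relative to having no blocks at all.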

=== CIFAR-100 VALIDATION DONE (Thu Apr 30 12:17:41 AM CDT 2026) ===