ViT-MINI: depth=4, d_model=128, n_heads=4, epochs=60, seed=456
=== BP training (ViT-Mini) ===
n_params=809354
[BP-vit] Ep 0: ||h_L_cls||=6.653e+00 ||g_2||=1.039e-03 acc=0.0791
[BP-vit] Ep 1: ||h_L_cls||=2.995e+01 ||g_2||=1.710e-04 acc=0.4590
[BP-vit] Ep 5: ||h_L_cls||=3.062e+01 ||g_2||=1.845e-04 acc=0.6113
[BP-vit] Ep 10: ||h_L_cls||=3.123e+01 ||g_2||=1.730e-04 acc=0.6709
[BP-vit] Ep 15: ||h_L_cls||=2.697e+01 ||g_2||=1.889e-04 acc=0.7012
[BP-vit] Ep 20: ||h_L_cls||=2.387e+01 ||g_2||=1.801e-04 acc=0.7295
[BP-vit] Ep 25: ||h_L_cls||=2.080e+01 ||g_2||=1.754e-04 acc=0.7578
[BP-vit] Ep 30: ||h_L_cls||=1.773e+01 ||g_2||=1.553e-04 acc=0.7666
[BP-vit] Ep 35: ||h_L_cls||=1.553e+01 ||g_2||=1.576e-04 acc=0.7705
[BP-vit] Ep 40: ||h_L_cls||=1.420e+01 ||g_2||=1.193e-04 acc=0.7812
[BP-vit] Ep 45: ||h_L_cls||=1.271e+01 ||g_2||=9.615e-05 acc=0.7773
[BP-vit] Ep 50: ||h_L_cls||=1.230e+01 ||g_2||=7.114e-05 acc=0.8008
[BP-vit] Ep 55: ||h_L_cls||=1.201e+01 ||g_2||=6.104e-05 acc=0.7920
[BP-vit] Ep 60: ||h_L_cls||=1.197e+01 ||g_2||=5.866e-05 acc=0.7910
=== DFA training (ViT-Mini, block-level DFA) ===
[DFA-vit] Ep 0: ||h_L_cls||=6.653e+00 ||g_2||=1.039e-03 acc=0.0791
[DFA-vit] Ep 1: ||h_L_cls||=6.750e+03 ||g_2||=9.114e-07 acc=0.2334 γ=0.0073
[DFA-vit] Ep 5: ||h_L_cls||=2.891e+05 ||g_2||=3.080e-08 acc=0.1963 γ=0.0068
[DFA-vit] Ep 10: ||h_L_cls||=1.709e+06 ||g_2||=4.513e-09 acc=0.1973 γ=0.0061
[DFA-vit] Ep 15: ||h_L_cls||=5.106e+06 ||g_2||=1.561e-09 acc=0.2363 γ=0.0023
[DFA-vit] Ep 20: ||h_L_cls||=1.160e+07 ||g_2||=6.526e-10 acc=0.2559 γ=0.0012
[DFA-vit] Ep 25: ||h_L_cls||=2.239e+07 ||g_2||=4.283e-10 acc=0.2568 γ=0.0006
[DFA-vit] Ep 30: ||h_L_cls||=3.290e+07 ||g_2||=3.264e-10 acc=0.2656 γ=0.0005
[DFA-vit] Ep 35: ||h_L_cls||=4.443e+07 ||g_2||=3.018e-10 acc=0.2354 γ=0.0008
[DFA-vit] Ep 40: ||h_L_cls||=5.315e+07 ||g_2||=2.841e-10 acc=0.2559 γ=0.0004
[DFA-vit] Ep 45: ||h_L_cls||=5.912e+07 ||g_2||=3.038e-10 acc=0.2441 γ=0.0004
[DFA-vit] Ep 50: ||h_L_cls||=6.210e+07 ||g_2||=3.006e-10 acc=0.2578 γ=0.0002
[DFA-vit] Ep 55: ||h_L_cls||=6.344e+07 ||g_2||=3.079e-10 acc=0.2529 γ=0.0001
[DFA-vit] Ep 60: ||h_L_cls||=6.367e+07 ||g_2||=3.069e-10 acc=0.2529 γ=0.0000
Saved results/snapshot_vit_v1/snapshot_vit_s456.json