ViT-MINI: depth=4, d_model=128, n_heads=4, epochs=60, seed=456 === BP training (ViT-Mini) === n_params=809354 [BP-vit] Ep 0: ||h_L_cls||=6.653e+00 ||g_2||=1.039e-03 acc=0.0791 [BP-vit] Ep 1: ||h_L_cls||=2.995e+01 ||g_2||=1.710e-04 acc=0.4590 [BP-vit] Ep 5: ||h_L_cls||=3.062e+01 ||g_2||=1.845e-04 acc=0.6113 [BP-vit] Ep 10: ||h_L_cls||=3.123e+01 ||g_2||=1.730e-04 acc=0.6709 [BP-vit] Ep 15: ||h_L_cls||=2.697e+01 ||g_2||=1.889e-04 acc=0.7012 [BP-vit] Ep 20: ||h_L_cls||=2.387e+01 ||g_2||=1.801e-04 acc=0.7295 [BP-vit] Ep 25: ||h_L_cls||=2.080e+01 ||g_2||=1.754e-04 acc=0.7578 [BP-vit] Ep 30: ||h_L_cls||=1.773e+01 ||g_2||=1.553e-04 acc=0.7666 [BP-vit] Ep 35: ||h_L_cls||=1.553e+01 ||g_2||=1.576e-04 acc=0.7705 [BP-vit] Ep 40: ||h_L_cls||=1.420e+01 ||g_2||=1.193e-04 acc=0.7812 [BP-vit] Ep 45: ||h_L_cls||=1.271e+01 ||g_2||=9.615e-05 acc=0.7773 [BP-vit] Ep 50: ||h_L_cls||=1.230e+01 ||g_2||=7.114e-05 acc=0.8008 [BP-vit] Ep 55: ||h_L_cls||=1.201e+01 ||g_2||=6.104e-05 acc=0.7920 [BP-vit] Ep 60: ||h_L_cls||=1.197e+01 ||g_2||=5.866e-05 acc=0.7910 === DFA training (ViT-Mini, block-level DFA) === [DFA-vit] Ep 0: ||h_L_cls||=6.653e+00 ||g_2||=1.039e-03 acc=0.0791 [DFA-vit] Ep 1: ||h_L_cls||=6.750e+03 ||g_2||=9.114e-07 acc=0.2334 γ=0.0073 [DFA-vit] Ep 5: ||h_L_cls||=2.891e+05 ||g_2||=3.080e-08 acc=0.1963 γ=0.0068 [DFA-vit] Ep 10: ||h_L_cls||=1.709e+06 ||g_2||=4.513e-09 acc=0.1973 γ=0.0061 [DFA-vit] Ep 15: ||h_L_cls||=5.106e+06 ||g_2||=1.561e-09 acc=0.2363 γ=0.0023 [DFA-vit] Ep 20: ||h_L_cls||=1.160e+07 ||g_2||=6.526e-10 acc=0.2559 γ=0.0012 [DFA-vit] Ep 25: ||h_L_cls||=2.239e+07 ||g_2||=4.283e-10 acc=0.2568 γ=0.0006 [DFA-vit] Ep 30: ||h_L_cls||=3.290e+07 ||g_2||=3.264e-10 acc=0.2656 γ=0.0005 [DFA-vit] Ep 35: ||h_L_cls||=4.443e+07 ||g_2||=3.018e-10 acc=0.2354 γ=0.0008 [DFA-vit] Ep 40: ||h_L_cls||=5.315e+07 ||g_2||=2.841e-10 acc=0.2559 γ=0.0004 [DFA-vit] Ep 45: ||h_L_cls||=5.912e+07 ||g_2||=3.038e-10 acc=0.2441 γ=0.0004 [DFA-vit] Ep 50: ||h_L_cls||=6.210e+07 ||g_2||=3.006e-10 acc=0.2578 γ=0.0002 [DFA-vit] Ep 55: ||h_L_cls||=6.344e+07 ||g_2||=3.079e-10 acc=0.2529 γ=0.0001 [DFA-vit] Ep 60: ||h_L_cls||=6.367e+07 ||g_2||=3.069e-10 acc=0.2529 γ=0.0000 Saved results/snapshot_vit_v1/snapshot_vit_s456.json