| author | YurenHao0426 <Blackhao0426@gmail.com> | 2026-06-14 20:32:31 -0500 |
|---|---|---|
| committer | YurenHao0426 <Blackhao0426@gmail.com> | 2026-06-14 20:32:31 -0500 |
| commit | 1118b7457c261de36ead6103503c00c321c75f9b (patch) | |
| tree | 7ea76b32f070cb58458caaa2897a5d8133561f48 /logs/depth_ladder.log | |
| parent | aa73718eb6427d7da3b9cb416275802d90c4b2ed (diff) | |
Appendix experiment triangulating the depth-utility diagnostic (D3) by varying
the number of trainable residual blocks k (last-k trainable, first L-k frozen at
init; embed/LN/head always trained).
- d=256 L=4 and d=512 L=2, 3 seeds, recipe identical to the main audit.
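- BP climbs monotonically with k (+22.8pp at d=256 L=4, +21.8pp at d=512 L=2);
DFA peaks at the frozen baseline (k=0) and stays below it once any block is
trained; FA is non-monotonic, with a partial net gain at d=256 L=4 (+4.7pp) and
essentially none at d=512 L=2 (+0.3pp).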
- Cross-checks reproduce existing anchors (BP 0.617, DFA 0.301, FA 0.402, frozen 0.349).
- frozen_init_identity_check quantifies frozen stack as a near-norm-preserving
random feature map (per-block ||f||/||h||~0.10, stack cos 0.981), explaining the
above-chance k=0 rung.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
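The last-k freezing scheme described in the commit message can be sketched as follows. This is a minimal sketch, not the repository's code: the function names are hypothetical, and the parameter arithmetic is inferred only from the `trainable params` values printed in logs/depth_ladder.log (for d=256 L=4: 789,770 with no trainable blocks, rising by 132,096 per additional block).

```python
def trainable_block_indices(num_blocks: int, k: int) -> list[int]:
    """Indices of the last k of num_blocks residual blocks; the rest stay frozen at init."""
    if not 0 <= k <= num_blocks:
        raise ValueError("k must be between 0 and num_blocks")
    return list(range(num_blocks - k, num_blocks))


def trainable_param_count(base: int, per_block: int, k: int) -> int:
    """Always-trained parameters (embed/LN/head) plus k trainable blocks."""
    return base + k * per_block


# Constants read off the d=256 L=4 log; assumed here, not derived from the model code.
BASE_D256 = 789_770       # trainable params at k=0
PER_BLOCK_D256 = 132_096  # difference between consecutive rungs

for k in range(5):
    blocks = trainable_block_indices(4, k)
    params = trainable_param_count(BASE_D256, PER_BLOCK_D256, k)
    print(f"k={k} trainable blocks: {blocks} trainable params: {params:,}")
```

The same arithmetic matches the d=512 L=2 ladder with a base of 1,579,530 and 526,336 parameters per block.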
Diffstat (limited to 'logs/depth_ladder.log')
| -rw-r--r-- | logs/depth_ladder.log | 1103 |
1 files changed, 1103 insertions, 0 deletions
diff --git a/logs/depth_ladder.log b/logs/depth_ladder.log
new file mode 100644
index 0000000..20af1ab
--- /dev/null
+++ b/logs/depth_ladder.log
@@ -0,0 +1,1103 @@
+[Sun Jun 14 11:29:47 AM CDT 2026] START primary d=256 L=4 ladder
+Device=cuda:0 ladder_d256_L4_cifar10 methods=['bp', 'fa', 'dfa'] k=[0, 1, 2, 3, 4] seeds=[42, 123, 456] epochs=100
+
+=== BP k=0 (last 0 of 4 trainable) seed=42 ===
+ trainable blocks: [] trainable params: 789,770
+ [BP k] ep 1: test=0.3543
+ [BP k] ep 10: test=0.3673
+ [BP k] ep 20: test=0.3483
+ [BP k] ep 30: test=0.3498
+ [BP k] ep 40: test=0.3608
+ [BP k] ep 50: test=0.3627
+ [BP k] ep 60: test=0.3697
+ [BP k] ep 70: test=0.3803
+ [BP k] ep 80: test=0.3821
+ [BP k] ep 90: test=0.3870
+ [BP k] ep 100: test=0.3882
+ FINAL bp k=0 seed=42: 0.3882
+
+=== BP k=0 (last 0 of 4 trainable) seed=123 ===
+ trainable blocks: [] trainable params: 789,770
+ [BP k] ep 1: test=0.3535
+ [BP k] ep 10: test=0.3654
+ [BP k] ep 20: test=0.3612
+ [BP k] ep 30: test=0.3586
+ [BP k] ep 40: test=0.3633
+ [BP k] ep 50: test=0.3608
+ [BP k] ep 60: test=0.3772
+ [BP k] ep 70: test=0.3791
+ [BP k] ep 80: test=0.3897
+ [BP k] ep 90: test=0.3884
+ [BP k] ep 100: test=0.3899
+ FINAL bp k=0 seed=123: 0.3899
+
+=== BP k=0 (last 0 of 4 trainable) seed=456 ===
+ trainable blocks: [] trainable params: 789,770
+ [BP k] ep 1: test=0.3551
+ [BP k] ep 10: test=0.3680
+ [BP k] ep 20: test=0.3509
+ [BP k] ep 30: test=0.3655
+ [BP k] ep 40: test=0.3573
+ [BP k] ep 50: test=0.3543
+ [BP k] ep 60: test=0.3716
+ [BP k] ep 70: test=0.3824
+ [BP k] ep 80: test=0.3852
+ [BP k] ep 90: test=0.3891
+ [BP k] ep 100: test=0.3878
+ FINAL bp k=0 seed=456: 0.3878
+
+=== BP k=1 (last 1 of 4 trainable) seed=42 ===
+ trainable blocks: [3] trainable params: 921,866
+ [BP k] ep 1: test=0.3736
+ [BP k] ep 10: test=0.4890
+ [BP k] ep 20: test=0.5089
+ [BP k] ep 30: test=0.5260
+ [BP k] ep 40: test=0.5365
+ [BP k] ep 50: test=0.5486
+ [BP k] ep 60: test=0.5524
+ [BP k] ep 70: test=0.5638
+ [BP k] ep 80: test=0.5666
+ [BP k] ep 90: test=0.5678
+ [BP k] ep 100: test=0.5683
+ FINAL bp k=1 seed=42: 0.5683
+
+=== BP k=1 (last 1 of 4 trainable) seed=123 ===
+ trainable blocks: [3] trainable params: 921,866
+ [BP k] ep 1: test=0.3878
+ [BP k] ep 10: test=0.4797
+ [BP k] ep 20: test=0.5096
+ [BP k] ep 30: test=0.5209
+ [BP k] ep 40: test=0.5280
+ [BP k] ep 50: test=0.5486
+ [BP k] ep 60: test=0.5530
+ [BP k] ep 70: test=0.5564
+ [BP k] ep 80: test=0.5609
+ [BP k] ep 90: test=0.5611
+ [BP k] ep 100: test=0.5623
+ FINAL bp k=1 seed=123: 0.5623
+
+=== BP k=1 (last 1 of 4 trainable) seed=456 ===
+ trainable blocks: [3] trainable params: 921,866
+ [BP k] ep 1: test=0.3772
+ [BP k] ep 10: test=0.4853
+ [BP k] ep 20: test=0.5098
+ [BP k] ep 30: test=0.5238
+ [BP k] ep 40: test=0.5387
+ [BP k] ep 50: test=0.5488
+ [BP k] ep 60: test=0.5547
+ [BP k] ep 70: test=0.5588
+ [BP k] ep 80: test=0.5636
+ [BP k] ep 90: test=0.5637
+ [BP k] ep 100: test=0.5643
+ FINAL bp k=1 seed=456: 0.5643
+
+=== BP k=2 (last 2 of 4 trainable) seed=42 ===
+ trainable blocks: [2, 3] trainable params: 1,053,962
+ [BP k] ep 1: test=0.3874
+ [BP k] ep 10: test=0.5157
+ [BP k] ep 20: test=0.5361
+ [BP k] ep 30: test=0.5600
+ [BP k] ep 40: test=0.5753
+ [BP k] ep 50: test=0.5802
+ [BP k] ep 60: test=0.5843
+ [BP k] ep 70: test=0.5965
+ [BP k] ep 80: test=0.5970
+ [BP k] ep 90: test=0.5979
+ [BP k] ep 100: test=0.5994
+ FINAL bp k=2 seed=42: 0.5994
+
+=== BP k=2 (last 2 of 4 trainable) seed=123 ===
+ trainable blocks: [2, 3] trainable params: 1,053,962
+ [BP k] ep 1: test=0.3925
+ [BP k] ep 10: test=0.5148
+ [BP k] ep 20: test=0.5376
+ [BP k] ep 30: test=0.5638
+ [BP k] ep 40: test=0.5693
+ [BP k] ep 50: test=0.5784
+ [BP k] ep 60: test=0.5927
+ [BP k] ep 70: test=0.5911
+ [BP k] ep 80: test=0.5973
+ [BP k] ep 90: test=0.5986
+ [BP k] ep 100: test=0.6000
+ FINAL bp k=2 seed=123: 0.6000
+
+=== BP k=2 (last 2 of 4 trainable) seed=456 ===
+ trainable blocks: [2, 3] trainable params: 1,053,962
+ [BP k] ep 1: test=0.3868
+ [BP k] ep 10: test=0.5103
+ [BP k] ep 20: test=0.5420
+ [BP k] ep 30: test=0.5610
+ [BP k] ep 40: test=0.5699
+ [BP k] ep 50: test=0.5789
+ [BP k] ep 60: test=0.5809
+ [BP k] ep 70: test=0.5844
+ [BP k] ep 80: test=0.5919
+ [BP k] ep 90: test=0.5919
+ [BP k] ep 100: test=0.5939
+ FINAL bp k=2 seed=456: 0.5939
+
+=== BP k=3 (last 3 of 4 trainable) seed=42 ===
+ trainable blocks: [1, 2, 3] trainable params: 1,186,058
+ [BP k] ep 1: test=0.3904
+ [BP k] ep 10: test=0.5218
+ [BP k] ep 20: test=0.5469
+ [BP k] ep 30: test=0.5749
+ [BP k] ep 40: test=0.5935
+ [BP k] ep 50: test=0.5950
+ [BP k] ep 60: test=0.5983
+ [BP k] ep 70: test=0.6015
+ [BP k] ep 80: test=0.6070
+ [BP k] ep 90: test=0.6057
+ [BP k] ep 100: test=0.6079
+ FINAL bp k=3 seed=42: 0.6079
+
+=== BP k=3 (last 3 of 4 trainable) seed=123 ===
+ trainable blocks: [1, 2, 3] trainable params: 1,186,058
+ [BP k] ep 1: test=0.3965
+ [BP k] ep 10: test=0.5240
+ [BP k] ep 20: test=0.5517
+ [BP k] ep 30: test=0.5747
+ [BP k] ep 40: test=0.5774
+ [BP k] ep 50: test=0.5927
+ [BP k] ep 60: test=0.6035
+ [BP k] ep 70: test=0.6030
+ [BP k] ep 80: test=0.6057
+ [BP k] ep 90: test=0.6073
+ [BP k] ep 100: test=0.6069
+ FINAL bp k=3 seed=123: 0.6069
+
+=== BP k=3 (last 3 of 4 trainable) seed=456 ===
+ trainable blocks: [1, 2, 3] trainable params: 1,186,058
+ [BP k] ep 1: test=0.3947
+ [BP k] ep 10: test=0.5148
+ [BP k] ep 20: test=0.5536
+ [BP k] ep 30: test=0.5723
+ [BP k] ep 40: test=0.5873
+ [BP k] ep 50: test=0.5861
+ [BP k] ep 60: test=0.5991
+ [BP k] ep 70: test=0.5989
+ [BP k] ep 80: test=0.6062
+ [BP k] ep 90: test=0.6093
+ [BP k] ep 100: test=0.6080
+ FINAL bp k=3 seed=456: 0.6080
+
+=== BP k=4 (last 4 of 4 trainable) seed=42 ===
+ trainable blocks: [0, 1, 2, 3] trainable params: 1,318,154
+ [BP k] ep 1: test=0.3936
+ [BP k] ep 10: test=0.5235
+ [BP k] ep 20: test=0.5606
+ [BP k] ep 30: test=0.5794
+ [BP k] ep 40: test=0.5992
+ [BP k] ep 50: test=0.6044
+ [BP k] ep 60: test=0.5979
+ [BP k] ep 70: test=0.6115
+ [BP k] ep 80: test=0.6153
+ [BP k] ep 90: test=0.6177
+ [BP k] ep 100: test=0.6173
+ FINAL bp k=4 seed=42: 0.6173
+
+=== BP k=4 (last 4 of 4 trainable) seed=123 ===
+ trainable blocks: [0, 1, 2, 3] trainable params: 1,318,154
+ [BP k] ep 1: test=0.3981
+ [BP k] ep 10: test=0.5257
+ [BP k] ep 20: test=0.5580
+ [BP k] ep 30: test=0.5779
+ [BP k] ep 40: test=0.5896
+ [BP k] ep 50: test=0.6023
+ [BP k] ep 60: test=0.6053
+ [BP k] ep 70: test=0.6081
+ [BP k] ep 80: test=0.6185
+ [BP k] ep 90: test=0.6174
+ [BP k] ep 100: test=0.6182
+ FINAL bp k=4 seed=123: 0.6182
+
+=== BP k=4 (last 4 of 4 trainable) seed=456 ===
+ trainable blocks: [0, 1, 2, 3] trainable params: 1,318,154
+ [BP k] ep 1: test=0.3967
+ [BP k] ep 10: test=0.5255
+ [BP k] ep 20: test=0.5632
+ [BP k] ep 30: test=0.5747
+ [BP k] ep 40: test=0.5948
+ [BP k] ep 50: test=0.5954
+ [BP k] ep 60: test=0.6092
+ [BP k] ep 70: test=0.6140
+ [BP k] ep 80: test=0.6125
+ [BP k] ep 90: test=0.6145
+ [BP k] ep 100: test=0.6145
+ FINAL bp k=4 seed=456: 0.6145
+
+=== FA k=0 (last 0 of 4 trainable) seed=42 ===
+ trainable blocks: [] trainable params: 789,770
+ [FA k] ep 1: test=0.3112
+ [FA k] ep 10: test=0.3389
+ [FA k] ep 20: test=0.3325
+ [FA k] ep 30: test=0.3495
+ [FA k] ep 40: test=0.3467
+ [FA k] ep 50: test=0.3465
+ [FA k] ep 60: test=0.3573
+ [FA k] ep 70: test=0.3542
+ [FA k] ep 80: test=0.3567
+ [FA k] ep 90: test=0.3554
+ [FA k] ep 100: test=0.3555
+ FINAL fa k=0 seed=42: 0.3555
+
+=== FA k=0 (last 0 of 4 trainable) seed=123 ===
+ trainable blocks: [] trainable params: 789,770
+ [FA k] ep 1: test=0.3257
+ [FA k] ep 10: test=0.3409
+ [FA k] ep 20: test=0.3514
+ [FA k] ep 30: test=0.3357
+ [FA k] ep 40: test=0.3299
+ [FA k] ep 50: test=0.3495
+ [FA k] ep 60: test=0.3468
+ [FA k] ep 70: test=0.3548
+ [FA k] ep 80: test=0.3509
+ [FA k] ep 90: test=0.3536
+ [FA k] ep 100: test=0.3520
+ FINAL fa k=0 seed=123: 0.3520
+
+=== FA k=0 (last 0 of 4 trainable) seed=456 ===
+ trainable blocks: [] trainable params: 789,770
+ [FA k] ep 1: test=0.3172
+ [FA k] ep 10: test=0.3374
+ [FA k] ep 20: test=0.3452
+ [FA k] ep 30: test=0.3431
+ [FA k] ep 40: test=0.3468
+ [FA k] ep 50: test=0.3563
+ [FA k] ep 60: test=0.3523
+ [FA k] ep 70: test=0.3578
+ [FA k] ep 80: test=0.3568
+ [FA k] ep 90: test=0.3576
+ [FA k] ep 100: test=0.3578
+ FINAL fa k=0 seed=456: 0.3578
+
+=== FA k=1 (last 1 of 4 trainable) seed=42 ===
+ trainable blocks: [3] trainable params: 921,866
+ [FA k] ep 1: test=0.2886
+ [FA k] ep 10: test=0.3301
+ [FA k] ep 20: test=0.3604
+ [FA k] ep 30: test=0.3595
+ [FA k] ep 40: test=0.3678
+ [FA k] ep 50: test=0.3779
+ [FA k] ep 60: test=0.3727
+ [FA k] ep 70: test=0.3810
+ [FA k] ep 80: test=0.3810
+ [FA k] ep 90: test=0.3821
+ [FA k] ep 100: test=0.3819
+ FINAL fa k=1 seed=42: 0.3819
+
+=== FA k=1 (last 1 of 4 trainable) seed=123 ===
+ trainable blocks: [3] trainable params: 921,866
+ [FA k] ep 1: test=0.3105
+ [FA k] ep 10: test=0.3472
+ [FA k] ep 20: test=0.3444
+ [FA k] ep 30: test=0.3604
+ [FA k] ep 40: test=0.3615
+ [FA k] ep 50: test=0.3568
+ [FA k] ep 60: test=0.3708
+ [FA k] ep 70: test=0.3723
+ [FA k] ep 80: test=0.3749
+ [FA k] ep 90: test=0.3736
+ [FA k] ep 100: test=0.3742
+ FINAL fa k=1 seed=123: 0.3742
+
+=== FA k=1 (last 1 of 4 trainable) seed=456 ===
+ trainable blocks: [3] trainable params: 921,866
+ [FA k] ep 1: test=0.2975
+ [FA k] ep 10: test=0.3481
+ [FA k] ep 20: test=0.3454
+ [FA k] ep 30: test=0.3683
+ [FA k] ep 40: test=0.3618
+ [FA k] ep 50: test=0.3675
+ [FA k] ep 60: test=0.3826
+ [FA k] ep 70: test=0.3867
+ [FA k] ep 80: test=0.3863
+ [FA k] ep 90: test=0.3899
+ [FA k] ep 100: test=0.3898
+ FINAL fa k=1 seed=456: 0.3898
+
+=== FA k=2 (last 2 of 4 trainable) seed=42 ===
+ trainable blocks: [2, 3] trainable params: 1,053,962
+ [FA k] ep 1: test=0.2657
+ [FA k] ep 10: test=0.3431
+ [FA k] ep 20: test=0.3494
+ [FA k] ep 30: test=0.3436
+ [FA k] ep 40: test=0.3574
+ [FA k] ep 50: test=0.3388
+ [FA k] ep 60: test=0.3426
+ [FA k] ep 70: test=0.3341
+ [FA k] ep 80: test=0.3303
+ [FA k] ep 90: test=0.3310
+ [FA k] ep 100: test=0.3305
+ FINAL fa k=2 seed=42: 0.3305
+
+=== FA k=2 (last 2 of 4 trainable) seed=123 ===
+ trainable blocks: [2, 3] trainable params: 1,053,962
+ [FA k] ep 1: test=0.2982
+ [FA k] ep 10: test=0.3524
+ [FA k] ep 20: test=0.3694
+ [FA k] ep 30: test=0.3691
+ [FA k] ep 40: test=0.3703
+ [FA k] ep 50: test=0.3605
+ [FA k] ep 60: test=0.3546
+ [FA k] ep 70: test=0.3547
+ [FA k] ep 80: test=0.3651
+ [FA k] ep 90: test=0.3565
+ [FA k] ep 100: test=0.3607
+ FINAL fa k=2 seed=123: 0.3607
+
+=== FA k=2 (last 2 of 4 trainable) seed=456 ===
+ trainable blocks: [2, 3] trainable params: 1,053,962
+ [FA k] ep 1: test=0.2753
+ [FA k] ep 10: test=0.3386
+ [FA k] ep 20: test=0.3495
+ [FA k] ep 30: test=0.3458
+ [FA k] ep 40: test=0.3374
+ [FA k] ep 50: test=0.3333
+ [FA k] ep 60: test=0.3523
+ [FA k] ep 70: test=0.3538
+ [FA k] ep 80: test=0.3519
+ [FA k] ep 90: test=0.3555
+ [FA k] ep 100: test=0.3548
+ FINAL fa k=2 seed=456: 0.3548
+
+=== FA k=3 (last 3 of 4 trainable) seed=42 ===
+ trainable blocks: [1, 2, 3] trainable params: 1,186,058
+ [FA k] ep 1: test=0.2770
+ [FA k] ep 10: test=0.3554
+ [FA k] ep 20: test=0.3681
+ [FA k] ep 30: test=0.3841
+ [FA k] ep 40: test=0.3829
+ [FA k] ep 50: test=0.3847
+ [FA k] ep 60: test=0.3885
+ [FA k] ep 70: test=0.3956
+ [FA k] ep 80: test=0.3947
+ [FA k] ep 90: test=0.3916
+ [FA k] ep 100: test=0.3930
+ FINAL fa k=3 seed=42: 0.3930
+
+=== FA k=3 (last 3 of 4 trainable) seed=123 ===
+ trainable blocks: [1, 2, 3] trainable params: 1,186,058
+ [FA k] ep 1: test=0.2905
+ [FA k] ep 10: test=0.3495
+ [FA k] ep 20: test=0.3804
+ [FA k] ep 30: test=0.3820
+ [FA k] ep 40: test=0.3885
+ [FA k] ep 50: test=0.3950
+ [FA k] ep 60: test=0.3971
+ [FA k] ep 70: test=0.4049
+ [FA k] ep 80: test=0.4047
+ [FA k] ep 90: test=0.4075
+ [FA k] ep 100: test=0.4074
+ FINAL fa k=3 seed=123: 0.4074
+
+=== FA k=3 (last 3 of 4 trainable) seed=456 ===
+ trainable blocks: [1, 2, 3] trainable params: 1,186,058
+ [FA k] ep 1: test=0.2708
+ [FA k] ep 10: test=0.3511
+ [FA k] ep 20: test=0.3662
+ [FA k] ep 30: test=0.3755
+ [FA k] ep 40: test=0.3818
+ [FA k] ep 50: test=0.3828
+ [FA k] ep 60: test=0.3966
+ [FA k] ep 70: test=0.3939
+ [FA k] ep 80: test=0.3928
+ [FA k] ep 90: test=0.3933
+ [FA k] ep 100: test=0.3946
+ FINAL fa k=3 seed=456: 0.3946
+
+=== FA k=4 (last 4 of 4 trainable) seed=42 ===
+ trainable blocks: [0, 1, 2, 3] trainable params: 1,318,154
+ [FA k] ep 1: test=0.2789
+ [FA k] ep 10: test=0.3498
+ [FA k] ep 20: test=0.3601
+ [FA k] ep 30: test=0.3710
+ [FA k] ep 40: test=0.3834
+ [FA k] ep 50: test=0.3923
+ [FA k] ep 60: test=0.3912
+ [FA k] ep 70: test=0.3945
+ [FA k] ep 80: test=0.3957
+ [FA k] ep 90: test=0.3944
+ [FA k] ep 100: test=0.3959
+ FINAL fa k=4 seed=42: 0.3959
+
+=== FA k=4 (last 4 of 4 trainable) seed=123 ===
+ trainable blocks: [0, 1, 2, 3] trainable params: 1,318,154
+ [FA k] ep 1: test=0.2905
+ [FA k] ep 10: test=0.3596
+ [FA k] ep 20: test=0.3803
+ [FA k] ep 30: test=0.3792
+ [FA k] ep 40: test=0.3955
+ [FA k] ep 50: test=0.3980
+ [FA k] ep 60: test=0.4071
+ [FA k] ep 70: test=0.4034
+ [FA k] ep 80: test=0.4076
+ [FA k] ep 90: test=0.4115
+ [FA k] ep 100: test=0.4122
+ FINAL fa k=4 seed=123: 0.4122
+
+=== FA k=4 (last 4 of 4 trainable) seed=456 ===
+ trainable blocks: [0, 1, 2, 3] trainable params: 1,318,154
+ [FA k] ep 1: test=0.2713
+ [FA k] ep 10: test=0.3544
+ [FA k] ep 20: test=0.3702
+ [FA k] ep 30: test=0.3799
+ [FA k] ep 40: test=0.3845
+ [FA k] ep 50: test=0.3923
+ [FA k] ep 60: test=0.3992
+ [FA k] ep 70: test=0.3974
+ [FA k] ep 80: test=0.3990
+ [FA k] ep 90: test=0.4000
+ [FA k] ep 100: test=0.3987
+ FINAL fa k=4 seed=456: 0.3987
+
+=== DFA k=0 (last 0 of 4 trainable) seed=42 ===
+ trainable blocks: [] trainable params: 789,770
+ [DFA k] ep 1: test=0.3185
+ [DFA k] ep 10: test=0.3370
+ [DFA k] ep 20: test=0.3458
+ [DFA k] ep 30: test=0.3425
+ [DFA k] ep 40: test=0.3419
+ [DFA k] ep 50: test=0.3425
+ [DFA k] ep 60: test=0.3420
+ [DFA k] ep 70: test=0.3466
+ [DFA k] ep 80: test=0.3458
+ [DFA k] ep 90: test=0.3470
+ [DFA k] ep 100: test=0.3454
+ FINAL dfa k=0 seed=42: 0.3454
+
+=== DFA k=0 (last 0 of 4 trainable) seed=123 ===
+ trainable blocks: [] trainable params: 789,770
+ [DFA k] ep 1: test=0.3219
+ [DFA k] ep 10: test=0.3339
+ [DFA k] ep 20: test=0.3453
+ [DFA k] ep 30: test=0.3352
+ [DFA k] ep 40: test=0.3322
+ [DFA k] ep 50: test=0.3291
+ [DFA k] ep 60: test=0.3428
+ [DFA k] ep 70: test=0.3447
+ [DFA k] ep 80: test=0.3465
+ [DFA k] ep 90: test=0.3464
+ [DFA k] ep 100: test=0.3498
+ FINAL dfa k=0 seed=123: 0.3498
+
+=== DFA k=0 (last 0 of 4 trainable) seed=456 ===
+ trainable blocks: [] trainable params: 789,770
+ [DFA k] ep 1: test=0.3241
+ [DFA k] ep 10: test=0.3486
+ [DFA k] ep 20: test=0.3396
+ [DFA k] ep 30: test=0.3396
+ [DFA k] ep 40: test=0.3387
+ [DFA k] ep 50: test=0.3456
+ [DFA k] ep 60: test=0.3508
+ [DFA k] ep 70: test=0.3527
+ [DFA k] ep 80: test=0.3498
+ [DFA k] ep 90: test=0.3508
+ [DFA k] ep 100: test=0.3516
+ FINAL dfa k=0 seed=456: 0.3516
+
+=== DFA k=1 (last 1 of 4 trainable) seed=42 ===
+ trainable blocks: [3] trainable params: 921,866
+ [DFA k] ep 1: test=0.2563
+ [DFA k] ep 10: test=0.2580
+ [DFA k] ep 20: test=0.2445
+ [DFA k] ep 30: test=0.2197
+ [DFA k] ep 40: test=0.2229
+ [DFA k] ep 50: test=0.1952
+ [DFA k] ep 60: test=0.2306
+ [DFA k] ep 70: test=0.2290
+ [DFA k] ep 80: test=0.2211
+ [DFA k] ep 90: test=0.2215
+ [DFA k] ep 100: test=0.2267
+ FINAL dfa k=1 seed=42: 0.2267
+
+=== DFA k=1 (last 1 of 4 trainable) seed=123 ===
+ trainable blocks: [3] trainable params: 921,866
+ [DFA k] ep 1: test=0.2549
+ [DFA k] ep 10: test=0.2505
+ [DFA k] ep 20: test=0.2453
+ [DFA k] ep 30: test=0.2358
+ [DFA k] ep 40: test=0.2499
+ [DFA k] ep 50: test=0.2506
+ [DFA k] ep 60: test=0.2467
+ [DFA k] ep 70: test=0.2513
+ [DFA k] ep 80: test=0.2597
+ [DFA k] ep 90: test=0.2586
+ [DFA k] ep 100: test=0.2563
+ FINAL dfa k=1 seed=123: 0.2563
+
+=== DFA k=1 (last 1 of 4 trainable) seed=456 ===
+ trainable blocks: [3] trainable params: 921,866
+ [DFA k] ep 1: test=0.2112
+ [DFA k] ep 10: test=0.2227
+ [DFA k] ep 20: test=0.2397
+ [DFA k] ep 30: test=0.2326
+ [DFA k] ep 40: test=0.2285
+ [DFA k] ep 50: test=0.2176
+ [DFA k] ep 60: test=0.2431
+ [DFA k] ep 70: test=0.2476
+ [DFA k] ep 80: test=0.2493
+ [DFA k] ep 90: test=0.2477
+ [DFA k] ep 100: test=0.2476
+ FINAL dfa k=1 seed=456: 0.2476
+
+=== DFA k=2 (last 2 of 4 trainable) seed=42 ===
+ trainable blocks: [2, 3] trainable params: 1,053,962
+ [DFA k] ep 1: test=0.2792
+ [DFA k] ep 10: test=0.2893
+ [DFA k] ep 20: test=0.2978
+ [DFA k] ep 30: test=0.2960
+ [DFA k] ep 40: test=0.3010
+ [DFA k] ep 50: test=0.3014
+ [DFA k] ep 60: test=0.3005
+ [DFA k] ep 70: test=0.3036
+ [DFA k] ep 80: test=0.2997
+ [DFA k] ep 90: test=0.3005
+ [DFA k] ep 100: test=0.3005
+ FINAL dfa k=2 seed=42: 0.3005
+
+=== DFA k=2 (last 2 of 4 trainable) seed=123 ===
+ trainable blocks: [2, 3] trainable params: 1,053,962
+ [DFA k] ep 1: test=0.2671
+ [DFA k] ep 10: test=0.2947
+ [DFA k] ep 20: test=0.2841
+ [DFA k] ep 30: test=0.2801
+ [DFA k] ep 40: test=0.2819
+ [DFA k] ep 50: test=0.2772
+ [DFA k] ep 60: test=0.2834
+ [DFA k] ep 70: test=0.2876
+ [DFA k] ep 80: test=0.2757
+ [DFA k] ep 90: test=0.2806
+ [DFA k] ep 100: test=0.2819
+ FINAL dfa k=2 seed=123: 0.2819
+
+=== DFA k=2 (last 2 of 4 trainable) seed=456 ===
+ trainable blocks: [2, 3] trainable params: 1,053,962
+ [DFA k] ep 1: test=0.2604
+ [DFA k] ep 10: test=0.2821
+ [DFA k] ep 20: test=0.2784
+ [DFA k] ep 30: test=0.2826
+ [DFA k] ep 40: test=0.2805
+ [DFA k] ep 50: test=0.2675
+ [DFA k] ep 60: test=0.2735
+ [DFA k] ep 70: test=0.2765
+ [DFA k] ep 80: test=0.2735
+ [DFA k] ep 90: test=0.2759
+ [DFA k] ep 100: test=0.2751
+ FINAL dfa k=2 seed=456: 0.2751
+
+=== DFA k=3 (last 3 of 4 trainable) seed=42 ===
+ trainable blocks: [1, 2, 3] trainable params: 1,186,058
+ [DFA k] ep 1: test=0.2821
+ [DFA k] ep 10: test=0.2882
+ [DFA k] ep 20: test=0.2921
+ [DFA k] ep 30: test=0.3064
+ [DFA k] ep 40: test=0.3009
+ [DFA k] ep 50: test=0.3044
+ [DFA k] ep 60: test=0.3041
+ [DFA k] ep 70: test=0.3075
+ [DFA k] ep 80: test=0.3064
+ [DFA k] ep 90: test=0.3021
+ [DFA k] ep 100: test=0.3047
+ FINAL dfa k=3 seed=42: 0.3047
+
+=== DFA k=3 (last 3 of 4 trainable) seed=123 ===
+ trainable blocks: [1, 2, 3] trainable params: 1,186,058
+ [DFA k] ep 1: test=0.2630
+ [DFA k] ep 10: test=0.2910
+ [DFA k] ep 20: test=0.2845
+ [DFA k] ep 30: test=0.2821
+ [DFA k] ep 40: test=0.2900
+ [DFA k] ep 50: test=0.2811
+ [DFA k] ep 60: test=0.2860
+ [DFA k] ep 70: test=0.2910
+ [DFA k] ep 80: test=0.2879
+ [DFA k] ep 90: test=0.2910
+ [DFA k] ep 100: test=0.2906
+ FINAL dfa k=3 seed=123: 0.2906
+
+=== DFA k=3 (last 3 of 4 trainable) seed=456 ===
+ trainable blocks: [1, 2, 3] trainable params: 1,186,058
+ [DFA k] ep 1: test=0.2544
+ [DFA k] ep 10: test=0.2841
+ [DFA k] ep 20: test=0.2892
+ [DFA k] ep 30: test=0.2998
+ [DFA k] ep 40: test=0.2891
+ [DFA k] ep 50: test=0.2844
+ [DFA k] ep 60: test=0.2938
+ [DFA k] ep 70: test=0.2928
+ [DFA k] ep 80: test=0.2901
+ [DFA k] ep 90: test=0.2932
+ [DFA k] ep 100: test=0.2919
+ FINAL dfa k=3 seed=456: 0.2919
+
+=== DFA k=4 (last 4 of 4 trainable) seed=42 ===
+ trainable blocks: [0, 1, 2, 3] trainable params: 1,318,154
+ [DFA k] ep 1: test=0.2899
+ [DFA k] ep 10: test=0.2873
+ [DFA k] ep 20: test=0.3016
+ [DFA k] ep 30: test=0.3053
+ [DFA k] ep 40: test=0.3120
+ [DFA k] ep 50: test=0.3045
+ [DFA k] ep 60: test=0.3071
+ [DFA k] ep 70: test=0.3102
+ [DFA k] ep 80: test=0.3080
+ [DFA k] ep 90: test=0.3066
+ [DFA k] ep 100: test=0.3068
+ FINAL dfa k=4 seed=42: 0.3068
+
+=== DFA k=4 (last 4 of 4 trainable) seed=123 ===
+ trainable blocks: [0, 1, 2, 3] trainable params: 1,318,154
+ [DFA k] ep 1: test=0.2683
+ [DFA k] ep 10: test=0.2926
+ [DFA k] ep 20: test=0.2861
+ [DFA k] ep 30: test=0.2875
+ [DFA k] ep 40: test=0.2978
+ [DFA k] ep 50: test=0.2910
+ [DFA k] ep 60: test=0.2972
+ [DFA k] ep 70: test=0.3011
+ [DFA k] ep 80: test=0.2974
+ [DFA k] ep 90: test=0.3015
+ [DFA k] ep 100: test=0.3023
+ FINAL dfa k=4 seed=123: 0.3023
+
+=== DFA k=4 (last 4 of 4 trainable) seed=456 ===
+ trainable blocks: [0, 1, 2, 3] trainable params: 1,318,154
+ [DFA k] ep 1: test=0.2591
+ [DFA k] ep 10: test=0.2883
+ [DFA k] ep 20: test=0.2948
+ [DFA k] ep 30: test=0.2995
+ [DFA k] ep 40: test=0.2921
+ [DFA k] ep 50: test=0.2956
+ [DFA k] ep 60: test=0.2960
+ [DFA k] ep 70: test=0.2943
+ [DFA k] ep 80: test=0.2910
+ [DFA k] ep 90: test=0.2955
+ [DFA k] ep 100: test=0.2949
+ FINAL dfa k=4 seed=456: 0.2949
+
+============================================================
+SUMMARY ladder_d256_L4_cifar10 (mean ± ddof-1 std over seeds)
+============================================================
+ BP k=0: 0.3886±0.0011 k=1: 0.5650±0.0031 k=2: 0.5978±0.0034 k=3: 0.6076±0.0006 k=4: 0.6167±0.0019
+ FA k=0: 0.3551±0.0029 k=1: 0.3820±0.0078 k=2: 0.3487±0.0160 k=3: 0.3983±0.0079 k=4: 0.4023±0.0087
+ DFA k=0: 0.3489±0.0032 k=1: 0.2435±0.0152 k=2: 0.2858±0.0131 k=3: 0.2957±0.0078 k=4: 0.3013±0.0060
+
+Saved -> results/depth_ladder/ladder_d256_L4_cifar10.json
+[Sun Jun 14 03:26:20 PM CDT 2026] START secondary d=512 L=2 FA-failure ladder
+Device=cuda:0 ladder_d512_L2_cifar10 methods=['bp', 'fa', 'dfa'] k=[0, 1, 2] seeds=[42, 123, 456] epochs=100
+
+=== BP k=0 (last 0 of 2 trainable) seed=42 ===
+ trainable blocks: [] trainable params: 1,579,530
+ [BP k] ep 1: test=0.3462
+ [BP k] ep 10: test=0.3633
+ [BP k] ep 20: test=0.3635
+ [BP k] ep 30: test=0.3543
+ [BP k] ep 40: test=0.3673
+ [BP k] ep 50: test=0.3633
+ [BP k] ep 60: test=0.3695
+ [BP k] ep 70: test=0.3753
+ [BP k] ep 80: test=0.3858
+ [BP k] ep 90: test=0.3887
+ [BP k] ep 100: test=0.3891
+ FINAL bp k=0 seed=42: 0.3891
+
+=== BP k=0 (last 0 of 2 trainable) seed=123 ===
+ trainable blocks: [] trainable params: 1,579,530
+ [BP k] ep 1: test=0.3497
+ [BP k] ep 10: test=0.3704
+ [BP k] ep 20: test=0.3698
+ [BP k] ep 30: test=0.3540
+ [BP k] ep 40: test=0.3505
+ [BP k] ep 50: test=0.3634
+ [BP k] ep 60: test=0.3675
+ [BP k] ep 70: test=0.3739
+ [BP k] ep 80: test=0.3823
+ [BP k] ep 90: test=0.3845
+ [BP k] ep 100: test=0.3846
+ FINAL bp k=0 seed=123: 0.3846
+
+=== BP k=0 (last 0 of 2 trainable) seed=456 ===
+ trainable blocks: [] trainable params: 1,579,530
+ [BP k] ep 1: test=0.3409
+ [BP k] ep 10: test=0.3578
+ [BP k] ep 20: test=0.3767
+ [BP k] ep 30: test=0.3607
+ [BP k] ep 40: test=0.3551
+ [BP k] ep 50: test=0.3632
+ [BP k] ep 60: test=0.3722
+ [BP k] ep 70: test=0.3704
+ [BP k] ep 80: test=0.3784
+ [BP k] ep 90: test=0.3834
+ [BP k] ep 100: test=0.3838
+ FINAL bp k=0 seed=456: 0.3838
+
+=== BP k=1 (last 1 of 2 trainable) seed=42 ===
+ trainable blocks: [1] trainable params: 2,105,866
+ [BP k] ep 1: test=0.3667
+ [BP k] ep 10: test=0.4836
+ [BP k] ep 20: test=0.5197
+ [BP k] ep 30: test=0.5367
+ [BP k] ep 40: test=0.5444
+ [BP k] ep 50: test=0.5629
+ [BP k] ep 60: test=0.5691
+ [BP k] ep 70: test=0.5779
+ [BP k] ep 80: test=0.5808
+ [BP k] ep 90: test=0.5849
+ [BP k] ep 100: test=0.5856
+ FINAL bp k=1 seed=42: 0.5856
+
+=== BP k=1 (last 1 of 2 trainable) seed=123 ===
+ trainable blocks: [1] trainable params: 2,105,866
+ [BP k] ep 1: test=0.3632
+ [BP k] ep 10: test=0.4865
+ [BP k] ep 20: test=0.5175
+ [BP k] ep 30: test=0.5360
+ [BP k] ep 40: test=0.5466
+ [BP k] ep 50: test=0.5606
+ [BP k] ep 60: test=0.5716
+ [BP k] ep 70: test=0.5749
+ [BP k] ep 80: test=0.5806
+ [BP k] ep 90: test=0.5817
+ [BP k] ep 100: test=0.5819
+ FINAL bp k=1 seed=123: 0.5819
+
+=== BP k=1 (last 1 of 2 trainable) seed=456 ===
+ trainable blocks: [1] trainable params: 2,105,866
+ [BP k] ep 1: test=0.3696
+ [BP k] ep 10: test=0.4737
+ [BP k] ep 20: test=0.5199
+ [BP k] ep 30: test=0.5317
+ [BP k] ep 40: test=0.5498
+ [BP k] ep 50: test=0.5610
+ [BP k] ep 60: test=0.5675
+ [BP k] ep 70: test=0.5767
+ [BP k] ep 80: test=0.5785
+ [BP k] ep 90: test=0.5802
+ [BP k] ep 100: test=0.5809
+ FINAL bp k=1 seed=456: 0.5809
+
+=== BP k=2 (last 2 of 2 trainable) seed=42 ===
+ trainable blocks: [0, 1] trainable params: 2,632,202
+ [BP k] ep 1: test=0.3790
+ [BP k] ep 10: test=0.5174
+ [BP k] ep 20: test=0.5471
+ [BP k] ep 30: test=0.5712
+ [BP k] ep 40: test=0.5906
+ [BP k] ep 50: test=0.5969
+ [BP k] ep 60: test=0.5977
+ [BP k] ep 70: test=0.5992
+ [BP k] ep 80: test=0.6072
+ [BP k] ep 90: test=0.6037
+ [BP k] ep 100: test=0.6039
+ FINAL bp k=2 seed=42: 0.6039
+
+=== BP k=2 (last 2 of 2 trainable) seed=123 ===
+ trainable blocks: [0, 1] trainable params: 2,632,202
+ [BP k] ep 1: test=0.3732
+ [BP k] ep 10: test=0.5161
+ [BP k] ep 20: test=0.5554
+ [BP k] ep 30: test=0.5756
+ [BP k] ep 40: test=0.5811
+ [BP k] ep 50: test=0.5928
+ [BP k] ep 60: test=0.5965
+ [BP k] ep 70: test=0.6016
+ [BP k] ep 80: test=0.6027
+ [BP k] ep 90: test=0.6007
+ [BP k] ep 100: test=0.6020
+ FINAL bp k=2 seed=123: 0.6020
+
+=== BP k=2 (last 2 of 2 trainable) seed=456 ===
+ trainable blocks: [0, 1] trainable params: 2,632,202
+ [BP k] ep 1: test=0.3768
+ [BP k] ep 10: test=0.5097
+ [BP k] ep 20: test=0.5499
+ [BP k] ep 30: test=0.5773
+ [BP k] ep 40: test=0.5858
+ [BP k] ep 50: test=0.5845
+ [BP k] ep 60: test=0.5934
+ [BP k] ep 70: test=0.5985
+ [BP k] ep 80: test=0.6011
+ [BP k] ep 90: test=0.6020
+ [BP k] ep 100: test=0.6045
+ FINAL bp k=2 seed=456: 0.6045
+
+=== FA k=0 (last 0 of 2 trainable) seed=42 ===
+ trainable blocks: [] trainable params: 1,579,530
+ [FA k] ep 1: test=0.3288
+ [FA k] ep 10: test=0.3359
+ [FA k] ep 20: test=0.3336
+ [FA k] ep 30: test=0.3328
+ [FA k] ep 40: test=0.3418
+ [FA k] ep 50: test=0.3504
+ [FA k] ep 60: test=0.3564
+ [FA k] ep 70: test=0.3567
+ [FA k] ep 80: test=0.3543
+ [FA k] ep 90: test=0.3574
+ [FA k] ep 100: test=0.3585
+ FINAL fa k=0 seed=42: 0.3585
+
+=== FA k=0 (last 0 of 2 trainable) seed=123 ===
+ trainable blocks: [] trainable params: 1,579,530
+ [FA k] ep 1: test=0.3125
+ [FA k] ep 10: test=0.3374
+ [FA k] ep 20: test=0.3364
+ [FA k] ep 30: test=0.3453
+ [FA k] ep 40: test=0.3437
+ [FA k] ep 50: test=0.3522
+ [FA k] ep 60: test=0.3587
+ [FA k] ep 70: test=0.3550
+ [FA k] ep 80: test=0.3551
+ [FA k] ep 90: test=0.3558
+ [FA k] ep 100: test=0.3584
+ FINAL fa k=0 seed=123: 0.3584
+
+=== FA k=0 (last 0 of 2 trainable) seed=456 ===
+ trainable blocks: [] trainable params: 1,579,530
+ [FA k] ep 1: test=0.3180
+ [FA k] ep 10: test=0.3311
+ [FA k] ep 20: test=0.3344
+ [FA k] ep 30: test=0.3533
+ [FA k] ep 40: test=0.3476
+ [FA k] ep 50: test=0.3523
+ [FA k] ep 60: test=0.3455
+ [FA k] ep 70: test=0.3569
+ [FA k] ep 80: test=0.3562
+ [FA k] ep 90: test=0.3583
+ [FA k] ep 100: test=0.3590
+ FINAL fa k=0 seed=456: 0.3590
+
+=== FA k=1 (last 1 of 2 trainable) seed=42 ===
+ trainable blocks: [1] trainable params: 2,105,866
+ [FA k] ep 1: test=0.3235
+ [FA k] ep 10: test=0.3730
+ [FA k] ep 20: test=0.3734
+ [FA k] ep 30: test=0.3829
+ [FA k] ep 40: test=0.3916
+ [FA k] ep 50: test=0.4008
+ [FA k] ep 60: test=0.4012
+ [FA k] ep 70: test=0.4015
+ [FA k] ep 80: test=0.4042
+ [FA k] ep 90: test=0.4082
+ [FA k] ep 100: test=0.4083
+ FINAL fa k=1 seed=42: 0.4083
+
+=== FA k=1 (last 1 of 2 trainable) seed=123 ===
+ trainable blocks: [1] trainable params: 2,105,866
+ [FA k] ep 1: test=0.2930
+ [FA k] ep 10: test=0.3662
+ [FA k] ep 20: test=0.3905
+ [FA k] ep 30: test=0.4027
+ [FA k] ep 40: test=0.3948
+ [FA k] ep 50: test=0.4048
+ [FA k] ep 60: test=0.4067
+ [FA k] ep 70: test=0.4094
+ [FA k] ep 80: test=0.4115
+ [FA k] ep 90: test=0.4103
+ [FA k] ep 100: test=0.4134
+ FINAL fa k=1 seed=123: 0.4134
+
+=== FA k=1 (last 1 of 2 trainable) seed=456 ===
+ trainable blocks: [1] trainable params: 2,105,866
+ [FA k] ep 1: test=0.3098
+ [FA k] ep 10: test=0.3561
+ [FA k] ep 20: test=0.3860
+ [FA k] ep 30: test=0.3957
+ [FA k] ep 40: test=0.3907
+ [FA k] ep 50: test=0.4032
+ [FA k] ep 60: test=0.4017
+ [FA k] ep 70: test=0.4125
+ [FA k] ep 80: test=0.4123
+ [FA k] ep 90: test=0.4164
+ [FA k] ep 100: test=0.4155
+ FINAL fa k=1 seed=456: 0.4155
+
+=== FA k=2 (last 2 of 2 trainable) seed=42 ===
+ trainable blocks: [0, 1] trainable params: 2,632,202
+ [FA k] ep 1: test=0.3028
+ [FA k] ep 10: test=0.3585
+ [FA k] ep 20: test=0.3523
+ [FA k] ep 30: test=0.3315
+ [FA k] ep 40: test=0.3191
+ [FA k] ep 50: test=0.3397
+ [FA k] ep 60: test=0.3566
+ [FA k] ep 70: test=0.3527
+ [FA k] ep 80: test=0.3554
+ [FA k] ep 90: test=0.3593
+ [FA k] ep 100: test=0.3582
+ FINAL fa k=2 seed=42: 0.3582
+
+=== FA k=2 (last 2 of 2 trainable) seed=123 ===
+ trainable blocks: [0, 1] trainable params: 2,632,202
+ [FA k] ep 1: test=0.2794
+ [FA k] ep 10: test=0.3627
+ [FA k] ep 20: test=0.3600
+ [FA k] ep 30: test=0.3750
+ [FA k] ep 40: test=0.3482
+ [FA k] ep 50: test=0.3679
+ [FA k] ep 60: test=0.3630
+ [FA k] ep 70: test=0.3643
+ [FA k] ep 80: test=0.3636
+ [FA k] ep 90: test=0.3618
+ [FA k] ep 100: test=0.3621
+ FINAL fa k=2 seed=123: 0.3621
+
+=== FA k=2 (last 2 of 2 trainable) seed=456 ===
+ trainable blocks: [0, 1] trainable params: 2,632,202
+ [FA k] ep 1: test=0.3005
+ [FA k] ep 10: test=0.3573
+ [FA k] ep 20: test=0.3624
+ [FA k] ep 30: test=0.3706
+ [FA k] ep 40: test=0.3529
+ [FA k] ep 50: test=0.3648
+ [FA k] ep 60: test=0.3581
+ [FA k] ep 70: test=0.3645
+ [FA k] ep 80: test=0.3652
+ [FA k] ep 90: test=0.3632
+ [FA k] ep 100: test=0.3642
+ FINAL fa k=2 seed=456: 0.3642
+
+=== DFA k=0 (last 0 of 2 trainable) seed=42 ===
+ trainable blocks: [] trainable params: 1,579,530
+ [DFA k] ep 1: test=0.3196
+ [DFA k] ep 10: test=0.3187
+ [DFA k] ep 20: test=0.3369
+ [DFA k] ep 30: test=0.3221
+ [DFA k] ep 40: test=0.3386
+ [DFA k] ep 50: test=0.3401
+ [DFA k] ep 60: test=0.3473
+ [DFA k] ep 70: test=0.3472
+ [DFA k] ep 80: test=0.3426
+ [DFA k] ep 90: test=0.3445
+ [DFA k] ep 100: test=0.3432
+ FINAL dfa k=0 seed=42: 0.3432
+
+=== DFA k=0 (last 0 of 2 trainable) seed=123 ===
+ trainable blocks: [] trainable params: 1,579,530
+ [DFA k] ep 1: test=0.3089
+ [DFA k] ep 10: test=0.3180
+ [DFA k] ep 20: test=0.3301
+ [DFA k] ep 30: test=0.3434
+ [DFA k] ep 40: test=0.3386
+ [DFA k] ep 50: test=0.3343
+ [DFA k] ep 60: test=0.3489
+ [DFA k] ep 70: test=0.3458
+ [DFA k] ep 80: test=0.3499
+ [DFA k] ep 90: test=0.3508
+ [DFA k] ep 100: test=0.3508
+ FINAL dfa k=0 seed=123: 0.3508
+
+=== DFA k=0 (last 0 of 2 trainable) seed=456 ===
+ trainable blocks: [] trainable params: 1,579,530
+ [DFA k] ep 1: test=0.3238
+ [DFA k] ep 10: test=0.3327
+ [DFA k] ep 20: test=0.3395
+ [DFA k] ep 30: test=0.3457
+ [DFA k] ep 40: test=0.3367
+ [DFA k] ep 50: test=0.3496
+ [DFA k] ep 60: test=0.3453
+ [DFA k] ep 70: test=0.3487
+ [DFA k] ep 80: test=0.3491
+ [DFA k] ep 90: test=0.3498
+ [DFA k] ep 100: test=0.3521
+ FINAL dfa k=0 seed=456: 0.3521
+
+=== DFA k=1 (last 1 of 2 trainable) seed=42 ===
+ trainable blocks: [1] trainable params: 2,105,866
+ [DFA k] ep 1: test=0.2687
+ [DFA k] ep 10: test=0.2106
+ [DFA k] ep 20: test=0.2293
+ [DFA k] ep 30: test=0.2297
+ [DFA k] ep 40: test=0.2241
+ [DFA k] ep 50: test=0.2318
+ [DFA k] ep 60: test=0.2417
+ [DFA k] ep 70: test=0.2458
+ [DFA k] ep 80: test=0.2463
+ [DFA k] ep 90: test=0.2438
+ [DFA k] ep 100: test=0.2384
+ FINAL dfa k=1 seed=42: 0.2384
+
+=== DFA k=1 (last 1 of 2 trainable) seed=123 ===
+ trainable blocks: [1] trainable params: 2,105,866
+ [DFA k] ep 1: test=0.1958
+ [DFA k] ep 10: test=0.1777
+ [DFA k] ep 20: test=0.2220
+ [DFA k] ep 30: test=0.1852
+ [DFA k] ep 40: test=0.2165
+ [DFA k] ep 50: test=0.2095
+ [DFA k] ep 60: test=0.1995
+ [DFA k] ep 70: test=0.2038
+ [DFA k] ep 80: test=0.2068
+ [DFA k] ep 90: test=0.2173
+ [DFA k] ep 100: test=0.2097
+ FINAL dfa k=1 seed=123: 0.2097
+
+=== DFA k=1 (last 1 of 2 trainable) seed=456 ===
+ trainable blocks: [1] trainable params: 2,105,866
+ [DFA k] ep 1: test=0.2118
+ [DFA k] ep 10: test=0.2074
+ [DFA k] ep 20: test=0.1777
+ [DFA k] ep 30: test=0.2043
+ [DFA k] ep 40: test=0.2010
+ [DFA k] ep 50: test=0.2087
+ [DFA k] ep 60: test=0.2073
+ [DFA k] ep 70: test=0.2126
+ [DFA k] ep 80: test=0.2202
+ [DFA k] ep 90: test=0.2355
+ [DFA k] ep 100: test=0.2295
+ FINAL dfa k=1 seed=456: 0.2295
+
+=== DFA k=2 (last 2 of 2 trainable) seed=42 ===
+ trainable blocks: [0, 1] trainable params: 2,632,202
+ [DFA k] ep 1: test=0.2769
+ [DFA k] ep 10: test=0.2705
+ [DFA k] ep 20: test=0.3000
+ [DFA k] ep 30: test=0.2988
+ [DFA k] ep 40: test=0.3080
+ [DFA k] ep 50: test=0.2941
+ [DFA k] ep 60: test=0.3025
+ [DFA k] ep 70: test=0.3075
+ [DFA k] ep 80: test=0.3070
+ [DFA k] ep 90: test=0.3063
+ [DFA k] ep 100: test=0.3069
+ FINAL dfa k=2 seed=42: 0.3069
+
+=== DFA k=2 (last 2 of 2 trainable) seed=123 ===
+ trainable blocks: [0, 1] trainable params: 2,632,202
+ [DFA k] ep 1: test=0.2582
+ [DFA k] ep 10: test=0.2772
+ [DFA k] ep 20: test=0.2904
+ [DFA k] ep 30: test=0.3072
+ [DFA k] ep 40: test=0.2898
+ [DFA k] ep 50: test=0.2938
+ [DFA k] ep 60: test=0.2892
+ [DFA k] ep 70: test=0.2974
+ [DFA k] ep 80: test=0.2970
+ [DFA k] ep 90: test=0.3035
+ [DFA k] ep 100: test=0.3025
+ FINAL dfa k=2 seed=123: 0.3025
+
+=== DFA k=2 (last 2 of 2 trainable) seed=456 ===
+ trainable blocks: [0, 1] trainable params: 2,632,202
+ [DFA k] ep 1: test=0.2794
+ [DFA k] ep 10: test=0.2888
+ [DFA k] ep 20: test=0.2884
+ [DFA k] ep 30: test=0.2901
+ [DFA k] ep 40: test=0.2784
+ [DFA k] ep 50: test=0.2817
+ [DFA k] ep 60: test=0.2983
+ [DFA k] ep 70: test=0.2920
+ [DFA k] ep 80: test=0.2904
+ [DFA k] ep 90: test=0.2999
+ [DFA k] ep 100: test=0.2963
+ FINAL dfa k=2 seed=456: 0.2963
+
+============================================================
+SUMMARY ladder_d512_L2_cifar10 (mean ± ddof-1 std over seeds)
+============================================================
+ BP k=0: 0.3858±0.0029 k=1: 0.5828±0.0025 k=2: 0.6035±0.0013
+ FA k=0: 0.3586±0.0003 k=1: 0.4124±0.0037 k=2: 0.3615±0.0030
+ DFA k=0: 0.3487±0.0048 k=1: 0.2259±0.0147 k=2: 0.3019±0.0053
+
+Saved -> results/depth_ladder/ladder_d512_L2_cifar10.json
+[Sun Jun 14 05:49:05 PM CDT 2026] ALL DONE
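The SUMMARY rows report the mean and ddof-1 (sample) standard deviation over seeds, and they can be recomputed from the `FINAL` lines of the log. The following is a minimal sketch, assuming only the `FINAL <method> k=<k> seed=<seed>: <acc>` line shape visible in the log; the regex and the `summarize` helper are hypothetical, not the script that produced the log. Because both ladders reuse the same method and k labels, a full log would first need to be split at each `START` marker.

```python
import re
from collections import defaultdict
from statistics import mean, stdev

# Matches e.g. "FINAL bp k=0 seed=42: 0.3882"
FINAL_RE = re.compile(r"FINAL (\w+) k=(\d+) seed=(\d+): (\d+\.\d+)")


def summarize(log_text: str) -> dict[tuple[str, int], tuple[float, float]]:
    """Group FINAL accuracies by (method, k); return (mean, sample std with ddof=1)."""
    groups: dict[tuple[str, int], list[float]] = defaultdict(list)
    for method, k, _seed, acc in FINAL_RE.findall(log_text):
        groups[(method, int(k))].append(float(acc))
    # statistics.stdev divides by n-1, matching the "ddof-1 std" in the summary header.
    return {key: (mean(vals), stdev(vals)) for key, vals in groups.items()}


# Three FINAL lines copied from the d=256 L=4 ladder.
sample = """
 FINAL bp k=0 seed=42: 0.3882
 FINAL bp k=0 seed=123: 0.3899
 FINAL bp k=0 seed=456: 0.3878
"""

m, s = summarize(sample)[("bp", 0)]
print(f"BP k=0: {m:.4f}±{s:.4f}")  # prints "BP k=0: 0.3886±0.0011"
```

The printed value agrees with the `BP k=0` entry of the d=256 L=4 SUMMARY row.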
