Diffstat (limited to 'results/frozen_d512_baselines.log')
-rw-r--r-- results/frozen_d512_baselines.log 111
1 files changed, 111 insertions, 0 deletions
diff --git a/results/frozen_d512_baselines.log b/results/frozen_d512_baselines.log
new file mode 100644
index 0000000..7a1a42d
--- /dev/null
+++ b/results/frozen_d512_baselines.log
@@ -0,0 +1,111 @@
+=== FROZEN BASELINES d=512 ===
+Start: Sat Apr 25 10:42:45 PM CDT 2026
+ d=512 L=4 s=42 (Sat Apr 25 10:42:45 PM CDT 2026)
+ DFA-shallow: 0.3458
+ DFA-frozen: 0.3445
+
+Compare to trainable 4-block ResMLP (3-seed): BP=0.6147 100ep / 0.585 30ep, DFA=0.306 100ep / 0.301 30ep
+
+Interpretation:
+ If DFA-frozen ≈ DFA-trainable: blocks are passengers, walk-back parallels ViT
+ If DFA-frozen << DFA-trainable: ResMLP DFA actually trains the blocks (interesting contrast with ViT)
+ d=512 L=4 s=123 (Sat Apr 25 11:22:20 PM CDT 2026)
+ DFA-shallow: 0.3524
+ DFA-frozen: 0.3506
+
+Compare to trainable 4-block ResMLP (3-seed): BP=0.6147 100ep / 0.585 30ep, DFA=0.306 100ep / 0.301 30ep
+
+Interpretation:
+ If DFA-frozen ≈ DFA-trainable: blocks are passengers, walk-back parallels ViT
+ If DFA-frozen << DFA-trainable: ResMLP DFA actually trains the blocks (interesting contrast with ViT)
+ d=512 L=4 s=456 (Sun Apr 26 12:01:58 AM CDT 2026)
+ DFA-shallow: 0.3516
+ DFA-frozen: 0.3514
+
+Compare to trainable 4-block ResMLP (3-seed): BP=0.6147 100ep / 0.585 30ep, DFA=0.306 100ep / 0.301 30ep
+
+Interpretation:
+ If DFA-frozen ≈ DFA-trainable: blocks are passengers, walk-back parallels ViT
+ If DFA-frozen << DFA-trainable: ResMLP DFA actually trains the blocks (interesting contrast with ViT)
+ d=512 L=2 s=42 (Sun Apr 26 12:41:03 AM CDT 2026)
+ DFA-shallow: 0.3458
+ DFA-frozen: 0.3452
+
+Compare to trainable 4-block ResMLP (3-seed): BP=0.6147 100ep / 0.585 30ep, DFA=0.306 100ep / 0.301 30ep
+
+Interpretation:
+ If DFA-frozen ≈ DFA-trainable: blocks are passengers, walk-back parallels ViT
+ If DFA-frozen << DFA-trainable: ResMLP DFA actually trains the blocks (interesting contrast with ViT)
+ d=512 L=2 s=123 (Sun Apr 26 01:20:51 AM CDT 2026)
+ DFA-shallow: 0.3524
+ DFA-frozen: 0.3502
+
+Compare to trainable 4-block ResMLP (3-seed): BP=0.6147 100ep / 0.585 30ep, DFA=0.306 100ep / 0.301 30ep
+
+Interpretation:
+ If DFA-frozen ≈ DFA-trainable: blocks are passengers, walk-back parallels ViT
+ If DFA-frozen << DFA-trainable: ResMLP DFA actually trains the blocks (interesting contrast with ViT)
+ d=512 L=2 s=456 (Sun Apr 26 01:59:55 AM CDT 2026)
+ DFA-shallow: 0.3516
+ DFA-frozen: 0.3514
+
+Compare to trainable 4-block ResMLP (3-seed): BP=0.6147 100ep / 0.585 30ep, DFA=0.306 100ep / 0.301 30ep
+
+Interpretation:
+ If DFA-frozen ≈ DFA-trainable: blocks are passengers, walk-back parallels ViT
+ If DFA-frozen << DFA-trainable: ResMLP DFA actually trains the blocks (interesting contrast with ViT)
+ d=512 L=8 s=42 (Sun Apr 26 02:39:45 AM CDT 2026)
+ DFA-shallow: 0.3458
+ DFA-frozen: 0.3432
+
+Compare to trainable 4-block ResMLP (3-seed): BP=0.6147 100ep / 0.585 30ep, DFA=0.306 100ep / 0.301 30ep
+
+Interpretation:
+ If DFA-frozen ≈ DFA-trainable: blocks are passengers, walk-back parallels ViT
+ If DFA-frozen << DFA-trainable: ResMLP DFA actually trains the blocks (interesting contrast with ViT)
+ d=512 L=8 s=123 (Sun Apr 26 03:19:06 AM CDT 2026)
+ DFA-shallow: 0.3524
+ DFA-frozen: 0.3505
+
+Compare to trainable 4-block ResMLP (3-seed): BP=0.6147 100ep / 0.585 30ep, DFA=0.306 100ep / 0.301 30ep
+
+Interpretation:
+ If DFA-frozen ≈ DFA-trainable: blocks are passengers, walk-back parallels ViT
+ If DFA-frozen << DFA-trainable: ResMLP DFA actually trains the blocks (interesting contrast with ViT)
+ d=512 L=8 s=456 (Sun Apr 26 03:58:23 AM CDT 2026)
+ DFA-shallow: 0.3516
+ DFA-frozen: 0.3508
+
+Compare to trainable 4-block ResMLP (3-seed): BP=0.6147 100ep / 0.585 30ep, DFA=0.306 100ep / 0.301 30ep
+
+Interpretation:
+ If DFA-frozen ≈ DFA-trainable: blocks are passengers, walk-back parallels ViT
+ If DFA-frozen << DFA-trainable: ResMLP DFA actually trains the blocks (interesting contrast with ViT)
+ d=512 L=12 s=42 (Sun Apr 26 04:37:35 AM CDT 2026)
+ DFA-shallow: 0.3458
+ DFA-frozen: 0.3435
+
+Compare to trainable 4-block ResMLP (3-seed): BP=0.6147 100ep / 0.585 30ep, DFA=0.306 100ep / 0.301 30ep
+
+Interpretation:
+ If DFA-frozen ≈ DFA-trainable: blocks are passengers, walk-back parallels ViT
+ If DFA-frozen << DFA-trainable: ResMLP DFA actually trains the blocks (interesting contrast with ViT)
+ d=512 L=12 s=123 (Sun Apr 26 05:17:07 AM CDT 2026)
+ DFA-shallow: 0.3524
+ DFA-frozen: 0.3526
+
+Compare to trainable 4-block ResMLP (3-seed): BP=0.6147 100ep / 0.585 30ep, DFA=0.306 100ep / 0.301 30ep
+
+Interpretation:
+ If DFA-frozen ≈ DFA-trainable: blocks are passengers, walk-back parallels ViT
+ If DFA-frozen << DFA-trainable: ResMLP DFA actually trains the blocks (interesting contrast with ViT)
+ d=512 L=12 s=456 (Sun Apr 26 05:56:51 AM CDT 2026)
+ DFA-shallow: 0.3516
+ DFA-frozen: 0.3513
+
+Compare to trainable 4-block ResMLP (3-seed): BP=0.6147 100ep / 0.585 30ep, DFA=0.306 100ep / 0.301 30ep
+
+Interpretation:
+ If DFA-frozen ≈ DFA-trainable: blocks are passengers, walk-back parallels ViT
+ If DFA-frozen << DFA-trainable: ResMLP DFA actually trains the blocks (interesting contrast with ViT)
+=== FROZEN BASELINES DONE (Sun Apr 26 06:36:08 AM CDT 2026) ===
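
The Interpretation lines in the log ask whether DFA-frozen is close to the trainable DFA figure, which is easier to judge from per-depth means than from individual runs. The following is a minimal sketch of one way to aggregate the log, not the script that produced it. The names `parse_runs` and `mean_by_depth` and the regular expressions are assumptions made for illustration, and `SAMPLE` holds only the three L=4 runs copied from the log above.

```python
import re
from collections import defaultdict
from statistics import mean

# Three runs copied from the log above (L=4 only); in practice, read the whole file.
SAMPLE = """\
  d=512 L=4 s=42 (Sat Apr 25 10:42:45 PM CDT 2026)
    DFA-shallow: 0.3458
    DFA-frozen: 0.3445
  d=512 L=4 s=123 (Sat Apr 25 11:22:20 PM CDT 2026)
    DFA-shallow: 0.3524
    DFA-frozen: 0.3506
  d=512 L=4 s=456 (Sun Apr 26 12:01:58 AM CDT 2026)
    DFA-shallow: 0.3516
    DFA-frozen: 0.3514
"""

# Hypothetical patterns, written to match the run-header and metric lines seen in the log.
RUN_RE = re.compile(r"d=(\d+)\s+L=(\d+)\s+s=(\d+)")
METRIC_RE = re.compile(r"(DFA-shallow|DFA-frozen):\s*([0-9.]+)")


def parse_runs(text):
    """Return one dict per run header, with any metric lines that follow it."""
    runs = []
    current = None
    for raw in text.splitlines():
        # Drop a leading diff '+' marker (if present) and surrounding whitespace.
        line = raw.lstrip("+").strip()
        header = RUN_RE.search(line)
        if header:
            current = {"d": int(header[1]), "L": int(header[2]), "seed": int(header[3])}
            runs.append(current)
            continue
        # match() anchors at line start, so the "Compare to ..." lines are ignored.
        metric = METRIC_RE.match(line)
        if metric and current is not None:
            current[metric[1]] = float(metric[2])
    return runs


def mean_by_depth(runs, key):
    """Average one metric over seeds, grouped by block count L."""
    grouped = defaultdict(list)
    for run in runs:
        if key in run:
            grouped[run["L"]].append(run[key])
    return {depth: mean(values) for depth, values in grouped.items()}


if __name__ == "__main__":
    runs = parse_runs(SAMPLE)
    for key in ("DFA-shallow", "DFA-frozen"):
        print(key, mean_by_depth(runs, key))
```

To run it on the full log, replace `SAMPLE` with the contents of `results/frozen_d512_baselines.log`. The parser accepts lines with or without the leading `+` of the diff.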