author YurenHao0426 <Blackhao0426@gmail.com> 2026-04-07 23:56:33 -0500
committer YurenHao0426 <Blackhao0426@gmail.com> 2026-04-07 23:56:33 -0500
commit cbe851cf382a2af13037304afdd783214bad5c6b
tree 2c193c63a61ebf25183f00058b6a517faeabd5f3 /protocol
parent ec20a776e0c3e026236942fe99f3840a39e211fd
EVIDENCE_SUMMARY: round 18 language softening on CNN + penalty audit
Diffstat (limited to 'protocol')
-rw-r--r-- protocol/EVIDENCE_SUMMARY.md 24
1 files changed, 16 insertions, 8 deletions
diff --git a/protocol/EVIDENCE_SUMMARY.md b/protocol/EVIDENCE_SUMMARY.md
index f784d2f..93f3968 100644
--- a/protocol/EVIDENCE_SUMMARY.md
+++ b/protocol/EVIDENCE_SUMMARY.md
@@ -68,12 +68,17 @@ and the file or memory entry where the result is recorded.
| Credit Bridge CNN | 0.325 ± 0.009 | 96× | 3e-3 | walk-back via (a) only |
**Key**: diagnostic (b) NEVER fires on CNN. Without terminal LN, BP grad does
-not collapse below 1e-7. Combined with the StudentNet result, this shows
-(b) is causally specific to LN architectures. DFA CNN reaches 0.566 (much
-higher than DFA ResMLP 0.31 / DFA ViT 0.24), consistent with the
+not collapse below 1e-7. Combined with the StudentNet result, **(b) appears
+restricted to the terminal-normalized architectures we audited** (round 18
+softening: this is observational association across the architectures
+tested, not causal identification of LayerNorm). DFA CNN reaches 0.566
+(much higher than DFA ResMLP 0.31 / DFA ViT 0.24), consistent with the
literature: classical FA papers report DFA working on shallow CNNs but
-failing on modern Transformers — the protocol gives the mechanistic
-reason (catastrophic (a)+(b) on with-LN vs mild (a) only on without-LN).
+failing on modern Transformers. On CNN the cosine remains in a measurable
+regime (Γ=0.916 for DFA), but the training trajectory exhibits extreme
+scale distortion (max-per-block growth 237×), so the headline Γ alone is
+not a trustworthy summary of learning quality even though the cosine
+itself is well-defined.
Reproduce: `python -m protocol.examples.audit_cnn`
@@ -85,9 +90,12 @@ Reproduce: `python -m protocol.examples.audit_cnn`
| **Penalty partial protocol audit** | Penalized DFA: (a)+(b) **PASS** (penalty fixes scale), but (d) **STILL FIRES** on 3/3 seeds (margin 1.38 ± 0.05 pp < 2 pp) | `python -m protocol.examples.penalty_partial_audit` |
| Vanilla DFA per-layer cosine (3 seeds) | layer 0: cos = +0.42 (high), layers 1-4: cos ≈ 0 (range -0.03 to +0.03). Headline +0.07 is entirely from layer 0. | `python experiments/measure_direction_quality_existing_ckpt.py --seed 42` |
-The two failure modes are mechanistically separable: the penalty fixes the
-scale failure (a+b pass) but not the direction failure (d still fires).
-This is the cleanest possible separability evidence.
+The two putative failure modes are **partially dissociated by intervention**
+(round 18 softening): the penalty alleviates the scale-related diagnostics
+(a)+(b) while the frozen-baseline diagnostic (d) still fires. (d) provides
+independent evidence that poor use of depth persists after the scale
+pathology is reduced. Full mechanistic separability requires direct
+deep-block credit measurement on the penalized checkpoint (in progress).
## §5 Pipeline pitfalls reproducers