Protocol diagnostic (a): use max per-block growth, not max/min ratio

2026-04-08T04:00:54+00:00

Old metric: max(||h||) / max(||h_0||, eps). It false-fires on ViT-style
architectures: the cls token at layer 0 (right after patch_embed) has an
anomalously small norm (~0.3-1.5), which inflates the ratio even on
healthy BP-trained ViTs.

New metric: max_l(||h_{l+1}|| / ||h_l||), the largest single-block
residual amplification. Architecture-invariant.
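
A minimal sketch of the two metrics, operating on a plain list of
per-layer norms. The function names, the eps guard on the new metric's
denominator, and the example norms are assumptions for illustration and
are not taken from protocol/protocol.py.

```python
def max_over_layer0(norms, eps=1e-12):
    """Old metric: max(||h||) / max(||h_0||, eps)."""
    return max(norms) / max(norms[0], eps)


def max_per_block_growth(norms, eps=1e-12):
    """New metric: max over l of ||h_{l+1}|| / ||h_l||.

    The eps guard on the denominator is an assumption of this sketch.
    """
    if len(norms) < 2:
        raise ValueError("need norms for at least two layers")
    return max(nxt / max(cur, eps) for cur, nxt in zip(norms, norms[1:]))


# Illustrative norms only: a small layer-0 value followed by steady growth.
norms = [0.5, 4.0, 12.0, 24.0, 30.0]
old_value = max_over_layer0(norms)       # compounds growth over all layers
new_value = max_per_block_growth(norms)  # bounded by the worst single block
```

With these made-up norms the old metric compounds growth across the
whole depth, while the new one reports only the worst single block.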

Calibration:
  - BP-trained, late training: <5x per block
  - BP ViT, early epochs (cls token resolving): 13-25x max
  - DFA-trained ResMLP/ViT: 100-4000x per block
Threshold raised from 10 to 50, which sits between healthy early
training (max 25) and the failure regime (min 100).
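
The threshold check can be sketched against the calibration ranges
above. The function name and the strict ">" comparison are assumptions;
only the value 50 and the range endpoints come from this commit.

```python
THRESHOLD = 50.0  # raised from 10 in this commit


def diagnostic_a_fires(max_growth, threshold=THRESHOLD):
    """Return True when the max per-block growth exceeds the threshold.

    Strict '>' is an assumption of this sketch.
    """
    return max_growth > threshold


# Endpoints of the calibration ranges listed above.
healthy = [5.0, 13.0, 25.0]   # BP late training, BP ViT early epochs
failing = [100.0, 4000.0]     # DFA-trained ResMLP/ViT
healthy_verdicts = [diagnostic_a_fires(g) for g in healthy]
failing_verdicts = [diagnostic_a_fires(g) for g in failing]
```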

Re-verifications:
  - smoke test (BP/DFA/EP): all 3 verdicts unchanged
  - random init (3 seeds): trustworthy on all 3
  - 5-method audit table single-seed: identical verdicts
  - decision-utility ablation: identical (still 0/5 by S1, 3/5 by S_full)
  - temporal evolution 3-seed: (b) now fires first at ep 3-4, (a) at ep
    8-11. Both well before training ends. The 'protocol fires ~92 epochs
    early' story still holds.
  - ViT temporal evolution: BP no longer false-fires; DFA fires (a) ep 1,
    (b) ep 3 — protocol works on the second architecture.

Add FA diagnostic protocol reference implementation

2026-04-08T03:20:48+00:00

Codex round 15 #1 priority for the E&D-track paper:
  - protocol/protocol.py: 4 diagnostics (residual norms, BP grad norms,
    cross-batch direction stability, and a frozen-baseline comparator)
  - protocol/report.py: DiagnosticReport with per-diagnostic verdicts and
    pretty-printer
  - protocol/smoke_test.py: validates BP/DFA/EP checkpoints produce the
    expected verdicts (BP/EP trustworthy; DFA walked back via residual
    explosion + BP grad at floor)
  - protocol/README.md: usage, audit cases, threshold rationale
  - protocol/CHECKLIST.md: 6 evaluation pipeline pitfalls (norm(-1),
    cosine_similarity eps clamp, fp16 underflow, Bs reproducibility,
    aggregation, layer-0 dominance)
  - protocol/REPORTING_TEMPLATE.md: per-method fillable form for FA papers
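
One checklist item, the cosine_similarity eps clamp, can be illustrated
with a pure-Python sketch. It assumes each norm is clamped separately
at eps=1e-8; the helper names and the rescaling workaround are
hypothetical and are not taken from protocol/CHECKLIST.md.

```python
import math


def cosine_clamped(a, b, eps=1e-8):
    """Cosine similarity with each norm clamped to at least eps (assumed)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (max(norm_a, eps) * max(norm_b, eps))


def rescale(v):
    """Divide by the largest absolute component so the norm is O(1)."""
    peak = max(abs(x) for x in v)
    return [x / peak for x in v] if peak > 0.0 else list(v)


def cosine_rescaled(a, b, eps=1e-8):
    """Rescale both vectors first so the clamp does not dominate."""
    return cosine_clamped(rescale(a), rescale(b), eps)


# Two identical, very small vectors (for example gradients near the floor).
tiny = [1e-10, 0.0]
clamped_value = cosine_clamped(tiny, tiny)    # far below 1 despite equal directions
rescaled_value = cosine_rescaled(tiny, tiny)  # close to 1
```

Under these assumptions, vectors whose norm is below eps report a
similarity far below 1 even when their directions agree, which is why
norms near the floor deserve a check before cosine values are trusted.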

faeval.git/protocol/report.py, branch master
