Add reproducers for pitfalls 4-6 (Bs reproducibility, aggregation, layer-0)

2026-04-08T04:09:03+00:00

All 3 verified on the real DFA s42 checkpoint:

  Bug 4: training Bs gives Γ=+0.068, 10 fresh Bs draws give Γ=+0.0043±0.007.
         The apparent 'alignment' is the network adapting to the specific Bs
         it was trained with; it does not carry over to fresh draws.

  Bug 5: 4 valid aggregation strategies give Γ in [-0.028, +0.074]. The
         spread is 0.10 (3.45x ratio) and **the sign flips** between
         strategies. Pick the wrong aggregation and DFA is anti-aligned;
         pick the right one and DFA looks aligned.

  Bug 6: Γ_layer0 = +0.429 dominates the mean +0.068. Hidden layers 1-4 are
         all near zero or slightly negative. Mean of hidden layers only is
         -0.022 (negative!). The deep blocks the paper claims to be
         'training' have Γ ≈ 0 or below.
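
A minimal sketch of the Bug 4 check (redraw Bs, recompute Γ, report mean and
spread). This is not the code in verify_pitfalls_4_6.py: `gamma_fn`, `shapes`,
and the standard-normal sampling are assumptions for illustration, and the
sampling distribution should match whatever the training run used.

```python
import numpy as np


def gamma_over_fresh_Bs(gamma_fn, shapes, n_draws=10, seed=0):
    """Recompute Γ with freshly sampled feedback matrices.

    gamma_fn: hypothetical callable mapping a list of Bs to a scalar Γ.
    shapes:   hypothetical list of (rows, cols), one per feedback matrix.
    Returns (mean, sample std) of Γ over the fresh draws.
    """
    rng = np.random.default_rng(seed)
    values = []
    for _ in range(n_draws):
        # Assumption: Bs are i.i.d. standard normal; swap in the real init.
        Bs = [rng.standard_normal(shape) for shape in shapes]
        values.append(float(gamma_fn(Bs)))
    values = np.asarray(values)
    return values.mean(), values.std(ddof=1)
```

If Γ under the training Bs sits well outside mean ± std of the fresh draws,
the alignment is specific to those Bs.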

Bugs 5 and 6 are causally linked: 'median over layers' strategies pick a
negative deep layer; 'mean over layers' is dominated by the positive l0.
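
The link can be checked with a few lines of arithmetic. Only Γ_layer0, the
hidden-layer mean, and the overall mean come from the run above; the four
individual hidden-layer values are hypothetical placeholders chosen to match
the reported hidden mean, not measured numbers.

```python
import statistics

gamma_l0 = 0.429       # reported above
hidden_mean = -0.022   # reported above (mean of hidden layers 1-4)
n_hidden = 4

# The reported overall mean follows from the two reported numbers.
overall_mean = (gamma_l0 + n_hidden * hidden_mean) / (1 + n_hidden)

# HYPOTHETICAL per-layer values, consistent with hidden_mean = -0.022.
hypothetical_hidden = [-0.010, -0.030, -0.020, -0.028]
per_layer = [gamma_l0] + hypothetical_hidden

mean_agg = statistics.fmean(per_layer)     # pulled positive by layer 0
median_agg = statistics.median(per_layer)  # lands on a hidden layer

print(f"overall mean   = {overall_mean:+.3f}")
print(f"mean over l    = {mean_agg:+.3f}")
print(f"median over l  = {median_agg:+.3f}")
```

With one large positive layer and four small negative ones, the mean and the
median necessarily disagree in sign, whatever the exact hidden-layer values.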

The catalog under-reported Bug 5: it said 2.5x, but the actual ratio is 3.45x,
with a sign flip.

faeval.git/protocol/examples/verify_pitfalls_4_6.py, branch master
