Add threshold sensitivity analysis: (a) 63x gap, (b) 24338x gap

2026-04-08T04:12:50+00:00

For each diagnostic, the script sweeps the threshold across orders of
magnitude on the 3-seed audit data and reports the verdict at each value.
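
A minimal sketch of such a sweep for diagnostic (a), not the script
itself. It assumes the diagnostic fires when growth exceeds the
threshold, and it uses only the two extreme values quoted below as
stand-ins for the full audit data; the names `growth`, `fires`, and
`sweep` are hypothetical.

```python
# Extremes quoted in this message; the labels and firing rule are assumptions.
growth = {"healthy_max": 11.0, "degenerate_min": 694.0}


def fires(value, threshold):
    """Assumed rule for diagnostic (a): flag a run when growth exceeds the threshold."""
    return value > threshold


def sweep(values, thresholds):
    """Return (threshold, verdicts) pairs, one per swept threshold."""
    return [(t, {name: fires(v, t) for name, v in values.items()}) for t in thresholds]


# Half-decade steps from 1 to 10^4.
thresholds = [10 ** (k / 2) for k in range(9)]
results = sweep(growth, thresholds)

for t, verdicts in results:
    # A threshold separates the groups when only the degenerate run fires.
    separated = verdicts["degenerate_min"] and not verdicts["healthy_max"]
    print(f"threshold={t:10.2f}  separated={separated}")
```

Under these assumptions, only thresholds that fall between the healthy
max and the degenerate min report a separating verdict.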

Key calibration findings (3 seeds):

  Diagnostic (a) max per-block growth:
    Healthy max (BP/EP):       11.0
    Degenerate min (DFA/SB/CB): 694
    Separation gap: 63x
    Default threshold 50 lies inside the gap: about 4.5x above the
    healthy max and about 14x below the degenerate min.
    Any threshold in [12, 693] gives the same verdicts.

  Diagnostic (b) ||g_L|| at floor:
    Healthy min (BP/EP):       1.02e-4
    Degenerate max (DFA/SB/CB): 4.18e-9
    Separation gap: 24,338x
    Default threshold 1e-7 lies inside the gap: about 24x above the
    degenerate max and about 1,000x below the healthy min.
    Any threshold in [4.2e-9, 1.0e-4] gives the same verdicts.

  Diagnostic (c) cross-batch stability:
    NOT a clean binary discriminator across seeds: BP s456=0.114 is
    near the threshold, while DFA s42=0.047 and SB s456=0.035 (both
    noise sub-mode) do not fire. Use (c) for sub-mode interpretation,
    not binary detection.
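
The separation gaps for (a) and (b) can be rechecked from the rounded
values quoted above. This is a sketch with a hypothetical helper `gap`,
not code from the script. With the rounded inputs the (b) ratio comes
out near 24,400 rather than 24,338, presumably because the reported
figure was computed from unrounded data.

```python
import math


def gap(x, y):
    """Ratio of the larger to the smaller of two positive values."""
    return max(x, y) / min(x, y)


# Diagnostic (a): healthy max vs. degenerate min (rounded values from this message).
gap_a = gap(11.0, 694.0)
# Diagnostic (b): healthy min vs. degenerate max (rounded values from this message).
gap_b = gap(1.02e-4, 4.18e-9)

# Geometric midpoints of each gap, for comparison with the defaults 50 and 1e-7.
mid_a = math.sqrt(11.0 * 694.0)
mid_b = math.sqrt(1.02e-4 * 4.18e-9)

print(f"(a) gap = {gap_a:.1f}x, log-midpoint = {mid_a:.1f}")
print(f"(b) gap = {gap_b:.0f}x, log-midpoint = {mid_b:.2e}")
```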

This is the calibration evidence answering the E&D reviewer question
'why these specific thresholds?'.

faeval.git/protocol/examples/threshold_sensitivity.py, branch master
