<feed xmlns='http://www.w3.org/2005/Atom'>
<title>faeval.git/protocol/examples/threshold_sensitivity.py, branch master</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/'/>
<entry>
<title>Add threshold sensitivity analysis: (a) 63x gap, (b) 24338x gap</title>
<updated>2026-04-08T04:12:50+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-08T04:12:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=0886902ab7decd7702c9764e8fb2c3e10a528e45'/>
<id>0886902ab7decd7702c9764e8fb2c3e10a528e45</id>
<content type='text'>
For each diagnostic, sweeps threshold across orders of magnitude on the
3-seed audit data and reports the verdict at each value.

Key calibration findings (3 seeds):

  Diagnostic (a) max per-block growth:
    Healthy max (BP/EP):       11.0
    Degenerate min (DFA/SB/CB): 694
    Separation gap: 63x
    Default threshold 50 sits comfortably in the middle.
    Any threshold in [12, 693] gives the same verdicts.

  Diagnostic (b) ||g_L|| at floor:
    Healthy min (BP/EP):       1.02e-4
    Degenerate max (DFA/SB/CB): 4.18e-9
    Separation gap: 24,338x
    Default threshold 1e-7 sits comfortably in the middle.
    Any threshold in [4.2e-9, 1.0e-4] gives the same verdicts.

  Diagnostic (c) cross-batch stability:
    NOT a clean binary discriminator across seeds. BP s456=0.114
    near threshold; DFA s42=0.047 (noise sub-mode) doesn't fire;
    SB s456=0.035 (noise sub-mode) doesn't fire. (c) is for sub-mode
    interpretation, not binary detection.

This is the calibration evidence answering the E&amp;D reviewer question
'why these specific thresholds?'.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
For each diagnostic, sweeps threshold across orders of magnitude on the
3-seed audit data and reports the verdict at each value.

Key calibration findings (3 seeds):

  Diagnostic (a) max per-block growth:
    Healthy max (BP/EP):       11.0
    Degenerate min (DFA/SB/CB): 694
    Separation gap: 63x
    Default threshold 50 sits comfortably in the middle.
    Any threshold in [12, 693] gives the same verdicts.

  Diagnostic (b) ||g_L|| at floor:
    Healthy min (BP/EP):       1.02e-4
    Degenerate max (DFA/SB/CB): 4.18e-9
    Separation gap: 24,338x
    Default threshold 1e-7 sits comfortably in the middle.
    Any threshold in [4.2e-9, 1.0e-4] gives the same verdicts.

  Diagnostic (c) cross-batch stability:
    NOT a clean binary discriminator across seeds. BP s456=0.114
    near threshold; DFA s42=0.047 (noise sub-mode) doesn't fire;
    SB s456=0.035 (noise sub-mode) doesn't fire. (c) is for sub-mode
    interpretation, not binary detection.

This is the calibration evidence answering the E&amp;D reviewer question
'why these specific thresholds?'.
</pre>
</div>
</content>
</entry>
</feed>
