From 9d0e4901a82763ea3ebc57eea152a730330d4991 Mon Sep 17 00:00:00 2001
From: YurenHao0426
Date: Wed, 8 Apr 2026 00:00:34 -0500
Subject: EVIDENCE_SUMMARY: add (d) threshold sensitivity finding (round 18)

---
 protocol/EVIDENCE_SUMMARY.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/protocol/EVIDENCE_SUMMARY.md b/protocol/EVIDENCE_SUMMARY.md
index 93f3968..0da2e75 100644
--- a/protocol/EVIDENCE_SUMMARY.md
+++ b/protocol/EVIDENCE_SUMMARY.md
@@ -48,7 +48,9 @@ and the file or memory entry where the result is recorded.
 | evidence | result | reproduce |
 |---|---|---|
 | Threshold sensitivity sweep | (a) **63× separation gap**, (b) **24,338× separation gap** between healthy and degenerate | `python -m protocol.examples.threshold_sensitivity` |
-| Default thresholds | sit cleanly in the middle of substantial margins | (in sensitivity output) |
+| Default thresholds (a)+(b) | sit cleanly in the middle of substantial margins; verdicts robust to ±50% perturbation | (in sensitivity output) |
+| Diagnostic (d) frozen-baseline threshold | **NOT robust** — penalized DFA at λ=1e-2 fires at 2 pp threshold, passes at 1 pp threshold; at λ=1e-3 (1 seed) margin is +2.3 pp which passes at 2 pp. The (d) verdict depends on both threshold choice and λ choice. | `python -m protocol.examples.threshold_d_sensitivity` |
+| Round 18 lesson | Soften language: "after the penalty correction, the depth contribution is at most 1.4 pp above the random-blocks baseline at λ=1e-2 — much smaller than BP's +26 pp gap over shallow", NOT "the deep blocks are passive". | n/a |
 
 ## §3.6 Cross-width validation (d=512)
 
-- 
cgit v1.2.3
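
The threshold dependence recorded in the new (d) row can be illustrated with a short Python sketch. It rests on assumptions that the patch does not confirm: that diagnostic (d) fires when the margin over the baseline falls below the threshold, and that the margin at λ=1e-2 is the 1.4 pp figure quoted in the Round 18 row (stated there only as an upper bound). `d_fires` is a hypothetical helper, not a function from `protocol.examples.threshold_d_sensitivity`.

```python
# Illustrative only: hypothetical helper, not the repository's implementation.
def d_fires(margin_pp: float, threshold_pp: float) -> bool:
    """Assumed rule: (d) fires when the margin over the baseline is below the threshold."""
    return margin_pp < threshold_pp


# Margins in percentage points, keyed by penalty strength λ.
# 1.4 is an assumption (the patch says "at most 1.4 pp"); 2.3 is the 1-seed value from the patch.
margins_pp = {"1e-2": 1.4, "1e-3": 2.3}
thresholds_pp = (1.0, 2.0)

# Verdict for every (λ, threshold) pair.
verdicts = {
    (lam, thr): d_fires(margin, thr)
    for lam, margin in margins_pp.items()
    for thr in thresholds_pp
}

for (lam, thr), fired in verdicts.items():
    status = "fires" if fired else "passes"
    print(f"lambda={lam}, threshold={thr} pp: {status}")
```

Under these assumptions the sketch gives the same verdicts as the table row: at λ=1e-2 the outcome flips between the 1 pp and 2 pp thresholds, while at λ=1e-3 the diagnostic passes at 2 pp.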