path: root/protocol/EVIDENCE_SUMMARY.md
Age | Commit message | Author
2026-04-08 | Sync EVIDENCE_SUMMARY.md and PAPER_OUTLINE.md with v2.32 values | YurenHao0426
These two project scratch documents had stale BP=0.609 and DFA=0.308 references from the pre-v2.31 era. Updated to the matched 30-ep 3-seed values that v2.31-v2.32 corrected:

BP no-pen 30ep: 0.609 → 0.585 ± 0.001
BP+pen 30ep: 0.530 → 0.532 ± 0.006
DFA no-pen 30ep: 0.308 → 0.301 ± 0.005
DFA+pen 30ep: 0.363 → 0.360 ± 0.001
Gap math: +5.5/-8 → +5.9/-5.3 pp; +18.1/+1.4 → +18.3/+1.1 pp
Deep cos: +0.155 → +0.151

Now the paper, the protocol library, the README, the helper scripts, and the project scratch docs all agree on the v2.32 values.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
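The first gap pair in the commit message above can be checked against the listed accuracies. The sketch below assumes that "+5.9/-5.3 pp" is the effect of adding the penalty within each method (DFA first, then BP); the log does not say so explicitly, and the dictionary layout and the `pp` helper are illustrative names rather than part of the protocol package. The second pair (+18.3/+1.1 pp) is left out because the log does not say which quantities it compares.

```python
# Accuracies copied from the commit message (30 epochs, 3 seeds, v2.32 values).
acc = {
    ("BP", "no-pen"): 0.585,
    ("BP", "pen"): 0.532,
    ("DFA", "no-pen"): 0.301,
    ("DFA", "pen"): 0.360,
}


def pp(delta):
    """Convert an accuracy difference to percentage points, one decimal."""
    return round(delta * 100, 1)


# Assumed reading of the gap pair: penalized minus unpenalized accuracy per method.
penalty_effect = {
    method: pp(acc[(method, "pen")] - acc[(method, "no-pen")])
    for method in ("DFA", "BP")
}

# BP vs DFA gap that remains once both runs carry the penalty.
residual_gap = pp(acc[("BP", "pen")] - acc[("DFA", "pen")])

print(penalty_effect)
print(residual_gap)
```

Under this reading the penalty effects come out as +5.9 pp for DFA and -5.3 pp for BP, matching the updated "Gap math" line, and the penalized BP-DFA gap is in line with the roughly 17 pp residual cited in the §4 rewrite entry.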
2026-04-08 | EVIDENCE_SUMMARY: add 6th validation (perturbation correlation triangulation) | YurenHao0426
2026-04-08 | EVIDENCE_SUMMARY: §4 fully rewritten under locked two-distinct-modes framing | YurenHao0426
§4 now reflects all 5 independent validations of the converged framing:

1. Direct deep cos on penalized DFA (3 seeds): +0.155 ± 0.025
2. Null calibration with fresh Bs: +0.002 ± 0.022 (real signal)
3. Hypothesis B disambiguation (vanilla early ep): -0.008 ± 0.013
4. BP+penalty 2×2 control: 17 pp residual = credit quality
5. Multi-seed lock-in: 24 measurements all near zero

Round 20 language tightening applied:
- 'lower bound on non-capacity gap' instead of 'clean isolation'
- Explicit caveats about end-to-end vs local-loss difference
- Counter to 'different optimization regime' objection

The §4 framing is locked. Five independent validations done. Stop iterating, start writing.
2026-04-08 | EVIDENCE_SUMMARY: add (d) threshold sensitivity finding (round 18) | YurenHao0426
2026-04-07 | EVIDENCE_SUMMARY: round 18 language softening on CNN + penalty audit | YurenHao0426
2026-04-07 | EVIDENCE_SUMMARY: add §3.7 CNN cross-architecture audit results | YurenHao0426
2026-04-07 | EVIDENCE_SUMMARY: add §3.5 sensitivity, §3.6 cross-width, §4 separability, figures section | YurenHao0426
2026-04-07 | Add EVIDENCE_SUMMARY.md: consolidated snapshot of all protocol evidence | YurenHao0426
Single-document overview of every result the protocol package has produced so far, with reproducibility commands and the file/memory entry where each result is recorded. Organized by paper section (§1 protocol, §2 audit, §3 decision utility, §4 temporal validation, §5 pitfalls). Includes the headline tables (3-seed audit, cross-architecture, penalty sweep) ready for the paper, and an explicit status field for each ongoing experiment. This is a reading guide for anyone (codex, future-me, the user) who needs to know what evidence is ready and how to reproduce it.