faeval.git/protocol/EVIDENCE_SUMMARY.md, branch master

Update NOTE.md + EVIDENCE_SUMMARY.md with FA results (2026-04-23)

2026-04-23T16:18:59+00:00

NOTE.md: added comprehensive current-status section at the top with
the full 6-method audit table (BP/FA/EP/DFA/CB/SB), FA vs DFA key
comparison, depth sweep, penalty rescue comparison, cross-method
functional triangulation, and open items. Old Phase 10A content kept
below as historical reference.

EVIDENCE_SUMMARY.md: added "Vanilla FA vs DFA" section with the
paper-changing finding (FA 0.401 ± 0.009 vs DFA 0.306 ± 0.008,
FA has genuine deep cos +0.33, no Mode 1(b) collapse) and the
d=512 depth sweep table.

Co-Authored-By: Claude Opus 4.6 (1M context)

Sync EVIDENCE_SUMMARY.md and PAPER_OUTLINE.md with v2.32 values

2026-04-09T00:25:42+00:00

These two project scratch documents still carried the stale BP=0.609 and
DFA=0.308 values from the pre-v2.31 era. Updated them to the matched
30-epoch, 3-seed values corrected in v2.31-v2.32:

  BP no-pen 30ep:  0.609 → 0.585 ± 0.001
  BP+pen 30ep:     0.530 → 0.532 ± 0.006
  DFA no-pen 30ep: 0.308 → 0.301 ± 0.005
  DFA+pen 30ep:    0.363 → 0.360 ± 0.001
  Gap math:        +5.5/-8 → +5.9/-5.3 pp; +18.1/+1.4 → +18.3/+1.1 pp
  Deep cos:        +0.155 → +0.151

Now the paper, the protocol library, the README, the helper scripts,
and the project scratch docs all agree on the v2.32 values.
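
The first pair of gap figures (+5.9/-5.3 pp) can be checked from the four
means listed above. This is a minimal sketch, assuming that pair denotes the
penalty effect on DFA and on BP respectively; that reading is inferred from
the numbers and is not stated in the commit. The names `acc` and `pp` are
illustrative only, and the second pair (+18.3/+1.1 pp) is not derived here.

```python
# Mean accuracies copied from the list above (30 epochs, 3 seeds).
acc = {
    ("BP", "no-pen"): 0.585,
    ("BP", "pen"): 0.532,
    ("DFA", "no-pen"): 0.301,
    ("DFA", "pen"): 0.360,
}


def pp(a: float, b: float) -> float:
    """Return a - b in percentage points, rounded to one decimal place."""
    return round((a - b) * 100, 1)


# Assumed meaning of the first gap pair: effect of adding the penalty.
dfa_penalty_effect = pp(acc[("DFA", "pen")], acc[("DFA", "no-pen")])
bp_penalty_effect = pp(acc[("BP", "pen")], acc[("BP", "no-pen")])

print(f"DFA penalty effect: {dfa_penalty_effect:+.1f} pp")
print(f"BP penalty effect:  {bp_penalty_effect:+.1f} pp")
```

Under that assumption the script reproduces the +5.9 and -5.3 pp values, so
the summary line agrees with the per-method means.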

Co-Authored-By: Claude Opus 4.6 (1M context)

EVIDENCE_SUMMARY: add 6th validation (perturbation correlation triangulation)

2026-04-08T07:09:15+00:00

EVIDENCE_SUMMARY: §4 fully rewritten under locked two-distinct-modes framing

2026-04-08T07:04:28+00:00

§4 now reflects all 5 independent validations of the converged framing:
  1. Direct deep cos on penalized DFA (3 seeds): +0.155 ± 0.025
  2. Null calibration with fresh Bs: +0.002 ± 0.022 (null ≈ 0, so signal is real)
  3. Hypothesis B disambiguation (vanilla early ep): -0.008 ± 0.013
  4. BP+penalty 2×2 control: 17 pp residual = credit quality
  5. Multi-seed lock-in: 24 measurements all near zero

Round 20 language tightening applied:
  - 'lower bound on non-capacity gap' instead of 'clean isolation'
  - Explicit caveats about end-to-end vs local-loss difference
  - Counter to 'different optimization regime' objection

The §4 framing is locked. Five independent validations done. Stop
iterating, start writing.

EVIDENCE_SUMMARY: add (d) threshold sensitivity finding (round 18)

2026-04-08T05:00:34+00:00

EVIDENCE_SUMMARY: round 18 language softening on CNN + penalty audit

2026-04-08T04:56:33+00:00

EVIDENCE_SUMMARY: add §3.7 CNN cross-architecture audit results

2026-04-08T04:44:58+00:00

EVIDENCE_SUMMARY: add §3.5 sensitivity, §3.6 cross-width, §4 separability, figures section

2026-04-08T04:30:03+00:00

Add EVIDENCE_SUMMARY.md: consolidated snapshot of all protocol evidence

2026-04-08T04:15:47+00:00

Single-document overview of every result the protocol package has
produced so far, with reproducibility commands and the file/memory entry
where each result is recorded. Organized by paper section (§1 protocol,
§2 audit, §3 decision utility, §4 temporal validation, §5 pitfalls).

Includes the headline tables (3-seed audit, cross-architecture, penalty
sweep) ready for the paper, and an explicit status field for each
ongoing experiment.

This is a reading guide for anyone (codex, future-me, the user) who
needs to know what evidence is ready and how to reproduce it.