| Age | Commit message | Author |
|
NOTE.md: added comprehensive current-status section at the top with
the full 6-method audit table (BP/FA/EP/DFA/CB/SB), FA vs DFA key
comparison, depth sweep, penalty rescue comparison, cross-method
functional triangulation, and open items. Old Phase 10A content kept
below as historical reference.
EVIDENCE_SUMMARY.md: added "Vanilla FA vs DFA" section with the
paper-changing finding (FA 0.401 ± 0.009 vs DFA 0.306 ± 0.008,
FA has genuine deep cos +0.33, no Mode 1(b) collapse) and the
d=512 depth sweep table.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
These two project scratch documents had stale BP=0.609 and DFA=0.308
references from the pre-v2.31 era. Updated to the matched 30-ep 3-seed
values that v2.31-v2.32 corrected:
BP no-pen 30ep: 0.609 → 0.585 ± 0.001
BP+pen 30ep: 0.530 → 0.532 ± 0.006
DFA no-pen 30ep: 0.308 → 0.301 ± 0.005
DFA+pen 30ep: 0.363 → 0.360 ± 0.001
Gap math: +5.5/-8 → +5.9/-5.3 pp; +18.1/+1.4 → +18.3/+1.1 pp
Deep cos: +0.155 → +0.151
Now the paper, the protocol library, the README, the helper scripts,
and the project scratch docs all agree on the v2.32 values.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
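The first pair in the "Gap math" line can be re-derived from the four corrected accuracies listed above. The sketch below assumes that "+5.9/-5.3 pp" means the change in accuracy from adding the penalty, for DFA and BP respectively; that reading is an inference from the numbers, not something the commit message states. The names `acc` and `penalty_effect_pp` are hypothetical. The second pair (+18.3/+1.1 pp) is not reconstructed because the message does not say which quantities it compares.

```python
# Accuracies copied from the commit message (30 epochs, 3-seed means).
acc = {
    ("BP", "no-pen"): 0.585,
    ("BP", "pen"): 0.532,
    ("DFA", "no-pen"): 0.301,
    ("DFA", "pen"): 0.360,
}


def penalty_effect_pp(method):
    """Accuracy change from adding the penalty, in percentage points.

    Assumption: this is what the 'Gap math' line reports per method.
    """
    delta = acc[(method, "pen")] - acc[(method, "no-pen")]
    return round(delta * 100, 1)


dfa_effect = penalty_effect_pp("DFA")  # penalty helps DFA
bp_effect = penalty_effect_pp("BP")    # penalty hurts BP

print(f"DFA: {dfa_effect:+.1f} pp, BP: {bp_effect:+.1f} pp")
# prints "DFA: +5.9 pp, BP: -5.3 pp"
```

Under this reading, the same calculation on the stale values (0.363 - 0.308 and 0.530 - 0.609) gives roughly +5.5 and -7.9 pp, consistent with the old "+5.5/-8" entry.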
|
§4 now reflects all 5 independent validations of the converged framing:
1. Direct deep cos on penalized DFA (3 seeds): +0.155 ± 0.025
2. Null calibration with fresh Bs: +0.002 ± 0.022 (real signal)
3. Hypothesis B disambiguation (vanilla early ep): -0.008 ± 0.013
4. BP+penalty 2×2 control: 17 pp residual = credit quality
5. Multi-seed lock-in: 24 measurements all near zero
Round 20 language tightening applied:
- 'lower bound on non-capacity gap' instead of 'clean isolation'
- Explicit caveats about end-to-end vs local-loss difference
- Counter to 'different optimization regime' objection
The §4 framing is locked. Five independent validations done. Stop
iterating, start writing.
|
separability, figures section
|
Single-document overview of every result the protocol package has
produced so far, with reproducibility commands and the file/memory entry
where each result is recorded. Organized by paper section (§1 protocol,
§2 audit, §3 decision utility, §4 temporal validation, §5 pitfalls).
Includes the headline tables (3-seed audit, cross-architecture, penalty
sweep) ready for the paper, and an explicit status field for each
ongoing experiment.
This is a reading guide for anyone (codex, future-me, the user) who
needs to know what evidence is ready and how to reproduce it.
|