<feed xmlns='http://www.w3.org/2005/Atom'>
<title>faeval.git/protocol/EVIDENCE_SUMMARY.md, branch master</title>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/'/>
<entry>
<title>Update NOTE.md + EVIDENCE_SUMMARY.md with FA results (2026-04-23)</title>
<updated>2026-04-23T16:18:59+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-23T16:18:59+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=5937af903fdcb473cb3dd39cd3d0a86c1dbe0a05'/>
<id>5937af903fdcb473cb3dd39cd3d0a86c1dbe0a05</id>
<content type='text'>
NOTE.md: added comprehensive current-status section at the top with
the full 6-method audit table (BP/FA/EP/DFA/CB/SB), FA vs DFA key
comparison, depth sweep, penalty rescue comparison, cross-method
functional triangulation, and open items. Old Phase 10A content kept
below as historical reference.

EVIDENCE_SUMMARY.md: added "Vanilla FA vs DFA" section with the
paper-changing finding (FA 0.401 ± 0.009 vs DFA 0.306 ± 0.008,
FA has genuine deep cos +0.33, no Mode 1(b) collapse) and the
d=512 depth sweep table.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</content>
</entry>
<entry>
<title>Sync EVIDENCE_SUMMARY.md and PAPER_OUTLINE.md with v2.32 values</title>
<updated>2026-04-09T00:25:42+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-09T00:25:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=a0b8169afb7981921e6599f2bc33a35a0ab9ca53'/>
<id>a0b8169afb7981921e6599f2bc33a35a0ab9ca53</id>
<content type='text'>
These two project scratch documents had stale BP=0.609 and DFA=0.308
references from the pre-v2.31 era. Updated to the matched 30-ep 3-seed
values that v2.31-v2.32 corrected:

  BP no-pen 30ep:  0.609 → 0.585 ± 0.001
  BP+pen 30ep:     0.530 → 0.532 ± 0.006
  DFA no-pen 30ep: 0.308 → 0.301 ± 0.005
  DFA+pen 30ep:    0.363 → 0.360 ± 0.001
  Gap math:        +5.5/-8 → +5.9/-5.3 pp; +18.1/+1.4 → +18.3/+1.1 pp
  Deep cos:        +0.155 → +0.151

Now the paper, the protocol library, the README, the helper scripts,
and the project scratch docs all agree on the v2.32 values.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</content>
</entry>
<entry>
<title>EVIDENCE_SUMMARY: add 6th validation (perturbation correlation triangulation)</title>
<updated>2026-04-08T07:09:15+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-08T07:09:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=4bee0a6d80f2937473837897e80dfd4d697b644b'/>
<id>4bee0a6d80f2937473837897e80dfd4d697b644b</id>
<content type='text'>
</content>
</entry>
<entry>
<title>EVIDENCE_SUMMARY: §4 fully rewritten under locked two-distinct-modes framing</title>
<updated>2026-04-08T07:04:28+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-08T07:04:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=8bf53ab94ac31c7672d23e2edf0e40c787b157d4'/>
<id>8bf53ab94ac31c7672d23e2edf0e40c787b157d4</id>
<content type='text'>
§4 now reflects all 5 independent validations of the converged framing:
  1. Direct deep cos on penalized DFA (3 seeds): +0.155 ± 0.025
  2. Null calibration with fresh Bs: +0.002 ± 0.022 (real signal)
  3. Hypothesis B disambiguation (vanilla early ep): -0.008 ± 0.013
  4. BP+penalty 2×2 control: 17 pp residual = credit quality
  5. Multi-seed lock-in: 24 measurements all near zero

Round 20 language tightening applied:
  - 'lower bound on non-capacity gap' instead of 'clean isolation'
  - Explicit caveats about end-to-end vs local-loss difference
  - Counter to 'different optimization regime' objection

The §4 framing is locked. Five independent validations done. Stop
iterating, start writing.
</content>
</entry>
<entry>
<title>EVIDENCE_SUMMARY: add (d) threshold sensitivity finding (round 18)</title>
<updated>2026-04-08T05:00:34+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-08T05:00:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=9d0e4901a82763ea3ebc57eea152a730330d4991'/>
<id>9d0e4901a82763ea3ebc57eea152a730330d4991</id>
<content type='text'>
</content>
</entry>
<entry>
<title>EVIDENCE_SUMMARY: round 18 language softening on CNN + penalty audit</title>
<updated>2026-04-08T04:56:33+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-08T04:56:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=cbe851cf382a2af13037304afdd783214bad5c6b'/>
<id>cbe851cf382a2af13037304afdd783214bad5c6b</id>
<content type='text'>
</content>
</entry>
<entry>
<title>EVIDENCE_SUMMARY: add §3.7 CNN cross-architecture audit results</title>
<updated>2026-04-08T04:44:58+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-08T04:44:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=05cd478cb45f78ccf89ab42918df9010cd534ede'/>
<id>05cd478cb45f78ccf89ab42918df9010cd534ede</id>
<content type='text'>
</content>
</entry>
<entry>
<title>EVIDENCE_SUMMARY: add §3.5 sensitivity, §3.6 cross-width, §4 separability, figures section</title>
<updated>2026-04-08T04:30:03+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-08T04:30:03+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=cb0e6b3f3e9c3d0cb8335be1621478cf4c786375'/>
<id>cb0e6b3f3e9c3d0cb8335be1621478cf4c786375</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Add EVIDENCE_SUMMARY.md: consolidated snapshot of all protocol evidence</title>
<updated>2026-04-08T04:15:47+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-08T04:15:47+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=5dadf7b78cbd3332b48a3ec0c385e3aeaea253a6'/>
<id>5dadf7b78cbd3332b48a3ec0c385e3aeaea253a6</id>
<content type='text'>
Single-document overview of every result the protocol package has
produced so far, with reproducibility commands and the file/memory entry
where each result is recorded. Organized by paper section (§1 protocol,
§2 audit, §3 decision utility, §4 temporal validation, §5 pitfalls).

Includes the headline tables (3-seed audit, cross-architecture, penalty
sweep) ready for the paper, and an explicit status field for each
ongoing experiment.

This is a reading guide for anyone (codex, future-me, the user) who
needs to know what evidence is ready and how to reproduce it.
</content>
</entry>
</feed>
