<feed xmlns='http://www.w3.org/2005/Atom'>
<title>faeval.git/protocol/PAPER_OUTLINE.md, branch master</title>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/'/>
<entry>
<title>Sync EVIDENCE_SUMMARY.md and PAPER_OUTLINE.md with v2.32 values</title>
<updated>2026-04-09T00:25:42+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-09T00:25:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=a0b8169afb7981921e6599f2bc33a35a0ab9ca53'/>
<id>a0b8169afb7981921e6599f2bc33a35a0ab9ca53</id>
<content type='text'>
These two project scratch documents had stale BP=0.609 and DFA=0.308
references from the pre-v2.31 era. Updated them to the matched 30-epoch,
3-seed values corrected in v2.31-v2.32:

  BP no-pen 30ep:  0.609 → 0.585 ± 0.001
  BP+pen 30ep:     0.530 → 0.532 ± 0.006
  DFA no-pen 30ep: 0.308 → 0.301 ± 0.005
  DFA+pen 30ep:    0.363 → 0.360 ± 0.001
  Gap math:        +5.5/-8 → +5.9/-5.3 pp; +18.1/+1.4 → +18.3/+1.1 pp
  Deep cos:        +0.155 → +0.151

Now the paper, the protocol library, the README, the helper scripts,
and the project scratch docs all agree on the v2.32 values.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
These two project scratch documents had stale BP=0.609 and DFA=0.308
references from the pre-v2.31 era. Updated them to the matched 30-epoch,
3-seed values corrected in v2.31-v2.32:

  BP no-pen 30ep:  0.609 → 0.585 ± 0.001
  BP+pen 30ep:     0.530 → 0.532 ± 0.006
  DFA no-pen 30ep: 0.308 → 0.301 ± 0.005
  DFA+pen 30ep:    0.363 → 0.360 ± 0.001
  Gap math:        +5.5/-8 → +5.9/-5.3 pp; +18.1/+1.4 → +18.3/+1.1 pp
  Deep cos:        +0.155 → +0.151

Now the paper, the protocol library, the README, the helper scripts,
and the project scratch docs all agree on the v2.32 values.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</pre>
</div>
</content>
</entry>
<entry>
<title>PAPER_OUTLINE: add 6th validation (perturbation correlation triangulation)</title>
<updated>2026-04-08T07:12:08+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-08T07:12:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=1e342e28582e46d2fff969c77b3c2b78e4007491'/>
<id>1e342e28582e46d2fff969c77b3c2b78e4007491</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>PAPER_OUTLINE: round 20 language tightening + 5 validation summary</title>
<updated>2026-04-08T07:02:37+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-08T07:02:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=78bd7ad68c174362e944c2b598beb859c2952c0b'/>
<id>78bd7ad68c174362e944c2b598beb859c2952c0b</id>
<content type='text'>
§4 updates per round 20:
  - Soften 'confirmed' to 'strongly supports'
  - Add §4.4 BP+penalty capacity-cost control with the round 20 phrasing:
    'lower bound on residual gap under matched architecture/data/optimizer/
    penalty, after accounting for the penalty's direct capacity cost in BP'
  - Add multi-seed lock-in to §4.3 (24 measurements all near zero)
  - List 5 independent validations supporting the converged framing

The §4 narrative is now complete and the framing is locked.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
§4 updates per round 20:
  - Soften 'confirmed' to 'strongly supports'
  - Add §4.4 BP+penalty capacity-cost control with the round 20 phrasing:
    'lower bound on residual gap under matched architecture/data/optimizer/
    penalty, after accounting for the penalty's direct capacity cost in BP'
  - Add multi-seed lock-in to §4.3 (24 measurements all near zero)
  - List 5 independent validations supporting the converged framing

The §4 narrative is now complete and the framing is locked.
</pre>
</div>
</content>
</entry>
<entry>
<title>PAPER_OUTLINE: §4 rewrite under 'two distinct failure modes' framing</title>
<updated>2026-04-08T06:33:00+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-08T06:33:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=2ca87f2bd4449b1d4ac715d8cf4fb5f20b7afdd8'/>
<id>2ca87f2bd4449b1d4ac715d8cf4fb5f20b7afdd8</id>
<content type='text'>
After the round 19 disambiguation experiment confirmed hypothesis B
(penalty CREATES deep alignment, not just reveals it), the paper §4
needs to use the new framing:

  Mode 1: measurement degeneracy via terminal LN gradient cancellation
  Mode 2: low intrinsic credit-direction quality of random feedback

Both modes are measured directly (mode 1 by diagnostic (b), mode 2 by
per-layer cos in the meaningful regime). The penalty partially
alleviates BOTH modes. Neither is fully fixed.

§4 rewrite includes:
  - The two modes (4.1)
  - Penalty causal validation with 3-seed cos (4.2)
  - Disambiguation: vanilla early-epoch cos table proving hypothesis B (4.3)
  - Why the residual gap is partial alignment (4.4)
  - Why this framing is cleaner for the paper than prior ones (4.5)

Walk-back chain extended to 7 entries, with entries 6 and 7 made on the
same day and converging on the final two-distinct-modes framing.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
After the round 19 disambiguation experiment confirmed hypothesis B
(penalty CREATES deep alignment, not just reveals it), the paper §4
needs to use the new framing:

  Mode 1: measurement degeneracy via terminal LN gradient cancellation
  Mode 2: low intrinsic credit-direction quality of random feedback

Both modes are measured directly (mode 1 by diagnostic (b), mode 2 by
per-layer cos in the meaningful regime). The penalty partially
alleviates BOTH modes. Neither is fully fixed.

§4 rewrite includes:
  - The two modes (4.1)
  - Penalty causal validation with 3-seed cos (4.2)
  - Disambiguation: vanilla early-epoch cos table proving hypothesis B (4.3)
  - Why the residual gap is partial alignment (4.4)
  - Why this framing is cleaner for the paper than prior ones (4.5)

Walk-back chain extended to 7 entries, with entries 6 and 7 made on the
same day and converging on the final two-distinct-modes framing.
</pre>
</div>
</content>
</entry>
<entry>
<title>Add PAPER_OUTLINE.md: §1-§6 draft reflecting round 17 + 18</title>
<updated>2026-04-08T05:12:27+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-08T05:12:27+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=5b7f83ae5240c78013c084cf2e24ce5a5f572c42'/>
<id>5b7f83ae5240c78013c084cf2e24ce5a5f572c42</id>
<content type='text'>
Comprehensive paper draft outline for the NeurIPS 2026 E&amp;D submission:

§1 Discovery-first hook (round 16 narrative arc): broken eval -&gt; evidence
   -&gt; metrics miss -&gt; need protocol -&gt; validation
§2 Audit findings: 5-method × 3-seed audit, walk-back details, EP internal
   control
§3 The diagnostic protocol: 4 diagnostics, decision-utility ablation,
   threshold sensitivity (with (d) fragility flagged), temporal validation,
   cross-architecture validation, sub-mode discrimination
§4 Two failure modes: mechanism story + causal penalty rescue, with the
   round 18 softening (partial dissociation rather than full separability)
§5 Pipeline pitfalls catalog: 7 bugs (incl. new #6.5 self-cosine fallback)
§6 Reference implementation
+ Limitations / walk-backs section listing all 5 walked-back claims explicitly

This is a working draft to make the next writing step concrete. Reflects
all evidence collected through the round 18 follow-up.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Comprehensive paper draft outline for the NeurIPS 2026 E&amp;D submission:

§1 Discovery-first hook (round 16 narrative arc): broken eval -&gt; evidence
   -&gt; metrics miss -&gt; need protocol -&gt; validation
§2 Audit findings: 5-method × 3-seed audit, walk-back details, EP internal
   control
§3 The diagnostic protocol: 4 diagnostics, decision-utility ablation,
   threshold sensitivity (with (d) fragility flagged), temporal validation,
   cross-architecture validation, sub-mode discrimination
§4 Two failure modes: mechanism story + causal penalty rescue, with the
   round 18 softening (partial dissociation rather than full separability)
§5 Pipeline pitfalls catalog: 7 bugs (incl. new #6.5 self-cosine fallback)
§6 Reference implementation
+ Limitations / walk-backs section listing all 5 walked-back claims explicitly

This is a working draft to make the next writing step concrete. Reflects
all evidence collected through the round 18 follow-up.
</pre>
</div>
</content>
</entry>
</feed>
