<feed xmlns='http://www.w3.org/2005/Atom'>
<title>faeval.git/protocol/README.md, branch master</title>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/'/>
<entry>
<title>protocol/README.md: sync (c) range with v2.31.13 paper update</title>
<updated>2026-04-09T00:22:38+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-09T00:22:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=cebc4c4a81809a982a16dd07da41487aa2f30322'/>
<id>cebc4c4a81809a982a16dd07da41487aa2f30322</id>
<content type='text'>
Same fix as v2.31.13's paper §6 ¶3 and the protocol.py docstring sync:
the README's "0.05-0.18 / 0.43-0.99" calibration ranges were the
same loose values that v2.31.13 corrected. Updated them to match the
actual audit data: BP/EP in [-0.04, +0.12]; degenerate cases up to
+0.99, with 5/9 above the 0.30 cutoff.

Now the paper §6 ¶3, protocol.py docstring, and protocol/README.md
all agree on the (c) calibration ranges.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</content>
</entry>
<entry>
<title>Add FA diagnostic protocol reference implementation</title>
<updated>2026-04-08T03:20:48+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-08T03:20:48+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=7b64702ad970c16171142665365e16a8e1737190'/>
<id>7b64702ad970c16171142665365e16a8e1737190</id>
<content type='text'>
Codex round 15 #1 priority for the E&amp;D-track paper:
  - protocol/protocol.py: 4 diagnostics (residual norms, BP grad norms,
    cross-batch direction stability, and a frozen-baseline comparator)
  - protocol/report.py: DiagnosticReport with per-diagnostic verdicts and
    pretty-printer
  - protocol/smoke_test.py: validates that BP/DFA/EP checkpoints produce
    the expected verdicts (BP/EP trustworthy; DFA walked back via residual
    explosion + BP grad at floor)
  - protocol/README.md: usage, audit cases, threshold rationale
  - protocol/CHECKLIST.md: 6 evaluation pipeline pitfalls (norm(-1),
    cosine_similarity eps clamp, fp16 underflow, Bs reproducibility,
    aggregation, layer-0 dominance)
  - protocol/REPORTING_TEMPLATE.md: per-method fillable form for FA papers
</content>
</entry>
</feed>
