<feed xmlns='http://www.w3.org/2005/Atom'>
<title>faeval.git/results/protocol_audit/audit_table_s42_s123_s456.json, branch master</title>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/'/>
<entry>
<title>Audit table extension to 3 seeds (s42/s123/s456)</title>
<updated>2026-04-08T03:45:41+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-08T03:45:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=3a520b203f4f0c75b37b2d5c34d461718729ea02'/>
<id>3a520b203f4f0c75b37b2d5c34d461718729ea02</id>
<content type='text'>
3 seeds × 5 methods × 4 diagnostics = 60 measurements. Key reproducibility
findings:

  - BP: trustworthy on all 3 seeds (acc 0.61-0.62, h_L ~200, g_L ~3-4e-4)
  - EP: trustworthy on all 3 seeds (acc 0.29-0.36, h_L 3-8e3, g_L ~1e-4)
  - DFA, SB, CB: walked back on all 3 seeds × all 3 of (a)/(b)/(d)

Diagnostic (c) is bimodal across seeds — confirms the prior memory finding:
  - DFA s42=0.047 (noise), s123=0.436 (drift), s456=-0.005 (noise)
  - SB  s42=0.992 (drift), s123=0.561 (drift), s456=0.035 (noise)
  - CB  s42=0.352 (drift), s123=0.250 (~edge), s456=0.518 (drift)

(c) catches different methods on different seeds. (a)/(b)/(d) catch all 3
failing methods on all 3 seeds — robust binary detection.
</content>
</entry>
</feed>
