<feed xmlns='http://www.w3.org/2005/Atom'>
<title>faeval.git/results/dfa_no_penalty_30ep/results_cifar10.json, branch master</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/'/>
<entry>
<title>paper v2.31: matched 30-epoch BP/DFA controls (was unsourced 0.609/0.308)</title>
<updated>2026-04-08T23:03:16+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-08T23:03:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=752dfb833b06a6fb974df892de560caf328ed1dd'/>
<id>752dfb833b06a6fb974df892de560caf328ed1dd</id>
<content type='text'>
The §5 ¶3 BP-no-penalty value of 0.609 ± 0.004 and DFA-no-penalty value
of 0.308 ± 0.014 turned out to be unsourced: they were carried over
from a hardcoded comment in experiments/bp_with_penalty_control.py
("BP-trainable (3-seed mean): 0.609") and had never been measured
with a matched 30-epoch run.

Ran the missing matched controls under the same recipe as BP+pen
(lam=0, 30 epochs, AdamW 1e-3, wd 0.01, cosine schedule, batch 128,
3 seeds 42/123/456):

  BP no-pen 30ep: per-seed 0.5851, 0.5845, 0.5863  →  0.585 ± 0.001
                  (paper said 0.609 ± 0.004, off by 0.024)
  DFA no-pen 30ep: per-seed 0.3070, 0.2985, 0.2966 →  0.301 ± 0.005
                  (paper said 0.308 ± 0.014)

Also re-grounded DFA+penalty 30ep using the dfa_pen_short 3-seed run
(0.3593, 0.3610, 0.3604 → 0.360 ± 0.001), which is what the deep-cosine
+0.155 figure was computed on. The paper had 0.363 ± 0.001 — that came
from the 100-epoch run, not the 30-epoch run, so it was an apples-to-
oranges comparison with BP+pen 30-ep.
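
The rounded summaries above can be re-derived from the per-seed values.
A minimal sketch follows; the helper names summarize and pp_delta are
hypothetical, and the use of population std is an assumption (the
message does not say which convention was used, but population std
appears to reproduce the reported 0.005 for DFA no-pen, where sample
std would round to 0.006):

```python
from statistics import mean, pstdev

# Per-seed accuracies copied from this commit message (seeds 42/123/456).
RUNS = {
    "BP no-pen 30ep": [0.5851, 0.5845, 0.5863],
    "DFA no-pen 30ep": [0.3070, 0.2985, 0.2966],
    "DFA+pen 30ep": [0.3593, 0.3610, 0.3604],
}


def summarize(values):
    """Return (mean, population std), each rounded to 3 decimals."""
    # Assumption: population std (ddof=0); the message does not say which.
    return round(mean(values), 3), round(pstdev(values), 3)


def pp_delta(new, old):
    """Difference between two accuracies, in percentage points."""
    return round((new - old) * 100, 1)


for name, values in RUNS.items():
    m, s = summarize(values)
    print(f"{name}: {m:.3f} +/- {s:.3f}")

# DFA penalty rescue: DFA+pen minus DFA no-pen, both at 30 epochs.
print("DFA pen rescue (pp):", pp_delta(0.360, 0.301))
```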

Paper changes (§5 ¶3):
  BP penalty cost:  -8 pp  →  -5.5 pp
  DFA pen rescue:   +5.5 → +5.9 pp
  DFA+pen margin vs frozen: +1.4 → +1.1 pp
  BP-to-DFA gap:     17 → 17.0 pp (unchanged)
  BP-to-SB gap:      7.7 → 7.7 pp (unchanged)
  BP-to-DFA gap is still the lower-bound credit-quality cost claim;
  17 pp gap is unchanged in magnitude.

Also updated:
- §5 ¶1 prose: 0.363 → 0.360, 0.308 → 0.301
- §4 ¶4 prose: DFA+pen 0.363 → 0.360
- Appendix J Table 9 caption: 0.363 → 0.360, +9.0 → +9.3 pp gap to SB
- Appendix L paragraph: +5.5 → +5.9 pp DFA penalty rescue
- Figure 3 panel C bar values + title pen-cost annotation
- New results/matched_30ep_control_summary.json as auditable record

Page layout preserved: 9 main pages + refs p10, 18 total, 0 overfull.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
The §5 ¶3 BP-no-penalty value of 0.609 ± 0.004 and DFA-no-penalty value
of 0.308 ± 0.014 turned out to be unsourced: they were carried over
from a hardcoded comment in experiments/bp_with_penalty_control.py
("BP-trainable (3-seed mean): 0.609") and had never been measured
with a matched 30-epoch run.

Ran the missing matched controls under the same recipe as BP+pen
(lam=0, 30 epochs, AdamW 1e-3, wd 0.01, cosine schedule, batch 128,
3 seeds 42/123/456):

  BP no-pen 30ep: per-seed 0.5851, 0.5845, 0.5863  →  0.585 ± 0.001
                  (paper said 0.609 ± 0.004, off by 0.024)
  DFA no-pen 30ep: per-seed 0.3070, 0.2985, 0.2966 →  0.301 ± 0.005
                  (paper said 0.308 ± 0.014)

Also re-grounded DFA+penalty 30ep using the dfa_pen_short 3-seed run
(0.3593, 0.3610, 0.3604 → 0.360 ± 0.001), which is what the deep-cosine
+0.155 figure was computed on. The paper had 0.363 ± 0.001 — that came
from the 100-epoch run, not the 30-epoch run, so it was an apples-to-
oranges comparison with BP+pen 30-ep.

Paper changes (§5 ¶3):
  BP penalty cost:  -8 pp  →  -5.5 pp
  DFA pen rescue:   +5.5 → +5.9 pp
  DFA+pen margin vs frozen: +1.4 → +1.1 pp
  BP-to-DFA gap:     17 → 17.0 pp (unchanged)
  BP-to-SB gap:      7.7 → 7.7 pp (unchanged)
  BP-to-DFA gap is still the lower-bound credit-quality cost claim;
  17 pp gap is unchanged in magnitude.

Also updated:
- §5 ¶1 prose: 0.363 → 0.360, 0.308 → 0.301
- §4 ¶4 prose: DFA+pen 0.363 → 0.360
- Appendix J Table 9 caption: 0.363 → 0.360, +9.0 → +9.3 pp gap to SB
- Appendix L paragraph: +5.5 → +5.9 pp DFA penalty rescue
- Figure 3 panel C bar values + title pen-cost annotation
- New results/matched_30ep_control_summary.json as auditable record

Page layout preserved: 9 main pages + refs p10, 18 total, 0 overfull.

Co-Authored-By: Claude Opus 4.6 (1M context) &lt;noreply@anthropic.com&gt;
</pre>
</div>
</content>
</entry>
</feed>
