<feed xmlns='http://www.w3.org/2005/Atom'>
<title>faeval.git/protocol/examples/audit_cnn.py, branch master</title>
<subtitle>Unnamed repository; edit this file 'description' to name the repository.
</subtitle>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/'/>
<entry>
<title>Add CNN third-architecture audit: BN, no terminal LN</title>
<updated>2026-04-08T04:42:06+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-08T04:42:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=4cd716757b50a1f4217a3ffdf8ee624c270b7a23'/>
<id>4cd716757b50a1f4217a3ffdf8ee624c270b7a23</id>
<content type='text'>
5 methods × 3 seeds on the SmallCNN (3 conv + BN + 1 FC + head, no
terminal LN) using existing checkpoints in results/cnn_baseline/.

Key findings:

  BP CNN:           0.866 acc, max/block 1.3, trustworthy
  State Bridge CNN: 0.633 acc, max/block 2.4, trustworthy
  EP CNN:           0.512 acc, max/block 12, trustworthy
  DFA CNN:          0.566 acc, max/block 237, walked back via (a)
  Credit Bridge CNN: 0.325 acc, max/block 96, walked back via (a)

CRITICAL: diagnostic (b) ||g_L|| floor NEVER fires on CNN for any method.
The deepest BP grad is at ~1e-5 to 6e-1, all well above the 1e-7 floor.

This is the cleanest confirmation that terminal LayerNorm is the
structural cause of the catastrophic gradient collapse in (b). Without
out_ln, the BP grad does NOT collapse to the floor, even on DFA. The
scale pathology (a) still appears on DFA and CB, but the gradient
collapse pathology (b) is specific to terminal-LN architectures.

DFA CNN's accuracy (56.6%) is much higher than DFA ResMLP (30.8%) or
DFA ViT (23.7%) — partially because the scale pathology is less
catastrophic without the LN-driven gradient cancellation amplifying
it. This is the cross-architecture mechanism story made concrete.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
5 methods × 3 seeds on the SmallCNN (3 conv + BN + 1 FC + head, no
terminal LN) using existing checkpoints in results/cnn_baseline/.

Key findings:

  BP CNN:           0.866 acc, max/block 1.3, trustworthy
  State Bridge CNN: 0.633 acc, max/block 2.4, trustworthy
  EP CNN:           0.512 acc, max/block 12, trustworthy
  DFA CNN:          0.566 acc, max/block 237, walked back via (a)
  Credit Bridge CNN: 0.325 acc, max/block 96, walked back via (a)

CRITICAL: diagnostic (b) ||g_L|| floor NEVER fires on CNN for any method.
The deepest BP grad is at ~1e-5 to 6e-1, all well above the 1e-7 floor.

This is the cleanest confirmation that terminal LayerNorm is the
structural cause of the catastrophic gradient collapse in (b). Without
out_ln, the BP grad does NOT collapse to the floor, even on DFA. The
scale pathology (a) still appears on DFA and CB, but the gradient
collapse pathology (b) is specific to terminal-LN architectures.

DFA CNN's accuracy (56.6%) is much higher than DFA ResMLP (30.8%) or
DFA ViT (23.7%) — partially because the scale pathology is less
catastrophic without the LN-driven gradient cancellation amplifying
it. This is the cross-architecture mechanism story made concrete.
</pre>
</div>
</content>
</entry>
</feed>
