<feed xmlns='http://www.w3.org/2005/Atom'>
<title>faeval.git/experiments/perturbation_correlation_audit.py, branch master</title>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/'/>
<entry>
<title>Extend perturbation audit to vanilla early-epoch checkpoints</title>
<updated>2026-04-08T07:11:00+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-08T07:11:00+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=d3df5628b570af8fe2e22644b9c0849f69b9f3a1'/>
<id>d3df5628b570af8fe2e22644b9c0849f69b9f3a1</id>
<content type='text'>
Cross-metric confirmation of the disambiguation. Deep rho for vanilla DFA
at ep 1 (meaningful regime, ||g||~6e-7) across 3 seeds:

  s42:  deep rho -0.008
  s123: deep rho +0.000
  s456: deep rho -0.000
  mean: -0.003 ± 0.005
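
Illustrative only, not part of the audit script: a minimal sketch of how
the mean ± spread line above can be recomputed from the rounded per-seed
values. It assumes the spread is the sample standard deviation (n-1); the
message does not say which estimator was used, and the dictionary name is
hypothetical.

```python
import statistics

# Per-seed deep rho for vanilla DFA at ep 1, as rounded in this message.
vanilla_deep_rho = {"s42": -0.008, "s123": 0.000, "s456": -0.000}

values = list(vanilla_deep_rho.values())
mean = statistics.mean(values)
# ASSUMPTION: sample standard deviation (divides by n-1).
spread = statistics.stdev(values)

summary = f"{mean:+.3f} ± {spread:.3f}"
print(summary)
```

Because the inputs are already rounded to three decimals, other summary
lines recomputed this way may differ from the reported ones in the last
digit.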

Compare to penalized DFA 3-seed: deep rho +0.080 ± 0.011.

The disambiguation (penalty CREATES alignment, not just reveals it) is
now confirmed by TWO independent metrics:
  - cos: vanilla -0.008 ± 0.013, penalized +0.155 ± 0.025
  - rho: vanilla -0.003 ± 0.005, penalized +0.080 ± 0.011

Both metrics agree on the vanilla→penalized transition. The l0 (embedding)
rho is high (~0.25-0.29) at every vanilla checkpoint, mirroring the cos
l0 of +0.42: the embedding layer is genuinely useful while the deep blocks
are not, by BOTH metrics. The penalty raises deep usefulness from ~0 to
~+0.08 rho / +0.16 cos.

Cross-metric agreement rules out single-metric artifacts on either side.
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Cross-metric confirmation of the disambiguation. Deep rho for vanilla DFA
at ep 1 (meaningful regime, ||g||~6e-7) across 3 seeds:

  s42:  deep rho -0.008
  s123: deep rho +0.000
  s456: deep rho -0.000
  mean: -0.003 ± 0.005

Compare to penalized DFA 3-seed: deep rho +0.080 ± 0.011.

The disambiguation (penalty CREATES alignment, not just reveals it) is
now confirmed by TWO independent metrics:
  - cos: vanilla -0.008 ± 0.013, penalized +0.155 ± 0.025
  - rho: vanilla -0.003 ± 0.005, penalized +0.080 ± 0.011

Both metrics agree on the vanilla→penalized transition. The l0 (embedding)
rho is high (~0.25-0.29) at every vanilla checkpoint, mirroring the cos
l0 of +0.42: the embedding layer is genuinely useful while the deep blocks
are not, by BOTH metrics. The penalty raises deep usefulness from ~0 to
~+0.08 rho / +0.16 cos.

Cross-metric agreement rules out single-metric artifacts on either side.
</pre>
</div>
</content>
</entry>
<entry>
<title>Add perturbation correlation audit (round 19's recommended alt metric)</title>
<updated>2026-04-08T07:07:26+00:00</updated>
<author>
<name>YurenHao0426</name>
<email>Blackhao0426@gmail.com</email>
</author>
<published>2026-04-08T07:07:26+00:00</published>
<link rel='alternate' type='text/html' href='https://git.blackhao.com/faeval.git/commit/?id=a868b29e4c399a3a948e85737e7a632001481969'/>
<id>a868b29e4c399a3a948e85737e7a632001481969</id>
<content type='text'>
Codex round 19 said: 'use nudging or perturbation correlation on the
penalized checkpoints. In the healthy-gradient regime, that is a more
direct is-the-local-signal-useful test than cosine alone'.

Result on existing checkpoints (eps=1e-3, M=32 random directions, n=1024):

  vanilla DFA s42:                    deep rho +0.002
  penalized DFA s42 lam=1e-2 30ep:    deep rho +0.094
  penalized DFA s123 lam=1e-2 30ep:   deep rho +0.073
  penalized DFA s456 lam=1e-2 30ep:   deep rho +0.072
  penalized 3-seed mean:              deep rho +0.080 ± 0.011

This INDEPENDENTLY TRIANGULATES the cos +0.17 finding via a different
metric:
  - vanilla deep cos ~0   matches  vanilla deep rho ~0
  - penalized deep cos +0.155  agrees in sign with  penalized deep rho +0.080

The two metrics measure different things:
  - cos = directional alignment with BP grad
  - rho = correlation between predicted and true loss change under
    random perturbation

Both show the same pattern: penalty creates partial usefulness from
essentially zero. This is the 6th independent validation of the mode 2
'penalty creates partial alignment' framing.

Crucially, rho doesn't use F.cosine_similarity (no eps clamp), and it
measures sample-level loss change correlation rather than direction
match — so it rules out 'cos is capturing some directional artifact
unrelated to local usefulness'.
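
Illustrative only, not the code in perturbation_correlation_audit.py: a
minimal sketch of the rho metric as described above, i.e. the Pearson
correlation between the first-order predicted loss change and the measured
loss change over random perturbation directions. The toy quadratic loss,
the function names, and the Gaussian directions are assumptions made for
the example; eps and the number of directions follow the values quoted in
this message.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def perturbation_rho(loss_fn, point, signal, eps=1e-3, num_dirs=32, seed=0):
    """Correlate predicted and measured loss change over random directions."""
    rng = random.Random(seed)
    base = loss_fn(point)
    predicted, measured = [], []
    for _ in range(num_dirs):
        d = [rng.gauss(0.0, 1.0) for _ in point]
        # First-order prediction from the candidate learning signal.
        predicted.append(eps * sum(s * di for s, di in zip(signal, d)))
        # Loss change actually observed after the perturbation.
        moved = [p + eps * di for p, di in zip(point, d)]
        measured.append(loss_fn(moved) - base)
    return pearson(predicted, measured)


# ASSUMPTION: a toy quadratic loss stands in for the network.
scales = [1.0, 2.0, 0.5, 3.0, 1.5, 0.7, 2.5, 1.2]


def toy_loss(x):
    return 0.5 * sum(a * xi * xi for a, xi in zip(scales, x))


point = [0.8, -1.1, 0.4, 0.9, -0.6, 1.3, -0.7, 0.5]
true_grad = [a * xi for a, xi in zip(scales, point)]

rho_good = perturbation_rho(toy_loss, point, true_grad)
rho_flipped = perturbation_rho(toy_loss, point, [-g for g in true_grad])
print(rho_good, rho_flipped)
```

With the exact gradient as the signal, rho comes out close to +1; with the
sign flipped it comes out close to -1. A signal that carries no information
about the loss would be expected to give rho near 0, which is the pattern
reported for the vanilla deep blocks.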
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Codex round 19 said: 'use nudging or perturbation correlation on the
penalized checkpoints. In the healthy-gradient regime, that is a more
direct is-the-local-signal-useful test than cosine alone'.

Result on existing checkpoints (eps=1e-3, M=32 random directions, n=1024):

  vanilla DFA s42:                    deep rho +0.002
  penalized DFA s42 lam=1e-2 30ep:    deep rho +0.094
  penalized DFA s123 lam=1e-2 30ep:   deep rho +0.073
  penalized DFA s456 lam=1e-2 30ep:   deep rho +0.072
  penalized 3-seed mean:              deep rho +0.080 ± 0.011

This INDEPENDENTLY TRIANGULATES the cos +0.17 finding via a different
metric:
  - vanilla deep cos ~0   matches  vanilla deep rho ~0
  - penalized deep cos +0.155  agrees in sign with  penalized deep rho +0.080

The two metrics measure different things:
  - cos = directional alignment with BP grad
  - rho = correlation between predicted and true loss change under
    random perturbation

Both show the same pattern: penalty creates partial usefulness from
essentially zero. This is the 6th independent validation of the mode 2
'penalty creates partial alignment' framing.

Crucially, rho doesn't use F.cosine_similarity (no eps clamp), and it
measures sample-level loss change correlation rather than direction
match — so it rules out 'cos is capturing some directional artifact
unrelated to local usefulness'.
</pre>
</div>
</content>
</entry>
</feed>
