| author | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-08 02:11:00 -0500 |
|---|---|---|
| committer | YurenHao0426 <Blackhao0426@gmail.com> | 2026-04-08 02:11:00 -0500 |
| commit | d3df5628b570af8fe2e22644b9c0849f69b9f3a1 (patch) | |
| tree | d2e579e6effe7ff66b8e6e02e2d03d5e81d10716 /experiments | |
| parent | 4bee0a6d80f2937473837897e80dfd4d697b644b (diff) | |
Extend perturbation audit to vanilla early-epoch checkpoints
Cross-metric confirmation of the disambiguation. Deep rho for vanilla DFA
at ep 1 (meaningful regime, ||g||~6e-7) across 3 seeds:
s42: deep rho -0.008
s123: deep rho +0.000
s456: deep rho -0.000
mean: -0.003 ± 0.005
Compare to penalized DFA 3-seed: deep rho +0.080 ± 0.011.
The disambiguation (penalty CREATES alignment, not just reveals it) is
now confirmed by TWO independent metrics:
- cos: vanilla -0.008 ± 0.013, penalized +0.155 ± 0.025
- rho: vanilla -0.003 ± 0.005, penalized +0.080 ± 0.011
Both metrics agree on the vanilla→penalized transition. The l0 (embedding)
rho is high (~0.25-0.29) at every vanilla checkpoint, mirroring the cos
l0 +0.42 — the embedding layer is genuinely useful while the deep blocks
are not, by BOTH metrics. The penalty raises deep-block usefulness to
~+0.08 rho / +0.16 cos.
Cross-metric agreement rules out single-metric artifacts on either side.
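The 3-seed summary above can be reproduced from the per-seed values. This is a minimal sketch, not code from the audit script: it assumes the "±" figure is the sample standard deviation (ddof=1) and works from the rounded per-seed numbers in this message; the helper name `summarize` is hypothetical.

```python
from statistics import mean, stdev


def summarize(values):
    """Return (mean, sample standard deviation), each rounded to 3 decimals."""
    # Assumption: the commit message reports a sample std (ddof=1).
    return round(mean(values), 3), round(stdev(values), 3)


# Per-seed deep rho for vanilla DFA at ep 1, copied from the commit message.
vanilla_deep_rho = {42: -0.008, 123: 0.000, 456: -0.000}

m, s = summarize(list(vanilla_deep_rho.values()))
print(f"mean: {m:+.3f} ± {s:.3f}")
```

With these inputs the sketch agrees with the reported -0.003 ± 0.005. A population standard deviation would round to 0.004 instead, which is why the sample form is assumed here.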
Diffstat (limited to 'experiments')
| -rw-r--r-- | experiments/perturbation_correlation_audit.py | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/experiments/perturbation_correlation_audit.py b/experiments/perturbation_correlation_audit.py
index cba84ea..163d3a8 100644
--- a/experiments/perturbation_correlation_audit.py
+++ b/experiments/perturbation_correlation_audit.py
@@ -127,6 +127,12 @@ def main():
          "results/dfa_pen_short/dfa_pen_lam0.01_s123.pt", 123),
         ("penalized DFA s456 lam=1e-2 30ep",
          "results/dfa_pen_short/dfa_pen_lam0.01_s456.pt", 456),
+        ("vanilla DFA s42 ep1 (meaningful regime)",
+         "results/vanilla_dfa_early_ckpts/vanilla_dfa_s42_ep1.pt", 42),
+        ("vanilla DFA s123 ep1 (meaningful regime)",
+         "results/vanilla_dfa_early_ckpts/vanilla_dfa_s123_ep1.pt", 123),
+        ("vanilla DFA s456 ep1 (meaningful regime)",
+         "results/vanilla_dfa_early_ckpts/vanilla_dfa_s456_ep1.pt", 456),
     ]
 
     print("=" * 76)
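The added entries follow the same (label, checkpoint path, seed) shape as the existing ones. Since the early-epoch checkpoints live in a separate results directory, a pre-flight check along these lines may help before running the audit. This is only a sketch: `split_by_availability` is a hypothetical helper, and how the script actually consumes the tuples is not shown in this diff.

```python
from pathlib import Path


def split_by_availability(entries):
    """Partition (label, path, seed) tuples by whether the checkpoint file exists."""
    present, missing = [], []
    for label, path, seed in entries:
        bucket = present if Path(path).is_file() else missing
        bucket.append((label, path, seed))
    return present, missing


# The three entries added by this commit, copied from the diff.
entries = [
    ("vanilla DFA s42 ep1 (meaningful regime)",
     "results/vanilla_dfa_early_ckpts/vanilla_dfa_s42_ep1.pt", 42),
    ("vanilla DFA s123 ep1 (meaningful regime)",
     "results/vanilla_dfa_early_ckpts/vanilla_dfa_s123_ep1.pt", 123),
    ("vanilla DFA s456 ep1 (meaningful regime)",
     "results/vanilla_dfa_early_ckpts/vanilla_dfa_s456_ep1.pt", 456),
]

present, missing = split_by_availability(entries)
for label, path, _seed in missing:
    print(f"skipping {label}: {path} not found")
```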
