Diffstat (limited to 'NOTE.md')
| -rw-r--r-- | NOTE.md | 20 |
1 files changed, 19 insertions, 1 deletions
@@ -5,7 +5,7 @@
 - **pilot**: Controlled iteration (commits 0b9ebb2, 7baf7ae)
 - **frozen**: Code at commit 0b9ebb2 for all reported results
 
-## Status: PHASE 10A — NO PREFIT THRESHOLD, BLEND ITSELF IS THE ACTIVE INGREDIENT
+## Status: PHASE 10A.5 — BLEND GAIN IS IMPLICIT REGULARIZATION, NOT LEARNED CREDIT
 
 ---
 
@@ -570,5 +570,23 @@ The +1.5% gain from 9A's blend(0.75) at t0=5 is the project's best online result
 Frozen Gamma/rho are near zero at all prefit levels. The benefit comes from the blend mechanism itself — blending DFA with any additional signal provides regularization/diversification.
 
+### Phase 10A.5: Blend Mechanism Dissection
+
+| Branch | final acc | diff vs DFA |
+|--------|-----------|-------------|
+| continue_DFA | 0.311 | baseline |
+| blend_random_**frozen** | **0.126** | **-18.5%** (catastrophic) |
+| blend_random_**trainable** | 0.322 | +1.2% |
+| blend_shuffled_trainable | 0.325 | +1.4% |
+| blend_gaussian_noise | 0.308 | -0.3% |
+| scaled_DFA_norm_match | 0.310 | -0.0% |
+
+**Mechanism identified**: The gain is from **implicit regularization through a trainable
+auxiliary network**, NOT from learned credit. Frozen random Vec crashes (12.6%).
+Trainable Vec helps even with shuffled targets. Gaussian noise and norm scaling don't help.
+
+Phase 9A's +1.5% was not evidence of useful credit — it was an optimization dynamics effect.
+
 ### Experiment IDs (Phase 10)
 - `prefit_threshold/`: Phase 10A prefit threshold curve
+- `blend_dissection/`: Phase 10A.5 blend mechanism dissection
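
The "diff vs DFA" column in the Phase 10A.5 table appears to be an absolute accuracy gap in percentage points rather than a relative change. The short sketch below recomputes that gap from the rounded "final acc" values shown in the table. The function name `diff_vs_baseline` and the dictionary layout are illustrative assumptions, not part of the repository. Two rows come out slightly different from the table (+1.1 instead of +1.2 for `blend_random_trainable`, -0.1 instead of -0.0 for `scaled_DFA_norm_match`); this is presumably because the table was computed from unrounded accuracies, though the diff does not say so.

```python
# Final accuracies copied from the Phase 10A.5 table in the diff.
final_acc = {
    "continue_DFA": 0.311,
    "blend_random_frozen": 0.126,
    "blend_random_trainable": 0.322,
    "blend_shuffled_trainable": 0.325,
    "blend_gaussian_noise": 0.308,
    "scaled_DFA_norm_match": 0.310,
}


def diff_vs_baseline(acc, baseline="continue_DFA"):
    """Return each branch's gap to the baseline, in percentage points.

    Hypothetical helper: assumes the table's "diff vs DFA" column is
    (branch accuracy - baseline accuracy) * 100.
    """
    base = acc[baseline]
    return {
        name: round((value - base) * 100, 1)
        for name, value in acc.items()
        if name != baseline
    }


gaps = diff_vs_baseline(final_acc)
for name, gap in gaps.items():
    print(f"{name:26s} {gap:+.1f} pp")
```

Read this way, only the two trainable branches sit above the DFA baseline, which is the pattern the "Mechanism identified" paragraph relies on.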
