# Phase 10A.6 Memo: Structured vs Semantic Auxiliary

**Date**: 2026-03-26

## Question
Does the blend gain come from semantic credit, a structured trainable signal, or trainable regularization?

## Results

| Branch | final | diff vs DFA (absolute, in points) | Interpretation |
|--------|-------|-------------|----------------|
| continue_DFA | 0.312 | — | baseline |
| random_trainable | 0.324 | +1.2% | works |
| shuffled_trainable | 0.325 | +1.4% | works (no semantics needed) |
| **zero_target** | **0.221** | **-9.1%** | **crashes** (must output non-zero) |
| fresh_random_target | 0.325 | +1.3% | works (stable targets not needed) |
| time_only | 0.321 | +1.0% | works (h_l not needed, just depth) |
| **constant_input** | **0.312** | **+0.0%** | **neutral** (needs at least depth info) |
| prefit60_frozen | 0.127 | -18.4% | crashes (frozen = bad) |
| prefit60_trainable | 0.321 | +1.0% | works but ≈ random_trainable |
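
The diff column can be cross-checked with a few lines of Python. This sketch assumes the column is the absolute difference from `continue_DFA` expressed in percentage points (for example 0.324 - 0.312 = 0.012, shown as +1.2%), not a relative change. The function name `diff_vs_baseline` is hypothetical. Recomputing from the rounded `final` values gives +1.3 for `shuffled_trainable`, +0.9 for `time_only` and `prefit60_trainable`, and -18.5 for `prefit60_frozen`; the 0.1 mismatches against the table presumably arise because the memo's diffs were computed from unrounded values.

```python
# Final metric per branch, copied from the Results table (rounded to 3 decimals).
FINAL = {
    "continue_DFA": 0.312,
    "random_trainable": 0.324,
    "shuffled_trainable": 0.325,
    "zero_target": 0.221,
    "fresh_random_target": 0.325,
    "time_only": 0.321,
    "constant_input": 0.312,
    "prefit60_frozen": 0.127,
    "prefit60_trainable": 0.321,
}


def diff_vs_baseline(final, baseline="continue_DFA"):
    """Absolute difference from the baseline, in percentage points.

    Assumption: the memo's 'diff vs DFA' column is (final - baseline) * 100.
    """
    base = final[baseline]
    return {name: round((value - base) * 100, 1) for name, value in final.items()}


if __name__ == "__main__":
    for name, delta in diff_vs_baseline(FINAL).items():
        print(f"{name:22s} {delta:+.1f}")
```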

## Mechanism Identified

The gain requires: **(1) trainable, (2) non-zero output, (3) at least depth-aware**.

- **Not semantic credit**: shuffled_trainable (+1.4%) and fresh_random_target (+1.3%) do as well as random_trainable (+1.2%)
- **Not pure trainability**: zero_target crashes (the aux must actually output something)
- **Not state-dependent**: time_only (no h_l) works almost as well as full Vec
- **Depth-awareness matters**: constant_input (no depth info) gives no gain over DFA (+0.0%)
- **Prefit adds nothing**: prefit60_trainable (+1.0%) ≈ random_trainable (+1.2%)
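
The three-condition rule can be checked against the Results table mechanically. The sketch below is only a consistency check, not part of the experiment. The boolean flags are a reading of the Interpretation column and are assumptions where the memo is silent (for example, that `zero_target` and `prefit60_frozen` still receive depth information, and that `constant_input` is trainable with non-zero output). The names `Branch` and `predicts_gain` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Branch:
    name: str
    trainable: bool       # aux parameters are updated during training
    nonzero_output: bool  # aux is not driven toward an all-zero target
    depth_aware: bool     # aux input carries at least the block depth
    outcome: str          # "gain", "neutral", or "crash", read off the table


# Flags are assumptions inferred from the Interpretation column.
BRANCHES = [
    Branch("random_trainable", True, True, True, "gain"),
    Branch("shuffled_trainable", True, True, True, "gain"),
    Branch("zero_target", True, False, True, "crash"),
    Branch("fresh_random_target", True, True, True, "gain"),
    Branch("time_only", True, True, True, "gain"),
    Branch("constant_input", True, True, False, "neutral"),
    Branch("prefit60_frozen", False, True, True, "crash"),
    Branch("prefit60_trainable", True, True, True, "gain"),
]


def predicts_gain(branch):
    """The memo's rule: all three conditions must hold for a gain."""
    return branch.trainable and branch.nonzero_output and branch.depth_aware


if __name__ == "__main__":
    for b in BRANCHES:
        status = "ok" if predicts_gain(b) == (b.outcome == "gain") else "MISMATCH"
        print(f"{b.name:22s} predicted_gain={predicts_gain(b)!s:5s} observed={b.outcome:8s} {status}")
```

Under these flag assignments the rule separates the five gaining branches from the three that do not gain. It does not explain why a violated condition yields a crash in some branches and a neutral result in another.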

## Conclusion

The mechanism is a **depth-aware trainable auxiliary perturbation** that diversifies the block-local update directions beyond what DFA alone provides. It does not need to be semantically correct credit; it only needs to be a non-trivial, evolving, depth-dependent signal that prevents blocks from collapsing into the DFA-only fixed-direction regime.

This is NOT evidence for the credit bridge hypothesis. The gain is an optimization-dynamics phenomenon that does not depend on credit quality.
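
To make "depth-dependent" concrete, the following sketch contrasts the three input conditions named in the ablation. The memo does not specify how depth is encoded or what the full input contains, so everything here is an assumption for illustration: depth is encoded as a normalized layer index, the full variant concatenates `h_l` with that depth feature, and `constant_input` is a single fixed value. The function name `aux_input` and the variant label `"full"` are hypothetical.

```python
def aux_input(variant, h_l, layer, num_layers):
    """Build the auxiliary module's input for one block (illustrative only).

    variant: "full", "time_only", or "constant_input"
    h_l: the block's hidden state as a list of floats
    layer, num_layers: zero-based block index and total block count
    """
    # Assumed depth encoding: layer index scaled to [0, 1].
    depth = [layer / max(num_layers - 1, 1)]
    if variant == "full":
        return list(h_l) + depth   # state plus depth
    if variant == "time_only":
        return depth               # depth only, no h_l
    if variant == "constant_input":
        return [1.0]               # neither state nor depth
    raise ValueError(f"unknown variant: {variant}")


if __name__ == "__main__":
    h = [0.5, -0.2]
    for variant in ("full", "time_only", "constant_input"):
        print(variant, [aux_input(variant, h, l, 4) for l in range(4)])
```

In this sketch only `constant_input` produces the same input at every block, which is the condition the table reports as neutral.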