author YurenHao0426 <Blackhao0426@gmail.com> 2026-03-26 22:07:35 -0500
committer YurenHao0426 <Blackhao0426@gmail.com> 2026-03-26 22:07:35 -0500
commit b4e3cbeae6cb4cf4a4b69b84a475afcd7d7e9dbe
tree fca5a27504471091eba74a8f7efe2cf48eb85826 /report_explore
parent 610e1169e19378cccd2d9b92a588c24dca7f3df7
Add Phase 10A.6: gain requires trainable depth-aware aux, not semantic credit

9-branch dissection results:
- zero_target crashes (-9.1%): aux must output non-zero
- constant_input neutral (+0.0%): needs at least depth info
- time_only works (+1.0%): h_l not needed, just depth index
- shuffled/fresh_random work (+1.3-1.4%): no semantic content needed
- prefit60_trainable ≈ random_trainable: prefit adds nothing
- All frozen branches crash: trainability is essential

Mechanism: depth-aware trainable auxiliary perturbation that diversifies
block-local updates. Not semantic credit, not pure trainability.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Diffstat (limited to 'report_explore')
-rw-r--r-- report_explore/MEMO_10A6_structured_vs_semantic.md | 36
1 files changed, 36 insertions, 0 deletions
diff --git a/report_explore/MEMO_10A6_structured_vs_semantic.md b/report_explore/MEMO_10A6_structured_vs_semantic.md
new file mode 100644
index 0000000..c63b5ea
--- /dev/null
+++ b/report_explore/MEMO_10A6_structured_vs_semantic.md
@@ -0,0 +1,36 @@
+# Phase 10A.6 Memo: Structured vs Semantic Auxiliary
+
+**Date**: 2026-03-26
+
+## Question
+Does the blend gain come from semantic credit, a structured trainable signal, or trainable regularization?
+
+## Results
+
+| Branch | final | diff vs DFA | Interpretation |
+|--------|-------|-------------|----------------|
+| continue_DFA | 0.312 | — | baseline |
+| random_trainable | 0.324 | +1.2% | works |
+| shuffled_trainable | 0.325 | +1.4% | works (no semantics needed) |
+| **zero_target** | **0.221** | **-9.1%** | **crashes** (must output non-zero) |
+| fresh_random_target | 0.325 | +1.3% | works (stable targets not needed) |
+| time_only | 0.321 | +1.0% | works (h_l not needed, just depth) |
+| **constant_input** | **0.312** | **+0.0%** | **neutral** (needs at least depth info) |
+| prefit60_frozen | 0.127 | -18.4% | crashes (frozen = bad) |
+| prefit60_trainable | 0.321 | +1.0% | works but ≈ random_trainable |
+
+## Mechanism Identified
+
+The gain requires an auxiliary that **(1) is trainable, (2) produces non-zero output, and (3) is at least depth-aware**.
+
+- **Not semantic credit**: shuffled and fresh_random targets work as well as random_trainable (+1.3% to +1.4% vs +1.2%)
+- **Not pure trainability**: zero_target crashes (the aux must actually output something)
+- **Not state-dependent**: time_only (no h_l) works almost as well as full Vec
+- **Depth-awareness matters**: constant_input (no depth info) doesn't help
+- **Prefit adds nothing**: prefit60_trainable ≈ random_trainable
+
+## Conclusion
+
+The mechanism is a **depth-aware trainable auxiliary perturbation** that diversifies the block-local update directions beyond what DFA alone provides. It does not need to be semantically correct credit; it only needs to be a non-trivial, evolving, depth-dependent signal that prevents blocks from collapsing into the DFA-only fixed-direction regime.
+
+This is NOT evidence for the credit bridge hypothesis. The gain is an optimization-dynamics phenomenon unrelated to credit quality.
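
The `diff vs DFA` column in the memo's results table appears to be the absolute difference from the `continue_DFA` final value, expressed in percentage points. A minimal Python sketch, assuming that reading, recomputes the column from the rounded `final` values. The names `FINALS`, `REPORTED_DIFFS`, and `recompute_diffs` are illustrative and do not come from the repository.

```python
# Rounded `final` values copied from the memo's results table.
FINALS = {
    "continue_DFA": 0.312,
    "random_trainable": 0.324,
    "shuffled_trainable": 0.325,
    "zero_target": 0.221,
    "fresh_random_target": 0.325,
    "time_only": 0.321,
    "constant_input": 0.312,
    "prefit60_frozen": 0.127,
    "prefit60_trainable": 0.321,
}

# The "diff vs DFA" column as reported in the memo.
REPORTED_DIFFS = {
    "random_trainable": 1.2,
    "shuffled_trainable": 1.4,
    "zero_target": -9.1,
    "fresh_random_target": 1.3,
    "time_only": 1.0,
    "constant_input": 0.0,
    "prefit60_frozen": -18.4,
    "prefit60_trainable": 1.0,
}


def recompute_diffs(finals, baseline="continue_DFA"):
    """Absolute difference from the baseline, in percentage points.

    Assumption: the memo's "%" means (final - baseline) * 100,
    not a relative change.
    """
    base = finals[baseline]
    return {
        name: round((value - base) * 100, 1)
        for name, value in finals.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, diff in recompute_diffs(FINALS).items():
        reported = REPORTED_DIFFS[name]
        print(f"{name:22s} recomputed {diff:+6.1f}   reported {reported:+6.1f}")
```

Under this assumption every recomputed value falls within 0.1 of the reported one. The remaining gaps (for example `shuffled_trainable`, `time_only`, and `prefit60_frozen`) would be consistent with the reported column having been computed from unrounded final values, though the memo does not say so.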