author: YurenHao0426 <blackhao0426@gmail.com> 2026-02-10 20:28:22 +0000
committer: YurenHao0426 <blackhao0426@gmail.com> 2026-02-10 20:28:22 +0000
commit: 0c39a60d34ad8aff7b61b244c19bfd0160d9b446 (patch)
tree: 8e130be5d80fc13e17fde0008526bb3b149a6166
parent: 440ef7dedf4198a15abb57e17f4a6e189657d810 (diff)
Add E/T decomposition analysis to notes

- E/T difference 79% from slightly more enforcements, 20% from fewer turns
- Neither component individually significant
- rag_vector achieves results in fewer turns with lower user effort

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
-rw-r--r-- notes.md 13
1 files changed, 13 insertions, 0 deletions
diff --git a/notes.md b/notes.md
index 8560152..78bd381 100644
--- a/notes.md
+++ b/notes.md
@@ -416,6 +416,19 @@ Wilcoxon signed-rank (non-parametric) results are consistent:
The higher E/T can be explained as follows: retrieval methods surface more specific preferences, which leads users to give more targeted feedback.
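+### E/T decomposition analysis
+
+| Factor | reflection | rag_vector | diff | Contribution to E/T |
+|--------|-----------|-----------|------|---------------------|
+| Enforcements/session | 1.47 | 1.54 | +0.07 (+4.8%) | **79%** |
+| Turns/session | 8.41 | 8.31 | -0.10 (-1.2%) | 20% |
+| E/T | 0.175 | 0.185 | +0.011 (+6.0%) | |
+
+- The difference in enforcements is marginally significant (p=0.058); the difference in turns is not significant (p=0.19)
+- Of the higher E/T, 79% comes from slightly more enforcements and 20% from slightly fewer turns
+- rag_vector completes tasks in fewer turns → higher overall interaction efficiency
+- **Wording for the report**: the E/T difference is not significant, and rag_vector completes tasks in fewer turns and with lower user effort, which indicates higher overall interaction efficiency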
+
---
## Next steps
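
The 79% / 20% split in the table added by this commit is consistent with a log-ratio decomposition of E/T, where the relative change in a ratio is split between its numerator and its denominator. The notes do not say which method was actually used, so the sketch below is an assumption, and the function name `et_log_decomposition` is hypothetical. It uses only the rounded per-session means from the table.

```python
import math


def et_log_decomposition(e_base, t_base, e_new, t_new):
    """Split the change in E/T into enforcement and turn shares.

    Assumed method (not confirmed by the notes): log-ratio decomposition,
    log(ET_new / ET_base) = log(e_new / e_base) + log(t_base / t_new).
    """
    log_e = math.log(e_new / e_base)  # effect of more enforcements (numerator)
    log_t = math.log(t_base / t_new)  # effect of fewer turns (denominator)
    total = log_e + log_t             # total log change in E/T
    return {
        "et_base": e_base / t_base,
        "et_new": e_new / t_new,
        "share_enforcements": log_e / total,
        "share_turns": log_t / total,
    }


# Rounded means from the table: reflection (baseline) vs rag_vector.
result = et_log_decomposition(e_base=1.47, t_base=8.41, e_new=1.54, t_new=8.31)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

With the rounded table values the shares come out close to 80% and 20%, in line with the reported 79% and 20%; the small gap presumably comes from rounding of the inputs. The two shares always sum to 1 under this method, so it attributes the whole E/T change to the two factors with no interaction term.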