Diffstat (limited to 'notes.md')
| -rw-r--r-- | notes.md | 13 |
1 files changed, 13 insertions, 0 deletions
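@@ -416,6 +416,19 @@ Wilcoxon signed-rank (non-parametric) results are consistent:
 
 The higher E/T can be explained as follows: the retrieval method surfaces more specific preferences, which leads the user to give more targeted feedback.
 
+### E/T decomposition analysis
+
+| Factor | reflection | rag_vector | diff | Contribution to E/T |
+|--------|-----------|-----------|------|---------------------|
+| Enforcements/session | 1.47 | 1.54 | +0.07 (+4.8%) | **79%** |
+| Turns/session | 8.41 | 8.31 | -0.10 (-1.2%) | 20% |
+| E/T | 0.175 | 0.185 | +0.011 (+6.0%) | |
+
+- The difference in enforcements is marginally significant (p=0.058); the difference in turns is not significant (p=0.19)
+- Of the higher E/T, 79% comes from slightly more enforcements and 20% from slightly fewer turns
+- rag_vector completes tasks in fewer turns → higher overall interaction efficiency
+- **Report wording**: the E/T difference is not significant, while rag_vector completes tasks with fewer turns and lower user effort, indicating higher overall interaction efficiency
+
 ---
 
 ## Next steps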
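The notes do not say how the 79% / 20% contribution split was computed. One common way to attribute a change in a ratio to its numerator and denominator is a log-ratio decomposition, since log(E/T) = log(E) - log(T). The sketch below assumes that method and uses the rounded per-session means from the table; the function name `et_decomposition` is hypothetical and not taken from the repository.

```python
import math


def et_decomposition(enf_a, turns_a, enf_b, turns_b):
    """Split the change in E/T from condition a to b into two shares.

    Assumed method (not confirmed by the notes): log-ratio decomposition,
    log(ET_b / ET_a) = log(enf_b / enf_a) - log(turns_b / turns_a).
    """
    log_enf = math.log(enf_b / enf_a)          # effect of more enforcements
    log_turns = -math.log(turns_b / turns_a)   # effect of fewer turns
    total = log_enf + log_turns
    return {
        "et_a": enf_a / turns_a,
        "et_b": enf_b / turns_b,
        "enf_share": log_enf / total,
        "turns_share": log_turns / total,
    }


# Rounded means from the table: reflection (a) vs. rag_vector (b).
result = et_decomposition(1.47, 8.41, 1.54, 8.31)
print(f"E/T: {result['et_a']:.3f} -> {result['et_b']:.3f}")
print(f"enforcements share: {result['enf_share']:.1%}")
print(f"turns share: {result['turns_share']:.1%}")
```

With these rounded inputs the shares come out close to the table's split (about four fifths from enforcements, one fifth from turns). Any small gap from the reported 79% and 20% would be expected if the original figures were computed from unrounded data.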
