From 0c39a60d34ad8aff7b61b244c19bfd0160d9b446 Mon Sep 17 00:00:00 2001
From: YurenHao0426
Date: Tue, 10 Feb 2026 20:28:22 +0000
Subject: Add E/T decomposition analysis to notes

- E/T difference 79% from slightly more enforcements, 20% from fewer turns
- Neither component individually significant
- rag_vector achieves results in fewer turns with lower user effort

Co-Authored-By: Claude Opus 4.6
---
 notes.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/notes.md b/notes.md
index 8560152..78bd381 100644
--- a/notes.md
+++ b/notes.md
@@ -416,6 +416,19 @@ Wilcoxon signed-rank (non-parametric) results are consistent:
 
 The higher E/T can be explained as follows: retrieval methods surface more specific preferences, which leads the user to give more targeted feedback.
 
+### E/T decomposition analysis
+
+| Factor | reflection | rag_vector | diff | Contribution to E/T |
+|--------|-----------|-----------|------|---------------------|
+| Enforcements/session | 1.47 | 1.54 | +0.07 (+4.8%) | **79%** |
+| Turns/session | 8.41 | 8.31 | -0.10 (-1.2%) | 20% |
+| E/T | 0.175 | 0.185 | +0.011 (+6.0%) | |
+
+- The enforcements difference is marginally significant (p=0.058); the turns difference is not significant (p=0.19)
+- 79% of the higher E/T comes from slightly more enforcements, 20% from slightly fewer turns
+- rag_vector completes tasks in fewer turns → higher overall interaction efficiency
+- **Wording for the report**: the E/T difference is not significant, and rag_vector completes tasks in fewer turns and with lower user effort, which indicates higher overall interaction efficiency
+
 ---
 
 ## Next steps
-- 
cgit v1.2.3
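
The patch does not say how the 79% / 20% contributions in the decomposition table were computed. One possible reading, offered here only as an assumption, is a log-ratio decomposition: since E/T is a quotient, the change in log(E/T) splits exactly into an enforcement term and a turn term. The sketch below applies that idea to the rounded per-session means from the table; the function name `decompose_ratio_change` and the constants are illustrative and do not come from the repository.

```python
import math

# Per-session means copied from the decomposition table in the patch.
ENFORCEMENTS = {"reflection": 1.47, "rag_vector": 1.54}
TURNS = {"reflection": 8.41, "rag_vector": 8.31}


def decompose_ratio_change(e_base, e_new, t_base, t_new):
    """Split the change in log(E/T) into enforcement and turn shares.

    ASSUMPTION: log-ratio attribution; the notes do not state the method used.
    """
    d_enf = math.log(e_new / e_base)   # more enforcements raise E/T
    d_turn = math.log(t_base / t_new)  # fewer turns also raise E/T
    total = d_enf + d_turn             # log of (new E/T) / (base E/T)
    return d_enf / total, d_turn / total, math.exp(total) - 1.0


enf_share, turn_share, rel_change = decompose_ratio_change(
    ENFORCEMENTS["reflection"], ENFORCEMENTS["rag_vector"],
    TURNS["reflection"], TURNS["rag_vector"],
)
print(f"enforcement share:   {enf_share:.1%}")
print(f"turn share:          {turn_share:.1%}")
print(f"relative E/T change: {rel_change:+.1%}")
```

With the rounded table values this gives shares of roughly 79.5% and 20.5% and a relative E/T change of about +6.0%, which is in line with the reported 79% / 20% / +6.0% up to rounding. If the notes used a different attribution scheme (for example a first-order additive approximation), the shares could differ slightly.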