| author | YurenHao0426 <blackhao0426@gmail.com> | 2026-02-11 03:14:37 +0000 |
|---|---|---|
| committer | YurenHao0426 <blackhao0426@gmail.com> | 2026-02-11 03:14:37 +0000 |
| commit | 6a917d3eda85e5725c2d5ad3bf5ec9bd30262198 (patch) | |
| tree | 5c9408962f01036119ebe29cd34b45bf951865bd /src/personalization/serving/api/routes/feedback.py | |
| parent | 1956aed8bc8a72355adbe9f1d16ea678d67f214c (diff) | |
Rewrite reward section to describe keyword heuristic (matches experiments)
Replaced LLM-as-judge description with actual keyword-based system:
- Reward: sentiment keyword matching + topic coherence via embedding similarity
- Gating: separate retrieval-attribution heuristic using memory-query cosine
similarity (g_t=0.9 retrieval fault, g_t=0.2 LLM fault, etc.)
- No additional model needed (fast, no GPU)
- REINFORCE update unchanged
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
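
For illustration only, the heuristic described in the message above might look roughly like the sketch below. This is not the code from the commit. The keyword lists, the 0.7/0.3 weighting, the 0.5 similarity threshold, the neutral gate of 0.5, and the function names `keyword_reward` and `gate` are all hypothetical. Only the gate values g_t=0.9 (retrieval fault) and g_t=0.2 (LLM fault) come from the message, and the conditions that select them here are an assumption.

```python
import math
import re

# Hypothetical keyword lists; the commit message does not enumerate them.
POSITIVE = {"thanks", "great", "perfect", "helpful"}
NEGATIVE = {"no", "wrong", "useless", "incorrect"}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def keyword_reward(followup_text, prev_query_emb, followup_emb):
    """Sentiment keyword matching plus topic coherence via embedding similarity."""
    words = set(re.findall(r"[a-z']+", followup_text.lower()))
    sentiment = len(words & POSITIVE) - len(words & NEGATIVE)
    sentiment = max(-1, min(1, sentiment))  # clip to [-1, 1]
    coherence = cosine(prev_query_emb, followup_emb)
    # Assumed weighting; the real combination rule is not given in the message.
    return 0.7 * sentiment + 0.3 * coherence


def gate(reward, memory_embs, query_emb, sim_threshold=0.5):
    """Retrieval-attribution gate g_t from memory-query cosine similarity."""
    best = max((cosine(m, query_emb) for m in memory_embs), default=0.0)
    if reward < 0 and best < sim_threshold:
        return 0.9  # assumed condition: retrieved memories were off-topic (retrieval fault)
    if reward < 0:
        return 0.2  # assumed condition: memories were relevant, response still failed (LLM fault)
    return 0.5  # hypothetical neutral value for the cases the message covers with "etc."
```

Because the sketch uses only keyword sets and cosine similarity over embeddings that are already computed, it needs no extra model call, which is consistent with the "fast, no GPU" point in the message.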
Diffstat (limited to 'src/personalization/serving/api/routes/feedback.py')
0 files changed, 0 insertions, 0 deletions
