personalization-user-model.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author
2026-02-11	Add revised reward modeling LaTeX section matching code implementation	YurenHao0426
	Key changes from original: - Input: (q_t, a_t, q_{t+1}) only, removed A_t (not used in judge prompt) - Single 7-label LLM classifier replaces abstract C_reward/C_gate - Gating = classifier confidence (threshold tau_c=0.6), not memory attribution - Explicitly describes Llama-3.1-8B-Instruct as judge model Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>