Diffstat (limited to 'docs')
| -rw-r--r-- | docs/method_additions.md | 120 |
1 files changed, 120 insertions, 0 deletions
diff --git a/docs/method_additions.md b/docs/method_additions.md
new file mode 100644
index 0000000..82b2c2f
--- /dev/null
+++ b/docs/method_additions.md
@@ -0,0 +1,120 @@
+# Method Additions: Query Transformation, Global Preferences, Hyperparameters
+
+Three additions to the Method section (Section 3).
+
+---
+
+## 1. Query Transformation (add to Section 3.5, after "Personalized Retrieval" paragraph)
+
+```latex
+\paragraph{Query transformation.}
+A practical challenge for dense retrieval is the semantic gap
+between task-oriented user queries (e.g., ``solve this
+integral'') and preference descriptions (e.g., ``when math
+problems, show step-by-step work'').
+To bridge this gap, we apply a lightweight keyword-based
+query transformation before dense retrieval.
+
+Given a user query $q_t$, we detect the task type (math,
+coding, writing, or explanation) by matching against curated
+keyword lists.
+If a task type is detected, we construct a supplementary query
+\[
+  q'_t = \texttt{"user preferences for \{task\_type\} tasks: "} \| \; q_t
+\]
+and perform multi-query dense retrieval: both $q_t$ and $q'_t$
+are embedded, and for each memory card we take the
+\emph{maximum} cosine similarity across the two query
+embeddings.
+The top-$k$ candidates by this max-similarity are then passed
+to the reranker, which still uses only the original query
+$q_t$.
+This simple transformation improves recall of task-relevant
+preferences without introducing an additional LLM call.
+```
+
+---
+
+## 2. Global vs Conditional Preferences (add to Section 3.4, after "Memory cards" paragraph)
+
+```latex
+\paragraph{Global vs.\ conditional preferences.}
+Not all preferences require retrieval.
+Some preferences are universally applicable regardless of
+task context (e.g., ``always respond in Chinese'',
+``use numbered lists''), while others are
+conditional on the task type (e.g., ``when coding, include
+type hints'').
+At extraction time, we classify each preference as
+\emph{global} or \emph{conditional} based on its condition
+field:
+a preference is classified as global if its condition
+contains universal indicators (e.g., ``general'', ``always'',
+``any task'') or consists of fewer than three words with no
+domain-specific terms (e.g., ``math'', ``code'').
+
+Global preferences bypass the retrieval pipeline entirely
+and are always injected into the agent prompt (up to a cap
+of $10$), ensuring that universally applicable preferences
+are never missed due to retrieval failure.
+Only conditional preferences enter the dense retrieval and
+reranking pipeline described above.
+This two-tier design reduces the retrieval burden and
+guarantees that high-frequency, always-applicable preferences
+are consistently applied.
+```
+
+---
+
+## 3. Hyperparameter Table (add to Section 4, after Models subsection or as a new subsection)
+
+```latex
+\subsection{Hyperparameters}
+\label{sec:setup-hyperparams}
+
+Table~\ref{tab:hyperparams} lists the key hyperparameters
+used in all experiments.
+These values are set heuristically and held fixed across all
+methods and profiles.
+
+\begin{table}[t]
+  \centering
+  \small
+  \caption{Hyperparameters used in all experiments.}
+  \label{tab:hyperparams}
+  \begin{tabular}{llc}
+    \toprule
+    Component & Parameter & Value \\
+    \midrule
+    \multirow{4}{*}{User vector}
+      & Item-space dimension $k$ & 256 \\
+      & Long-term weight $\beta_L$ & 2.0 \\
+      & Short-term weight $\beta_S$ & 5.0 \\
+      & Softmax temperature $\tau$ & 1.0 \\
+    \midrule
+    \multirow{4}{*}{REINFORCE}
+      & Long-term learning rate $\eta_L$ & 0.01 \\
+      & Short-term learning rate $\eta_S$ & 0.05 \\
+      & Short-term decay $\lambda$ & 0.1 \\
+      & Baseline EMA coefficient $\alpha$ & 0.05 \\
+    \midrule
+    \multirow{2}{*}{Retrieval}
+      & Dense retrieval top-$k$ & 64 \\
+      & Reranker top-$k$ & 5 \\
+    \midrule
+    \multirow{2}{*}{Global prefs}
+      & Max global notes in prompt & 10 \\
+      & Max condition words (global) & $\leq 2$ \\
+    \midrule
+    \multirow{2}{*}{Embedding}
+      & Embedding dimension $d$ & 4096 \\
+      & PCA components $k$ & 256 \\
+    \midrule
+    \multirow{3}{*}{Interaction}
+      & Sessions per profile & 60 \\
+      & Max turns per session & 10 \\
+      & Max generation tokens & 512 \\
+    \bottomrule
+  \end{tabular}
+\end{table}
+```
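The query transformation paragraph in this diff describes keyword-based task detection followed by multi-query retrieval scored by maximum cosine similarity. A minimal Python sketch of that flow follows. The keyword lists, the function names (`detect_task_type`, `build_queries`, `max_sim_top_k`), and the use of plain lists as embedding vectors are all assumptions made for illustration; the diff does not show the actual keyword lists or the embedding model.

```python
import math

# Hypothetical keyword lists; the diff only says the lists are "curated".
TASK_KEYWORDS = {
    "math": ("integral", "equation", "solve", "derivative"),
    "coding": ("function", "bug", "python", "compile"),
    "writing": ("essay", "draft", "rewrite", "email"),
    "explanation": ("explain", "why", "how does"),
}


def detect_task_type(query):
    """Return the first task type with a keyword in the query, else None."""
    text = query.lower()
    for task_type, keywords in TASK_KEYWORDS.items():
        if any(kw in text for kw in keywords):
            return task_type
    return None


def build_queries(query):
    """Return [q_t], or [q_t, q'_t] when a task type is detected."""
    task_type = detect_task_type(query)
    if task_type is None:
        return [query]
    return [query, f"user preferences for {task_type} tasks: {query}"]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def max_sim_top_k(query_vecs, card_vecs, k):
    """Score each card by its max cosine over all query vectors; return top-k indices."""
    scores = [max(cosine(q, c) for q in query_vecs) for c in card_vecs]
    ranked = sorted(range(len(card_vecs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]
```

In this sketch the reranker is left out; per the paragraph, it would receive the returned candidates together with the original query only, not the supplementary one.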

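The global vs. conditional rule in section 2 of the added file can be read as a small predicate over the condition field. The sketch below is one possible reading, not the repository's implementation: the indicator and domain-term lists are hypothetical (the text gives only examples), the names `is_global` and `split_preferences` are invented here, and dropping global notes beyond the cap of 10 is an assumption, since the text does not say how overflow is handled.

```python
import re

# Hypothetical lists; the added text gives only a few examples of each.
UNIVERSAL_INDICATORS = ("general", "always", "any task")
DOMAIN_TERMS = ("math", "code", "coding", "writing")
MAX_GLOBAL_NOTES = 10  # "Max global notes in prompt" from the hyperparameter table


def is_global(condition):
    """True if the condition has a universal indicator, or is under three
    words long and contains no domain-specific term."""
    text = condition.lower()
    if any(indicator in text for indicator in UNIVERSAL_INDICATORS):
        return True
    words = re.findall(r"[a-z]+", text)
    has_domain_term = any(word in DOMAIN_TERMS for word in words)
    return len(words) < 3 and not has_domain_term


def split_preferences(prefs):
    """Split (condition, text) pairs into capped global notes and conditional pairs.

    Assumption: global notes beyond the cap are simply dropped.
    """
    global_notes = [text for cond, text in prefs if is_global(cond)]
    conditional = [(cond, text) for cond, text in prefs if not is_global(cond)]
    return global_notes[:MAX_GLOBAL_NOTES], conditional
```

Under this reading an empty condition counts as global, because it has fewer than three words and no domain term. Only the `conditional` list would go on to dense retrieval and reranking.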