author    YurenHao0426 <blackhao0426@gmail.com> 2026-02-11 03:33:47 +0000
committer YurenHao0426 <blackhao0426@gmail.com> 2026-02-11 03:33:47 +0000
commit    a9813be5d6f0bf3fe40e0327605612d1c3f925da (patch)
tree      35a3f83a0e281cd53d1193f3147db7b77f1a3dbb /docs
parent    dcc20b1f77702e5b45e2e6c08b0f243124c4676e (diff)
Add query transformation, global preferences, and hyperparameter table
Three additions to Method/Setup sections:

1. Query transformation: keyword-based task detection + multi-query dense retrieval to bridge semantic gap (Section 3.5)
2. Global vs conditional preferences: universal prefs bypass retrieval, always injected into prompt (Section 3.4)
3. Hyperparameter table with all key values (Section 4)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Diffstat (limited to 'docs')
-rw-r--r--  docs/method_additions.md  120
1 file changed, 120 insertions, 0 deletions
diff --git a/docs/method_additions.md b/docs/method_additions.md
new file mode 100644
index 0000000..82b2c2f
--- /dev/null
+++ b/docs/method_additions.md
@@ -0,0 +1,120 @@
+# Method Additions: Query Transformation, Global Preferences, Hyperparameters
+
+Three additions to the Method and Setup sections (Sections 3--4).
+
+---
+
+## 1. Query Transformation (add to Section 3.5, after "Personalized Retrieval" paragraph)
+
+```latex
+\paragraph{Query transformation.}
+A practical challenge for dense retrieval is the semantic gap
+between task-oriented user queries (e.g., ``solve this
+integral'') and preference descriptions (e.g., ``when solving
+math problems, show step-by-step work'').
+To bridge this gap, we apply a lightweight keyword-based
+query transformation before dense retrieval.
+
+Given a user query $q_t$, we detect the task type (math,
+coding, writing, or explanation) by matching against curated
+keyword lists.
+If a task type is detected, we construct a supplementary query
+\[
+ q'_t = \texttt{"user preferences for \{task\_type\} tasks: "} \| \; q_t
+\]
+and perform multi-query dense retrieval: both $q_t$ and $q'_t$
+are embedded, and for each memory card we take the
+\emph{maximum} cosine similarity across the two query
+embeddings.
+The top-$k$ candidates by this max-similarity are then passed
+to the reranker, which still uses only the original query
+$q_t$.
+This simple transformation improves recall of task-relevant
+preferences without introducing an additional LLM call.
+```
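The transformation above can be sketched in Python. This is a minimal illustration, not the actual implementation: the keyword lists are hypothetical stand-ins for the curated lists, and `embed` is a toy deterministic encoder standing in for the real dense embedding model.

```python
import hashlib
import numpy as np

# Hypothetical curated keyword lists for task-type detection.
TASK_KEYWORDS = {
    "math": ["integral", "solve", "equation", "derivative"],
    "coding": ["function", "bug", "refactor", "compile"],
    "writing": ["essay", "draft", "rewrite", "paragraph"],
    "explanation": ["explain", "why does", "how does"],
}

def detect_task_type(query: str):
    """Return the first task type whose keywords appear in the query, else None."""
    q = query.lower()
    for task, words in TASK_KEYWORDS.items():
        if any(w in q for w in words):
            return task
    return None

def embed(text: str) -> np.ndarray:
    """Toy deterministic unit-norm embedding standing in for the dense encoder."""
    seed = int(hashlib.md5(text.encode()).hexdigest()[:8], 16)
    v = np.random.default_rng(seed).normal(size=16)
    return v / np.linalg.norm(v)

def multi_query_retrieve(query: str, cards: list, top_k: int = 2) -> list:
    """Multi-query dense retrieval with max cosine similarity per memory card."""
    queries = [query]
    task = detect_task_type(query)
    if task is not None:
        # supplementary query q'_t = "user preferences for {task_type} tasks: " || q_t
        queries.append(f"user preferences for {task} tasks: {query}")
    q_vecs = np.stack([embed(q) for q in queries])  # (Q, 16) unit vectors
    c_vecs = np.stack([embed(c) for c in cards])    # (C, 16) unit vectors
    sims = (c_vecs @ q_vecs.T).max(axis=1)          # max cosine across queries
    order = np.argsort(-sims)[:top_k]
    return [cards[i] for i in order]
```

As in the text, only candidate selection uses the supplementary query; a downstream reranker would still see the original `q_t`.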
+
+---
+
+## 2. Global vs Conditional Preferences (add to Section 3.4, after "Memory cards" paragraph)
+
+```latex
+\paragraph{Global vs.\ conditional preferences.}
+Not all preferences require retrieval.
+Some preferences are universally applicable regardless of
+task context (e.g., ``always respond in Chinese'',
+``use numbered lists''), while others are
+conditional on the task type (e.g., ``when coding, include
+type hints'').
+At extraction time, we classify each preference as
+\emph{global} or \emph{conditional} based on its condition
+field:
+a preference is classified as global if its condition
+contains universal indicators (e.g., ``general'', ``always'',
+``any task'') or consists of fewer than three words with no
+domain-specific terms (e.g., ``math'', ``code'').
+
+Global preferences bypass the retrieval pipeline entirely
+and are always injected into the agent prompt (up to a cap
+of $10$), ensuring that universally applicable preferences
+are never missed due to retrieval failure.
+Only conditional preferences enter the dense retrieval and
+reranking pipeline described above.
+This two-tier design reduces the retrieval burden and
+guarantees that high-frequency, always-applicable preferences
+are consistently applied.
+```
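The classification rule above reduces to a short heuristic. The sketch below assumes illustrative indicator and domain-term lists; the actual lists used at extraction time may differ.

```python
# Illustrative lists; the real curated lists are assumptions here.
UNIVERSAL_INDICATORS = {"general", "always", "any task"}
DOMAIN_TERMS = {"math", "code", "coding", "writing", "explanation"}

def is_global(condition: str) -> bool:
    """Classify a preference's condition field as global (True) or conditional (False).

    Global if the condition contains a universal indicator, or has fewer
    than three words and no domain-specific term.
    """
    cond = condition.lower().strip()
    if any(ind in cond for ind in UNIVERSAL_INDICATORS):
        return True
    return len(cond.split()) < 3 and not any(t in cond for t in DOMAIN_TERMS)
```

Preferences classified as global would skip retrieval and be injected into the prompt directly (up to the cap of 10); the rest flow into the dense-retrieval pipeline.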
+
+---
+
+## 3. Hyperparameter Table (add to Section 4, after Models subsection or as a new subsection)
+
+```latex
+\subsection{Hyperparameters}
+\label{sec:setup-hyperparams}
+
+Table~\ref{tab:hyperparams} lists the key hyperparameters
+used in all experiments.
+These values are set heuristically and held fixed across all
+methods and profiles.
+
+\begin{table}[t]
+ \centering
+ \small
+ \caption{Hyperparameters used in all experiments.}
+ \label{tab:hyperparams}
+ \begin{tabular}{llc}
+ \toprule
+ Component & Parameter & Value \\
+ \midrule
+ \multirow{4}{*}{User vector}
+ & Item-space dimension $k$ & 256 \\
+ & Long-term weight $\beta_L$ & 2.0 \\
+ & Short-term weight $\beta_S$ & 5.0 \\
+ & Softmax temperature $\tau$ & 1.0 \\
+ \midrule
+ \multirow{4}{*}{REINFORCE}
+ & Long-term learning rate $\eta_L$ & 0.01 \\
+ & Short-term learning rate $\eta_S$ & 0.05 \\
+ & Short-term decay $\lambda$ & 0.1 \\
+ & Baseline EMA coefficient $\alpha$ & 0.05 \\
+ \midrule
+ \multirow{2}{*}{Retrieval}
+ & Dense retrieval top-$k$ & 64 \\
+ & Reranker top-$k$ & 5 \\
+ \midrule
+ \multirow{2}{*}{Global prefs}
+ & Max global notes in prompt & 10 \\
+ & Max condition words (global) & $\leq 2$ \\
+ \midrule
+ \multirow{2}{*}{Embedding}
+ & Embedding dimension $d$ & 4096 \\
+ & PCA components $k$ & 256 \\
+ \midrule
+ \multirow{3}{*}{Interaction}
+ & Sessions per profile & 60 \\
+ & Max turns per session & 10 \\
+ & Max generation tokens & 512 \\
+ \bottomrule
+ \end{tabular}
+\end{table}
+```
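For convenience, the table can be mirrored as a flat configuration object for an experiment driver. The values below are copied from Table~`tab:hyperparams`; the key names themselves are illustrative assumptions.

```python
# Hyperparameters from the table above as a nested config dict.
# Key names are illustrative; values match the table.
HYPERPARAMS = {
    "user_vector": {"item_dim_k": 256, "beta_long": 2.0, "beta_short": 5.0, "tau": 1.0},
    "reinforce": {"lr_long": 0.01, "lr_short": 0.05, "decay_lambda": 0.1, "baseline_ema_alpha": 0.05},
    "retrieval": {"dense_top_k": 64, "rerank_top_k": 5},
    "global_prefs": {"max_in_prompt": 10, "max_condition_words": 2},
    "embedding": {"dim_d": 4096, "pca_k": 256},
    "interaction": {"sessions_per_profile": 60, "max_turns": 10, "max_gen_tokens": 512},
}
```

Keeping these in one place makes it easy to verify the "held fixed across all methods and profiles" claim in code review.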