diff options
| author | YurenHao0426 <blackhao0426@gmail.com> | 2026-01-27 15:43:42 -0600 |
|---|---|---|
| committer | YurenHao0426 <blackhao0426@gmail.com> | 2026-01-27 15:43:42 -0600 |
| commit | f918fc90b8d71d1287590b016d926268be573de0 (patch) | |
| tree | d9009c8612c8e7f866c31d22fb979892a5b55eeb /src/personalization/models/llm/base.py | |
| parent | 680513b7771a29f27cbbb3ffb009a69a913de6f9 (diff) | |
Add model wrapper modules (embedding, reranker, llm, preference_extractor)
Add Python wrappers for:
- Qwen3/Nemotron embedding models
- BGE/Qwen3 rerankers
- vLLM/Llama/Qwen LLM backends
- GPT-4o/LLM-based preference extractors
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Diffstat (limited to 'src/personalization/models/llm/base.py')
| -rw-r--r-- | src/personalization/models/llm/base.py | 29 |
1 file changed, 29 insertions, 0 deletions
diff --git a/src/personalization/models/llm/base.py b/src/personalization/models/llm/base.py
new file mode 100644
index 0000000..72b6ca8
--- /dev/null
+++ b/src/personalization/models/llm/base.py
@@ -0,0 +1,29 @@
+from typing import List, Protocol, Optional
+from personalization.types import ChatTurn
+
+class ChatModel(Protocol):
+    def answer(
+        self,
+        history: List[ChatTurn],
+        memory_notes: List[str],
+        max_new_tokens: int = 512,
+        temperature: float = 0.7,
+        top_p: float = 0.9,
+        top_k: Optional[int] = None,
+    ) -> str:
+        """
+        Generate an assistant response given conversation history and memory notes.
+
+        Args:
+            history: The conversation history ending with the current user turn.
+            memory_notes: List of retrieved memory content strings.
+            max_new_tokens: Max tokens to generate.
+            temperature: Sampling temperature.
+            top_p: Top-p sampling.
+            top_k: Top-k sampling.
+
+        Returns:
+            The generated assistant response text.
+        """
+        ...
+
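Because `ChatModel` is a `typing.Protocol`, the backends this commit adds (vLLM/Llama/Qwen wrappers) only need to match the `answer` signature structurally; no inheritance is required. A minimal sketch of how a backend satisfies the protocol, where `EchoModel` and the inline `ChatTurn` dataclass are hypothetical stand-ins (the real `ChatTurn` lives in `personalization.types` and is not shown in this diff):

```python
from dataclasses import dataclass
from typing import List, Optional, Protocol


# Hypothetical stand-in for personalization.types.ChatTurn, assumed to
# carry at least a role and the turn's text content.
@dataclass
class ChatTurn:
    role: str
    content: str


class ChatModel(Protocol):
    def answer(
        self,
        history: List[ChatTurn],
        memory_notes: List[str],
        max_new_tokens: int = 512,
        temperature: float = 0.7,
        top_p: float = 0.9,
        top_k: Optional[int] = None,
    ) -> str: ...


class EchoModel:
    """Toy backend: satisfies ChatModel structurally, no real LLM involved."""

    def answer(
        self,
        history: List[ChatTurn],
        memory_notes: List[str],
        max_new_tokens: int = 512,
        temperature: float = 0.7,
        top_p: float = 0.9,
        top_k: Optional[int] = None,
    ) -> str:
        # Fold retrieved memory into the reply so the effect is visible.
        notes = "; ".join(memory_notes) if memory_notes else "none"
        return f"(memory: {notes}) You said: {history[-1].content}"


model: ChatModel = EchoModel()  # type-checks via structural typing
reply = model.answer([ChatTurn("user", "hi")], ["likes Python"])
print(reply)  # (memory: likes Python) You said: hi
```

A static checker such as mypy accepts `EchoModel` wherever a `ChatModel` is expected, which is presumably how the vLLM/Llama/Qwen wrappers plug in without a shared base class.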
