Prerequisites
Expected Behavior
When constructing Llama with the older spelling embedding=True (singular — the parameter name in 0.2.x), one of two things should happen:
- The kwarg is accepted as a deprecated alias of embeddings and a DeprecationWarning is emitted, OR
- A TypeError is raised at construction time, surfacing the issue at the call site rather than swallowing it silently.
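To make the two options concrete, here is a minimal sketch of how a constructor could handle both. StandInModel is a hypothetical stand-in written for this issue, not the real Llama class, and the warning and error texts are placeholders:

```python
import warnings


class StandInModel:
    """Hypothetical stand-in for a constructor that takes **kwargs (not the real Llama)."""

    def __init__(self, model_path, embeddings=False, **kwargs):
        # Option 1: accept the old spelling as a deprecated alias.
        if "embedding" in kwargs:
            warnings.warn(
                "'embedding' is deprecated; use 'embeddings' instead",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller's line
            )
            embeddings = bool(kwargs.pop("embedding")) or embeddings
        # Option 2: reject anything still unrecognized, at the call site.
        if kwargs:
            unknown = ", ".join(sorted(kwargs))
            raise TypeError(f"unexpected keyword argument(s): {unknown}")
        self.model_path = model_path
        self.embeddings = embeddings
```

Either branch alone would satisfy the expected behavior; combining them keeps old callers working while still catching genuine typos.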
Current Behavior
Neither happens. embedding=True is silently swallowed via **kwargs, context_params.embeddings stays at its default False, and the failure surfaces much later — deep inside .embed() — with a misleading error message that suggests the user didn't pass the flag, when in fact they did (just under the historical name).
RuntimeError: Llama model must be created with embeddings=True to call this method
This is especially painful for users integrating older libraries that haven't migrated to the new spelling yet — the error points at the wrong thing.
Environment and Context
- Hardware: x86_64, NVIDIA GeForce RTX 4090
- OS: Windows 10 22H2
- Python 3.12.9
- llama-cpp-python 0.3.36 (CUDA 12.8 prebuilt wheel)
$ python --version
Python 3.12.9
$ pip show llama-cpp-python | findstr Version
Version: 0.3.36
Failure Information (for bugs)
The constructor's **kwargs swallows unknown keyword arguments with no warning, so a typo or stale parameter name produces a delayed, confusing failure rather than an immediate error.
Steps to Reproduce
from llama_cpp import Llama
# Pass the older `embedding` (singular) instead of `embeddings` (plural).
m = Llama(model_path="path/to/model.gguf", embedding=True)
m.embed("hello")
Result:
RuntimeError: Llama model must be created with embeddings=True to call this method
This happens even though embedding=True was passed at construction. The fix is to either accept embedding as a deprecated alias or validate kwargs strictly.
Failure Logs
Traceback (most recent call last):
File "...\Lib\site-packages\llama_cpp\llama.py", line 1602, in embed
raise RuntimeError(
RuntimeError: Llama model must be created with embeddings=True to call this method
Hit while integrating Tencent's HY-Motion text-to-motion model — the hymotion package's text encoder still uses the older embedding= spelling, so anyone running it against llama-cpp-python 0.3.x sees this confusing failure at first inference instead of at construction. Workaround in our case is a runtime monkey-patch that translates embedding → embeddings in Llama.__init__, but that doesn't help anyone else.
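For anyone hitting the same thing, a workaround along these lines can be sketched as below. This is an illustration, not the exact patch we run: alias_init_kwarg is a hypothetical helper name, and it assumes the target class accepts the new keyword in its __init__.

```python
import functools


def alias_init_kwarg(cls, old_name, new_name):
    """Wrap cls.__init__ so a keyword passed as old_name is forwarded as new_name."""
    original_init = cls.__init__

    @functools.wraps(original_init)
    def patched_init(self, *args, **kwargs):
        if old_name in kwargs:
            value = kwargs.pop(old_name)
            # An explicitly passed new-style argument wins over the old one.
            kwargs.setdefault(new_name, value)
        original_init(self, *args, **kwargs)

    cls.__init__ = patched_init
    return cls


# Intended use (requires llama-cpp-python; run before the dependent
# library constructs the model):
#   from llama_cpp import Llama
#   alias_init_kwarg(Llama, "embedding", "embeddings")
```

The patch has to be applied before the dependent package constructs its model, and it only affects the current process, which is why a fix in the library itself would be preferable.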