Commit 299cafd

Run backbone model only for prefill
1 parent 98e1327 commit 299cafd

1 file changed: +2 -2 lines changed

kvpress/pipeline.py

Lines changed: 2 additions & 2 deletions
@@ -195,11 +195,11 @@ def _forward(
         cache = DynamicCache()
 
         with press(self.model) if press is not None else contextlib.nullcontext():
-            self.model(
+            # We run the model without the lm head for pre-filling.
+            self.model.model(
                 input_ids=context_ids,
                 past_key_values=cache,
                 output_attentions=self.output_attentions(press),
-                num_logits_to_keep=1,
             )
 
         logger.debug(f"Context Length: {context_length}")
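
Note on the change: the pre-fill pass exists only to populate the KV cache, so its logits are discarded. The toy sketch below illustrates why calling the backbone directly is sufficient. It is not kvpress or transformers code; `ToyBackbone`, `ToyCausalLM`, and `lm_head_calls` are hypothetical names, and the list-based "cache" only stands in for `DynamicCache`.

```python
# Toy illustration only: a causal-LM wrapper whose `.model` attribute is the
# backbone, mirroring the `self.model.model(...)` call in the diff.


class ToyBackbone:
    """Stands in for the transformer layers; fills the cache as a side effect."""

    def __call__(self, input_ids, past_key_values):
        hidden_states = []
        for token_id in input_ids:
            past_key_values.append(token_id)  # hypothetical cache entry per token
            hidden_states.append(float(token_id))
        return hidden_states


class ToyCausalLM:
    """Backbone plus an LM head that projects hidden states to vocabulary logits."""

    def __init__(self, vocab_size):
        self.model = ToyBackbone()
        self.vocab_size = vocab_size
        self.lm_head_calls = 0  # counts how often the head is evaluated

    def lm_head(self, hidden_state):
        self.lm_head_calls += 1
        return [hidden_state * v for v in range(self.vocab_size)]

    def __call__(self, input_ids, past_key_values):
        hidden_states = self.model(input_ids, past_key_values)
        # Analogous to num_logits_to_keep=1: project only the last position.
        return self.lm_head(hidden_states[-1])


lm = ToyCausalLM(vocab_size=8)
context_ids = [3, 1, 4, 1, 5]

# Before the commit: full model call, the head runs once and its output is unused.
cache_full = []
lm(context_ids, cache_full)
calls_after_full = lm.lm_head_calls

# After the commit: backbone only, same cache contents, no head evaluation.
cache_backbone = []
lm.model(context_ids, cache_backbone)
calls_after_backbone = lm.lm_head_calls - calls_after_full
```

In this sketch both calls leave identical cache contents, and only the full call touches the head. Because the backbone's forward does not take `num_logits_to_keep`, the diff drops that argument together with the head call.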
