Commit ad1989d

fix readme
Signed-off-by: Dominic789654 <xliu29@gmu.edu>
1 parent 99cea94 commit ad1989d

2 files changed: 1 addition and 2 deletions
README.md

Lines changed: 1 addition & 1 deletion
@@ -77,7 +77,7 @@ Finally we provide wrapper presses that can be combined with other presses:
 - `PerLayerCompressionPress` ([source](kvpress/presses/per_layer_compression_press.py)): compress each layer with a different compression ratio (experimental)
 - `ComposedPress` ([source](kvpress/presses/composed_press.py)): compose multiple presses together by chaining their forward hooks
 - `KeyRerotationPress` ([source](kvpress/presses/key_rerotation_press.py)): rerotate pruned keys to have continuous RoPE embeddings
-- `ChunkKVPress` ([source](kvpress/presses/chunkkv_press.py), [paper](https://arxiv.org/abs/2502.00299)): implements the ChunkKV compression method that selects whole chunks based on their importance scores. This approach differs from ChunkPress by maintaining chunk-level granularity during selection, which helps preserve local attention patterns. The method is particularly effective for long sequences where maintaining contextual coherence is important.
+- `ChunkKVPress` ([source](kvpress/presses/chunkkv_press.py), [paper](https://arxiv.org/abs/2502.00299)): compresses by selecting important chunks, preserving semantic coherence
 - `ChunkPress` ([source](kvpress/presses/chunk_press.py), [paper](https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00716/125280)): compress the KV cache on each sequence chunk separately. This can yield to more uniform compression across long sequences
 - `CriticalKVPress` and `CriticalAdaKVPress` ([source](kvpress/presses/criticalkv_press.py), [paper](https://arxiv.org/abs/2502.03805)): refine the scores using the L1 norm of Wo @ values, coupled with a two-stage selection.
 
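
The `ChunkKVPress` entry above describes selecting whole chunks by importance. The following is a minimal sketch of that idea in plain Python, not the kvpress implementation: the function name `select_whole_chunks`, the use of the mean token score as the chunk score, and the rounding rule for the number of kept chunks are all assumptions made for illustration.

```python
from typing import List


def select_whole_chunks(
    scores: List[float], chunk_length: int, compression_ratio: float
) -> List[int]:
    """Return sorted token indices kept when entire chunks are selected.

    Hypothetical helper for illustration only; not part of kvpress.
    """
    n = len(scores)
    # Split token positions into consecutive chunks (the last may be shorter).
    chunks = [list(range(s, min(s + chunk_length, n))) for s in range(0, n, chunk_length)]
    # Assumption: a chunk's importance is the mean of its token scores.
    chunk_scores = [sum(scores[i] for i in c) / len(c) for c in chunks]
    # Keep roughly (1 - compression_ratio) of the chunks, and at least one.
    n_keep = max(1, int(len(chunks) * (1 - compression_ratio)))
    best = sorted(range(len(chunks)), key=lambda j: chunk_scores[j], reverse=True)[:n_keep]
    # Every token of a selected chunk survives, so local context stays contiguous.
    return sorted(i for j in best for i in chunks[j])
```

With scores `[0.1, 0.2, 0.9, 0.8, 0.3, 0.1, 0.7, 0.6]`, `chunk_length=2` and `compression_ratio=0.5`, this sketch keeps indices `[2, 3, 6, 7]`: the two highest-scoring chunks, each retained in full.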

kvpress/presses/chunk_press.py

Lines changed: 0 additions & 1 deletion
@@ -50,7 +50,6 @@ def compress(
         assert attentions is None, "ChunkPress does not support attentions."
 
         kv_len = keys.shape[2]
-
         indices = []
         for i in range(0, kv_len, self.chunk_length):
             chunk_scores = self.press.score(
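
The hunk above is cut off inside the loop, so the rest of `compress` is not visible here. As a rough illustration of what scoring and pruning each chunk separately could look like, here is a self-contained sketch over a flat list of scores. The function name `compress_per_chunk`, the per-chunk top-k rule, and the rounding of the kept count are assumptions, not the code from `chunk_press.py`.

```python
from typing import List


def compress_per_chunk(
    scores: List[float], chunk_length: int, compression_ratio: float
) -> List[int]:
    """Keep the top-scoring positions inside each chunk independently.

    Hypothetical helper for illustration only; not part of kvpress.
    """
    kv_len = len(scores)
    indices: List[int] = []
    for i in range(0, kv_len, chunk_length):
        chunk = scores[i : i + chunk_length]
        # Assumption: keep (1 - compression_ratio) of each chunk, at least one position.
        n_kept = max(1, int(len(chunk) * (1 - compression_ratio)))
        top = sorted(range(len(chunk)), key=lambda j: chunk[j], reverse=True)[:n_kept]
        # Shift chunk-local positions back to positions in the full sequence.
        indices.extend(i + j for j in sorted(top))
    return indices
```

For scores `[0.9, 0.8, 0.7, 0.6, 0.1, 0.2, 0.3, 0.4]` with `chunk_length=4` and `compression_ratio=0.5`, the sketch returns `[0, 1, 6, 7]`. A global top-4 would instead keep `[0, 1, 2, 3]` and drop the second half entirely, which is the kind of non-uniformity the README entry for `ChunkPress` refers to.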
