[Bug] Hash collision in prefix KV cache silently reuses wrong KV block

The prefix cache identifies KV blocks by a content-based hash. If two different token sequences produce the same hash (a collision), `get_cached_block` returns the wrong block without raising an error. The model then attends over incorrect KV values and produces corrupted output.

The probability of this occurring is low but non-zero, and the failure is silent.

#### Cause
`KVCacheBlock` stores only `m_hash`, so on a cache hit there is no way to verify that the block actually corresponds to the expected tokens. The code has a `// TODO: add tokens validation in case of hash collision` for this. I intend to work on this.
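
A minimal sketch of what such a validation could look like, not the project's actual implementation. `CachedBlock`, `PrefixCacheSketch`, the `tokens` field, and `block_index` are hypothetical names introduced here for illustration. Only the idea of comparing tokens on a hash hit comes from the TODO above, and the real `get_cached_block` signature may differ.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for KVCacheBlock. Besides the hash, it keeps the
// token ids the block was built from, so a cache hit can be verified.
struct CachedBlock {
    std::size_t hash;
    std::vector<std::int64_t> tokens;  // assumed extra field, absent from the current code
    int block_index;                   // hypothetical handle to the physical KV block
};

class PrefixCacheSketch {
public:
    // Registers a block under its content hash. An existing entry with the
    // same hash is overwritten in this sketch.
    void store(std::size_t hash, std::vector<std::int64_t> tokens, int block_index) {
        m_blocks[hash] = std::make_shared<CachedBlock>(
            CachedBlock{hash, std::move(tokens), block_index});
    }

    // Returns the cached block only if both the hash and the stored tokens
    // match. A hash hit with different tokens is a collision and is treated
    // as a cache miss (nullptr), so the caller recomputes the KV values.
    std::shared_ptr<CachedBlock> get_cached_block(
            std::size_t hash,
            const std::vector<std::int64_t>& expected_tokens) const {
        auto it = m_blocks.find(hash);
        if (it == m_blocks.end()) {
            return nullptr;  // ordinary miss
        }
        if (it->second->tokens != expected_tokens) {
            return nullptr;  // collision: same hash, different tokens
        }
        return it->second;
    }

private:
    std::unordered_map<std::size_t, std::shared_ptr<CachedBlock>> m_blocks;
};
```

With this shape, a lookup such as `cache.get_cached_block(h, tokens)` degrades a collision into a recomputation instead of silent corruption. The cost is extra memory per block and one token comparison per hit. Which tokens to store (the full prefix or only the block's own tokens) is a design decision this sketch leaves open.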


[Bug] Hash collision in prefix KV cache silently reuses wrong KV block #3722
