Late Chunking (https://arxiv.org/pdf/2409.04701) #32618

@oskrim

Description

Is your feature request related to a problem? Please describe.
Practitioners often split text documents into smaller chunks and embed each chunk separately. However, chunk embeddings created this way can lose contextual information from the surrounding chunks, resulting in sub-optimal representations.

Describe the solution you'd like
Most likely, supporting this would require implementing a new embedder or adding an option to the Hugging Face embedder.
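
For illustration, here is a minimal sketch of the pooling step as I understand it from the linked paper: the whole document is passed through a long-context embedding model once, and each chunk embedding is then the mean of the contextualized token embeddings inside that chunk's token span. The function name `late_chunk_pool`, the plain-list vector type, and the toy values are assumptions made for this example, not an existing embedder API.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def late_chunk_pool(
    token_embeddings: Sequence[Sequence[float]],
    spans: Sequence[Tuple[int, int]],
) -> List[Vector]:
    """Mean-pool contextualized token embeddings over each [start, end) span.

    token_embeddings: one vector per token, produced by a single forward
        pass over the *whole* document (hypothetical input; a real embedder
        would obtain it from the model's last hidden state).
    spans: chunk boundaries expressed as token indices, end-exclusive.
    """
    pooled: List[Vector] = []
    for start, end in spans:
        if not 0 <= start < end <= len(token_embeddings):
            raise ValueError(f"invalid token span: ({start}, {end})")
        window = token_embeddings[start:end]
        dim = len(window[0])
        # Average each dimension across the tokens in this chunk.
        pooled.append(
            [sum(tok[d] for tok in window) / len(window) for d in range(dim)]
        )
    return pooled


# Toy data standing in for model output: 4 tokens, 2 dimensions.
toy_tokens = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0], [2.0, 0.0]]
toy_spans = [(0, 2), (2, 4)]  # two chunks of two tokens each
chunk_vectors = late_chunk_pool(toy_tokens, toy_spans)
```

The difference from the usual approach is only the order of operations: chunk boundaries are applied after the transformer has seen the full document, so each token vector already carries context from neighbouring chunks. In a real implementation the spans would probably be derived from the tokenizer's character offsets.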
