Problem
I'm using huggingface-text-embedding-inference to serve an embedding model. It has a max batch size of 64, and opencode-codebase-index occasionally goes slightly over it, leading to the following errors in the hftei logs:
hftei-gina-1 | {"timestamp":"2026-03-29T00:38:18.185898Z","level":"ERROR","message":"batch size 66 > maximum allowed batch size 64","target":"text_embeddings_router::http::server","filename":"router/src/http/server.rs","line_number":1233,"span":{"name":"openai_embed"},"spans":[{"name":"openai_embed"}]}
hftei-gina-1 | {"timestamp":"2026-03-29T00:38:18.296007Z","level":"ERROR","message":"batch size 68 > maximum allowed batch size 64","target":"text_embeddings_router::http::server","filename":"router/src/http/server.rs","line_number":1233,"span":{"name":"openai_embed"},"spans":[{"name":"openai_embed"}]}
hftei-gina-1 | {"timestamp":"2026-03-29T00:38:19.015925Z","level":"ERROR","message":"batch size 68 > maximum allowed batch size 64","target":"text_embeddings_router::http::server","filename":"router/src/http/server.rs","line_number":1233,"span":{"name":"openai_embed"},"spans":[{"name":"openai_embed"}]}
Would it be possible to add a max batch size to the custom provider definition so I can fully index my codebase without hitting the limit?
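For reference, the overflow is small (66 or 68 inputs against a cap of 64), so splitting the input list into chunks of at most 64 before each request would avoid it entirely. A minimal sketch of that client-side batching (the `embedBatch` callback and both function names here are hypothetical illustrations, not opencode's actual API):

```typescript
// Split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Embed all inputs without ever exceeding the server's batch limit.
// `embedBatch` stands in for whatever function sends one request to
// the embeddings endpoint and returns one vector per input.
async function embedAll(
  inputs: string[],
  embedBatch: (batch: string[]) => Promise<number[][]>,
  maxBatchSize = 64,
): Promise<number[][]> {
  const results: number[][] = [];
  for (const batch of chunk(inputs, maxBatchSize)) {
    results.push(...(await embedBatch(batch)));
  }
  return results;
}
```

With this, a 66-item index pass would be sent as one request of 64 inputs and one of 2, instead of a single rejected request of 66.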
Proposed Solution
I'd like a max_batch_size configuration parameter in the custom provider configuration that limits how many inputs are batched into a single embedding request.
Alternatives Considered
N/A
Additional Context
N/A