fix(generate): avoid None entries in merged logits_processors #1230

Open
BLuchterhand wants to merge 1 commit into ml-explore:main from BLuchterhand:fix/logits-processors-none-element-iteration

BLuchterhand commented Apr 29, 2026

Bug

PromptProcessingBatch.extend filled missing per-slot logits_processors with [None] * N. Merging an unconfigured batch with a processor-equipped batch produced a list shaped [None, ..., [fn], ...]. GenerationBatch._step (line 1346) iterates self.logits_processors[e] under the any() guard at line 1337, which raises TypeError: 'NoneType' object is not iterable on the None slots.
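
The failure mode can be modelled without mlx-lm. The sketch below is a simplified stand-in for the per-slot loop, not the code in generate.py: `apply_processors` and `passthrough` are hypothetical names, and the `any(...)` guard is assumed to test the outer list.

```python
from typing import Callable, List, Optional


def passthrough(tokens, logits):
    """Hypothetical processor that leaves the logits unchanged."""
    return logits


# Shape produced by the merge: slot 0 came from the unconfigured batch,
# slot 1 from the batch that had processors.
merged: List[Optional[List[Callable]]] = [None, [passthrough]]


def apply_processors(per_slot, tokens, logits_rows):
    """Simplified model of a per-slot processor loop (assumed structure)."""
    if any(per_slot):  # truthy because slot 1 holds a non-empty list
        for e, row in enumerate(logits_rows):
            for processor in per_slot[e]:  # TypeError when per_slot[e] is None
                row = processor(tokens, row)
            logits_rows[e] = row
    return logits_rows


try:
    apply_processors(merged, tokens=[], logits_rows=[[0.0], [0.0]])
    error = None
except TypeError as exc:
    error = str(exc)  # "'NoneType' object is not iterable"
```

With `[]` in place of `None` the inner loop simply runs zero times for the unconfigured slot.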

Reproduce

a = PromptProcessingBatch.empty(model, fallback)
a.uids = [0]; a.logits_processors = []
b = PromptProcessingBatch.empty(model, fallback)
b.uids = [1]; b.logits_processors = [make_logits_processors({0: 2000.0})]
a.extend(b)
# a.logits_processors == [None, [<fn>]]   ← later crashes _step

Fix

-            self.logits_processors = [None] * len(self.uids)
+            self.logits_processors = [[]] * len(self.uids)
...
-            else [None] * len(batch.uids)
+            else [[]] * len(batch.uids)

Per-slot type is List[Callable], so the absent-value sentinel should be [], not None. Matches the existing [[]] * len(keep) at line 1120 (PromptProcessingBatch.filter).
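
One Python detail to keep in mind with this sentinel: `[[]] * n` repeats a reference to a single list, so all slots share one object. That is harmless as long as slots are only read or replaced wholesale, which is an assumption about the surrounding code rather than something checked here. A minimal illustration in plain Python:

```python
n = 3

# List repetition copies the reference: every slot is the same list object.
shared = [[]] * n
shared_is_aliased = all(slot is shared[0] for slot in shared)

# A comprehension builds one fresh list per slot.
independent = [[] for _ in range(n)]
independent_is_distinct = len({id(slot) for slot in independent}) == n

# In-place mutation exposes the difference.
shared[0].append("fn")       # visible through every slot
independent[0].append("fn")  # visible through slot 0 only
```

If any caller ever appends to a slot in place, `[[] for _ in range(n)]` would be the safer spelling; otherwise the two forms behave the same.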

Test

test_prompt_processing_batch_extend_mixes_logits_processors in tests/test_generate.py. Asserts no None entries remain in the merged list. Fails on main (AssertionError: None is not an instance of <class 'list'>), passes with the fix.
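
The shape of that assertion can be sketched against a stand-in merge function. `merge_processors` is a hypothetical helper that mimics the fill-and-concatenate step; it is not the test added in this PR and does not touch PromptProcessingBatch.

```python
import unittest


def merge_processors(left, left_n, right, right_n):
    """Hypothetical stand-in: fill a missing side with empty lists, then join."""
    left = left if left else [[] for _ in range(left_n)]
    right = right if right else [[] for _ in range(right_n)]
    return left + right


class MergeShapeTest(unittest.TestCase):
    def test_no_none_entries(self):
        fn = lambda tokens, logits: logits  # placeholder processor
        merged = merge_processors([], 1, [[fn]], 1)
        self.assertEqual(len(merged), 2)
        for slot in merged:
            # Every slot must be iterable, including the unconfigured one.
            self.assertIsInstance(slot, list)
```

Run it with `python -m unittest` from the file's directory. Filling with `None` instead of `[]` makes `assertIsInstance` fail with the same kind of message quoted above.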

Related

#1225 fixes the symmetric stale-length bug in GenerationBatch.filter. This fix touches a different code path, so the two can land independently.

Traceback (production hit)

File "mlx_lm/generate.py", line 1346, in _step
    for processor in self.logits_processors[e]:
TypeError: 'NoneType' object is not iterable

BLuchterhand force-pushed the fix/logits-processors-none-element-iteration branch 2 times, most recently from b279bd1 to b58f987, on April 29, 2026 21:37
BLuchterhand marked this pull request as draft on April 29, 2026 21:42
PromptProcessingBatch.extend filled missing per-slot logits_processors
with [None] when either side lacked configured processors. Merging an
unconfigured batch with a processor-equipped batch then produced a list
shaped like [None, ..., [fn], ...]. GenerationBatch._step at line 1346
iterates self.logits_processors[e] under the any() guard at line 1337,
which raises TypeError on the None slots.

Fill with [[]] instead. Matches the existing pattern at line 1120
(filter() restoring [[]] * len(keep)) and the per-slot type
List[Callable].

Reproduce: construct two PromptProcessingBatch instances, one without
processors and one with, then call extend; the merged
self.logits_processors contains None entries. New unit test covers this
shape directly.
BLuchterhand changed the title from "fix(generate): handle None entries in GenerationBatch logits_processors" to "fix(generate): avoid None entries in merged logits_processors" on Apr 29, 2026
BLuchterhand force-pushed the fix/logits-processors-none-element-iteration branch from b58f987 to 423301b on April 29, 2026 21:47
BLuchterhand marked this pull request as ready for review on April 29, 2026 21:47
@mloiterman

+1 — independent production hit, batched-inference server, mlx-lm 0.31.3.

Our workload mixes grammar-mode requests (RAG hybrid search using response_format=json_schema) with plain chat requests, both arriving concurrently and batched together by BatchGenerator. The mixed batch is exactly the shape your repro produces: at least one request with logits_processors and at least one without. Single-workload runs (all-grammar or all-plain) don't trip it.

Live traceback at batch_size=8:

File "mlx_lm/generate.py", line 1346, in _step
    for processor in self.logits_processors[e]:
TypeError: 'NoneType' object is not iterable

The diagnosis matches yours exactly. The supervisor in our setup catches the crash via BrokenPipeError, respawns the engine subprocess (~60 s), and in-flight requests return 500 — but that's our restart machinery doing its job, not a real recovery. Without the patch the bug is fatal to any heterogeneous batched workload.

We're applying your diff to our local .venv as a stopgap (mirrors how we're running #1225's patch today, since neither has merged yet). Vouching for the fix shape: [] is the right sentinel for List[Callable], matching the existing [[]] * len(keep) in PromptProcessingBatch.filter:1120.

Validated locally: a regression probe that fires a long plain request followed by a grammar request ~300 ms later (so the grammar request extend()s the existing plain batch) reliably FAILS with 500 {"error":"Generation failed"} on the un-patched engine, and PASSES once the engine respawns onto the patched code.
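
A probe of that shape could look roughly like the sketch below. It is an illustration only: `run_probe`, the `send` callable, and the payload fields are hypothetical, and `send` has to be wired to whatever HTTP client and endpoint the server actually exposes.

```python
import threading
import time
from typing import Callable, Dict


def run_probe(send: Callable[[Dict], int], delay_s: float = 0.3) -> Dict[str, int]:
    """Send a long plain request, then a grammar request `delay_s` later.

    `send` posts one payload and returns the HTTP status code (assumed
    interface). Returns the status observed for each request.
    """
    statuses: Dict[str, int] = {}

    def fire(name: str, payload: Dict) -> None:
        statuses[name] = send(payload)

    # Start the long plain request in the background so it occupies the batch.
    plain = threading.Thread(target=fire, args=("plain", {"max_tokens": 2048}))
    plain.start()
    time.sleep(delay_s)

    # The grammar request arrives while the plain one is still generating.
    fire("grammar", {"max_tokens": 64, "response_format": {"type": "json_schema"}})
    plain.join()
    return statuses
```

The probe passes when both statuses are 200; on the unpatched engine the expectation from the report above is a 500 for at least one of them.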

Related: #1225 lands the symmetric fix in GenerationBatch.filter. Different code paths, same sentinel argument — both are safe to land independently.
