Pull requests: ml-explore/mlx-lm
Fix ArraysCache missing is_trimmable/trim for hybrid model prompt cache
#1254
opened May 6, 2026 by
EagerofLight
Fix BatchRotatingKVCache rotated flag deserializing to True
#1251
opened May 6, 2026 by
odysa
Fix mlx_lm.server --adapter-path silently ignored at startup
#1249
opened May 6, 2026 by
odysa
fix: wrap ast.literal_eval in try/except for Qwen3 tool parser
#1239
opened May 3, 2026 by
lawcontinue
fix(gemma4): add stop_gradient on MoE router top_k_indices
#1238
opened May 2, 2026 by
TrentCarter
[transformers-to-mlx skill] Add Talkie (TalkieForCausalLM) model
#1231
opened Apr 30, 2026 by
warshanks
fix(generate): avoid None entries in merged logits_processors
#1230
opened Apr 29, 2026 by
BLuchterhand
Skip lm_head on non-rank-0 pipeline-parallel ranks
#1228
opened Apr 29, 2026 by
lawcontinue
[transformers-to-mlx skill] Add bailing_hybrid (Ling-2.6-flash) model
#1227
opened Apr 29, 2026 by
ivanfioravanti
Contributor
generate.py: GenerationBatch.filter — add else branches so logits_processors / samplers length stays in lockstep with uids
#1225
opened Apr 28, 2026 by
mloiterman
Support per-expert MoE checkpoints in qwen3_5_moe.sanitize, plus FP8 dequant
#1224
opened Apr 28, 2026 by
sdayal
feat(server): opt-in disk-backed L2 prompt cache (--prompt-cache-disk-dir)
#1218
opened Apr 27, 2026 by
freddyhaddad
Add Metal VJP kernel for gated_delta_update (trainable Qwen3.5 / Qwen3-Next LoRA on Apple Silicon)
#1217
opened Apr 27, 2026 by
SudarkinV
fix(utils): skip already-quantized layers in load_model._quantize predicate
#1216
opened Apr 27, 2026 by
adurham
Contributor