v1.1.0: Python SDK, audio/image generation, logging overhaul, Modelfiles #3
Replies: 1 comment
Great release. The Python SDK plus the local audio/image endpoints make this a much more complete local OpenAI-compatible stack.

A few engineering notes that might help tighten the v1.1.0 rollout and docs:
### 1. Clarify the default server port
The release examples use:
```text
http://localhost:6591
```

for admin endpoints, but the README quick start says the server runs on:

```text
localhost:8080
```

If both are valid depending on install mode/config, it may be worth adding one line like:

```text
Default server URL: http://localhost:8080
Admin examples below assume NOVA_MLX_URL=http://localhost:6591 or a custom configured port.
```

That would prevent users from thinking the admin API or Modelfile API is broken when they are only hitting the wrong port.

### 2. Python SDK versioning may be confusing

The release is v1.1.0, but the SDK package metadata shows:

```toml
name = "novamlx"
version = "0.1.0"
```

If the SDK is intentionally versioned separately from the app/server, I would document that explicitly. Otherwise, aligning the SDK package version with the release tag would make debugging and issue reports easier. Users can check the installed SDK version with:

```bash
python -c "import importlib.metadata as m; print(m.version('novamlx'))"
```

That way, a user can quickly report the SDK version on its own instead of mixing app and SDK versions.

### 3. Suggested SDK smoke-test section

Since the SDK is new in this release, a short smoke-test section would help:

```bash
cd sdk/python
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"
python - <<'PY'
import novamlx
from novamlx import Client, AdminClient
print("novamlx import OK")
print("Client:", Client)
print("AdminClient:", AdminClient)
PY
```

Then a minimal chat smoke test:

```bash
python examples/basic_chat.py
```

and a streaming smoke test:

```bash
python examples/streaming.py
```

This gives users a fast way to separate import failures from chat or streaming failures.

### 4. Audio/image endpoint examples would be useful

The new local endpoints are one of the strongest parts of this release:

- `POST /v1/audio/transcriptions`
- `POST /v1/images/generations`

It would be helpful to include copy-paste `curl` examples.

For audio transcription:

```bash
curl http://localhost:8080/v1/audio/transcriptions \
  -H "Authorization: Bearer $KEY" \
  -F "file=@sample.wav" \
  -F "model=qwen3-asr"
```

For image generation:

```bash
curl http://localhost:8080/v1/images/generations \
  -H "Authorization: Bearer $KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sdxl-turbo",
    "prompt": "A clean product shot of a futuristic local AI server running on Apple Silicon",
    "size": "1024x1024",
    "n": 1
  }'
```

Even if the exact model names differ internally, having the expected request shape documented would make the endpoints much easier to test.

### 5. Modelfiles: document the resulting model name

The Modelfile system looks very useful. One thing I would make explicit is how the created Modelfile appears in `/v1/models`.

For example, after:

```bash
curl -X POST http://localhost:6591/admin/modelfiles/create \
  -H "Authorization: Bearer $KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "math-tutor",
    "from": "mlx-community/Qwen3-27B-4bit",
    "system": "You are a math tutor.",
    "parameters": {
      "temperature": 0.3
    }
  }'
```

can users immediately call it like this?

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer $KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "math-tutor",
    "messages": [
      {
        "role": "user",
        "content": "Explain eigenvalues in simple terms."
      }
    ]
  }'
```

If yes, showing that second request would make the feature much clearer.

### 6. Runtime log-level API is a strong operations feature

The runtime log-level endpoint is a nice ops improvement:

```bash
curl -X PUT http://localhost:6591/admin/api/log-level \
  -H "Authorization: Bearer $KEY" \
  -H "Content-Type: application/json" \
  -d '{"level": 0}'
```

One useful addition would be a read-back endpoint or example, something like:

```bash
curl http://localhost:6591/admin/api/log-level \
  -H "Authorization: Bearer $KEY"
```

That would let operators confirm the current runtime level after changing it.

### 7. Security/admin API note

Because this release adds more admin surface area (Modelfiles, runtime log level, auto-load behavior, Tokenhub integration), I would add a short security note in the docs:

- Do not expose the admin API directly to the public internet.
- Bind locally or put it behind authentication/reverse proxy controls.
- Rotate the admin key if it is leaked.

This matters especially because Modelfiles can influence model behavior, prompts, sampling parameters, and what appears in `/v1/models`.

### Overall

This looks like a major release: SDK, audio, image generation, Modelfiles, log rotation, and runtime log levels. The main things I would polish are the points above.

Nice work on the release.
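P.S. On the port question in point 1, one way the examples could sidestep the confusion is to resolve the base URL in a single place. Below is a minimal Python sketch, not taken from the SDK: `resolve_base_url` and `endpoint` are hypothetical helper names, `NOVA_MLX_URL` is only the variable name I suggested above, and the `8080` default is what the README quick start appears to use.

```python
import os

# Assumed default, based on the README quick start mentioned in this thread.
DEFAULT_SERVER_URL = "http://localhost:8080"


def resolve_base_url(env=None):
    """Return the base URL, preferring NOVA_MLX_URL when it is set and non-empty."""
    env = os.environ if env is None else env
    url = env.get("NOVA_MLX_URL", "").strip()
    # Fall back to the default and drop any trailing slash so joins stay clean.
    return (url or DEFAULT_SERVER_URL).rstrip("/")


def endpoint(path, env=None):
    """Join an API path such as /v1/models onto the resolved base URL."""
    return resolve_base_url(env) + "/" + path.lstrip("/")


if __name__ == "__main__":
    # Prints the URL a client would hit for the model list.
    print(endpoint("/v1/models"))
```

With something like this, every documented example could print the URL it is about to call, which makes a wrong-port mistake obvious at a glance.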
NovaMLX v1.1.0 is out. This release adds a Python SDK, local audio transcription and image generation, user-authored Modelfiles, and reworks the logging system. Download (DMG or tar.gz).
Python SDK: `sdk/python/` ships a full client with type hints, streaming support, admin API, tool calling, and thinking model examples. Install via `pip install -e sdk/python/`.

Audio transcription & image generation: two new endpoints running locally through MLX:

- `POST /v1/audio/transcriptions`: Qwen3-ASR, supports WAV/MP3/M4A/FLAC
- `POST /v1/images/generations`: SDXL-Turbo, ~2s per image on M-series

Modelfile system: define model recipes with system prompts and sampling overrides in a single file. Modelfiles appear as models in `/v1/models`.

Logging overhaul: log files now rotate (keeps last 5) instead of truncating. Runtime log level is adjustable without restart via the admin API.
Levels: 0=debug, 1=info (default), 2=warning, 3=error.
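The level mapping above can be wrapped in a small helper so scripts pass names instead of numbers. This is a sketch rather than SDK code: `build_log_level_request` is a hypothetical name, and the `PUT /admin/api/log-level` path with a `{"level": n}` body is assumed from the `curl` example quoted in this thread.

```python
import json
import urllib.request

# Numeric levels as listed in the release notes.
LOG_LEVELS = {"debug": 0, "info": 1, "warning": 2, "error": 3}


def build_log_level_request(base_url, api_key, level_name):
    """Build (but do not send) a PUT request that sets the runtime log level."""
    if level_name not in LOG_LEVELS:
        raise ValueError(f"unknown log level: {level_name!r}")
    body = json.dumps({"level": LOG_LEVELS[level_name]}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/admin/api/log-level",  # assumed path
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="PUT",
    )
```

To actually apply the change, pass the result to `urllib.request.urlopen(...)` against whichever host and port your admin API listens on.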
Other additions:

- `keep_alive`: override model TTL per request
- `reasoning_effort` parameter: OpenAI-standard thinking budget control
- `logprobs` and `top_logprobs` support
- `nova.capabilities` exposed on `/v1/models`
- `Scripts/test-all-models.sh`
- `architecture.md`

Bug fixes:

- `tool_calls` and `tool_call_id` (OpenAI + Anthropic)
- `prompt_tokens` plumbed through to usage stats
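The request-level additions listed above (`keep_alive`, `reasoning_effort`, `logprobs`, `top_logprobs`) might be combined in one chat request roughly as follows. This is only a sketch of the payload shape: `build_chat_payload` is a hypothetical helper, the `"10m"` duration format for `keep_alive` is an assumption, and the `reasoning_effort` and `logprobs` fields follow the OpenAI chat completions conventions rather than anything confirmed for NovaMLX.

```python
import json


def build_chat_payload(model, prompt, *, keep_alive=None,
                       reasoning_effort=None, top_logprobs=None):
    """Build a chat completions body, adding optional fields only when given."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if keep_alive is not None:
        payload["keep_alive"] = keep_alive  # value format is an assumption
    if reasoning_effort is not None:
        payload["reasoning_effort"] = reasoning_effort  # e.g. "low", per OpenAI
    if top_logprobs is not None:
        # In the OpenAI API, top_logprobs is only valid when logprobs is true.
        payload["logprobs"] = True
        payload["top_logprobs"] = top_logprobs
    return payload


if __name__ == "__main__":
    body = build_chat_payload(
        "mlx-community/Qwen3-27B-4bit",  # placeholder model name
        "Explain eigenvalues in simple terms.",
        keep_alive="10m",
        reasoning_effort="low",
        top_logprobs=3,
    )
    print(json.dumps(body, indent=2))
```

Leaving the optional fields out yields a plain OpenAI-style request, so the same helper works against servers that do not support the new parameters.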