Add server_url parameter for self-hosting Trackio servers #510
Closes #498

Introduces a new `trackio_url` kwarg on `trackio.init()` (and matching `TRACKIO_SERVER_URL` env var) that logs metrics to a self-hosted Trackio HTTP endpoint instead of a Hugging Face Space. Mutually exclusive with `space_id`. The value must be a full http(s) URL; HF-specific knobs (`space_storage`, `dataset_id`, `bucket_id`, Space creation/metadata) are skipped when a URL is provided.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Install Trackio from this PR (includes built frontend): `pip install "https://huggingface.co/buckets/trackio/trackio-wheels/resolve/69940315434ae949e874d16fd63eab60a8fabcc5/trackio-0.23.0-py3-none-any.whl"`
Pull request overview
This PR adds a first-class way to point trackio.init() at a self-hosted Trackio HTTP server (via server_url / TRACKIO_SERVER_URL) as an alternative to Hugging Face Space-backed remote logging.
Changes:
- Add `server_url` + env var resolution and branch the `init()` flow to use a URL as the remote source (skipping Space creation).
- Add an e2e test that starts a local `trackio.show()` server and logs to it via `server_url`.
- Expand docs (including a new “Self-host the Server” page) and add a changeset entry for the new feature.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| trackio/utils.py | Adds helper to resolve space_id vs TRACKIO_SERVER_URL with precedence rules. |
| trackio/init.py | Adds server_url parameter, validation, and remote init behavior using Space ID or URL. |
| tests/e2e-local/test_api.py | Adds e2e coverage for logging to a locally hosted server via server_url. |
| docs/source/track.md | Documents remote logging via Space or self-hosted server. |
| docs/source/self_hosted_server.md | New guide for running the server and pointing training scripts at it. |
| docs/source/index.md | Updates marketing bullets to mention self-hosted server support. |
| docs/source/environment_variables.md | Documents TRACKIO_SERVER_URL. |
| docs/source/_toctree.yml | Adds the self-hosting page to the docs nav. |
| README.md | Adds a “Self-hosted Trackio server” section and updates feature bullets. |
| .changeset/vast-yaks-wait.md | Declares a minor release for the new server_url feature. |
Comments suppressed due to low confidence (1)
.changeset/vast-yaks-wait.md:6
- Changeset message formatting: other changesets use `feat: ...` with a space after the colon. Please update this entry to match the repo’s convention (and consider splitting into a short title + details if desired).
feat:Add server_url and TRACKIO_SERVER_URL for self-hosted servers; space_id and TRACKIO_SPACE_ID take precedence when both are set
| remote_client = RemoteClient( | ||
| space_id, | ||
| remote_source, | ||
| hf_token=huggingface_hub.utils.get_token(), |
Security: when remote_source is a user-provided URL (server_url), passing huggingface_hub.utils.get_token() into RemoteClient will send the user's Hugging Face token as an HTTP Authorization header to that server. Please avoid attaching HF tokens when the remote target is not a Hugging Face Space (e.g., pass hf_token=None for URLs and only use HF tokens for resolved HF Space sources). Also ensure the fallback RemoteClient construction paths in the _safe_get_*_for_init helpers follow the same rule.
| hf_token=huggingface_hub.utils.get_token(), | |
| hf_token=None | |
| if server_url is not None | |
| else huggingface_hub.utils.get_token(), |
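The rule in this suggestion can be isolated into a small helper so that every `RemoteClient` construction path applies it the same way. The sketch below is illustrative only: `select_hf_token` is a hypothetical name, not part of Trackio, and the token getter is passed in so the example runs without `huggingface_hub` installed.

```python
from typing import Callable, Optional


def select_hf_token(
    server_url: Optional[str],
    get_token: Callable[[], Optional[str]],
) -> Optional[str]:
    """Return a Hugging Face token only when the target is a Hugging Face Space.

    `select_hf_token` is a hypothetical helper used for illustration.
    """
    if server_url is not None:
        # Self-hosted target: never forward Hugging Face credentials,
        # and do not even read the token from the local cache.
        return None
    return get_token()
```

In the PR's code this would presumably be called as `select_hf_token(server_url, huggingface_hub.utils.get_token)`, both in `init()` and in the fallback paths the comment mentions.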
| Full URL of a self-hosted Trackio server (HTTP or HTTPS), **including the `write_token` query parameter** (same URL as the dashboard’s write-access link or the `full_url` from `trackio.show()`). When set, `trackio.init()` sends metrics to that server. Equivalent to passing `server_url=` to `trackio.init()`. | ||
| **Precedence:** If `TRACKIO_SPACE_ID` is also set (or `space_id` is passed in code), the Hugging Face Space is used and `TRACKIO_SERVER_URL` is ignored. Same rule when both `space_id` and `server_url` are passed: `space_id` wins. | ||
| See [Self-host the Server](self_hosted_server.md). | ||
| ```bash | ||
| export TRACKIO_SERVER_URL="http://127.0.0.1:7860?write_token=YOUR_TOKEN" |
This env var section states the URL must include write_token, but the current HTTP client drops base URL query strings when issuing requests and the self-hosted server doesn't require write_token for metric ingestion. Please clarify what the token is actually used for (mutations vs logging) or update the client/server so the token is enforced and propagated.
| Full URL of a self-hosted Trackio server (HTTP or HTTPS), **including the `write_token` query parameter** (same URL as the dashboard’s write-access link or the `full_url` from `trackio.show()`). When set, `trackio.init()` sends metrics to that server. Equivalent to passing `server_url=` to `trackio.init()`. | |
| **Precedence:** If `TRACKIO_SPACE_ID` is also set (or `space_id` is passed in code), the Hugging Face Space is used and `TRACKIO_SERVER_URL` is ignored. Same rule when both `space_id` and `server_url` are passed: `space_id` wins. | |
| See [Self-host the Server](self_hosted_server.md). | |
| ```bash | |
| export TRACKIO_SERVER_URL="http://127.0.0.1:7860?write_token=YOUR_TOKEN" | |
| Base URL of a self-hosted Trackio server (HTTP or HTTPS). When set, `trackio.init()` sends metrics to that server. Equivalent to passing `server_url=` to `trackio.init()`. | |
| Do **not** rely on a `write_token` query parameter in `TRACKIO_SERVER_URL` for metric ingestion. Write-access links (such as the dashboard’s write-access URL or the `full_url` from `trackio.show()`) may include `write_token`, but that token is for write-enabled dashboard / mutation flows rather than the logging path configured by this environment variable. | |
| **Precedence:** If `TRACKIO_SPACE_ID` is also set (or `space_id` is passed in code), the Hugging Face Space is used and `TRACKIO_SERVER_URL` is ignored. Same rule when both `space_id` and `server_url` are passed: `space_id` wins. | |
| See [Self-host the Server](self_hosted_server.md). | |
| ```bash | |
| export TRACKIO_SERVER_URL="http://127.0.0.1:7860" |
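The precedence rule in the documentation above (any Space setting beats any server URL setting) can be sketched as a small resolver. This is an assumption-laden illustration, not the helper added in `trackio/utils.py`: the function name `resolve_remote_source`, its return shape, and the injectable `env` mapping are all hypothetical.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_remote_source(
    space_id: Optional[str] = None,
    server_url: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[Optional[str], Optional[str]]:
    """Return (kind, value); kind is "space", "server", or None for local logging.

    Hypothetical helper: a Space (kwarg or TRACKIO_SPACE_ID) always wins over
    a server URL (kwarg or TRACKIO_SERVER_URL), as the docs describe.
    """
    env = os.environ if env is None else env

    # Space sources are checked first, so they shadow any server URL.
    resolved_space = space_id or env.get("TRACKIO_SPACE_ID")
    if resolved_space:
        return "space", resolved_space

    resolved_url = server_url or env.get("TRACKIO_SERVER_URL")
    if resolved_url:
        return "server", resolved_url

    return None, None
```

Passing `env` explicitly keeps the function easy to test without mutating the process environment.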
| ## Self-hosted Trackio server | ||
| You can run the Trackio dashboard and API on your own machine or infrastructure and point training jobs at it over HTTP. Pass the **full URL including the `write_token` query** (as printed when the server starts, or use the `full_url` return value from `trackio.show()`): | ||
| ```py | ||
| trackio.init(project="my-project", server_url="http://127.0.0.1:7860?write_token=YOUR_TOKEN") | ||
| ``` | ||
| You can also set `TRACKIO_SERVER_URL` to that full URL instead of passing `server_url`. If `space_id` / `TRACKIO_SPACE_ID` and `server_url` / `TRACKIO_SERVER_URL` are both set, Trackio uses the Hugging Face Space and ignores the self-hosted URL. |
README instructs passing a server_url with write_token, but the current RemoteClient drops URL query params when constructing API requests, so that token will not reach endpoints that require it (e.g., rename/delete). Please either implement token propagation/enforcement for self-hosted servers or adjust the README wording to match current behavior (token only for UI/mutations, not for metric logging).
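To make the dropped-query behavior concrete, here is a minimal reproduction, assuming the client builds request URLs with Python's `urllib.parse.urljoin`. The `/bulk_log` path is a placeholder for whatever endpoint the client actually calls.

```python
from urllib.parse import urljoin, urlsplit

# Base URL as the README suggests passing it, query string included.
base = "http://127.0.0.1:7860?write_token=YOUR_TOKEN"

# Joining a path replaces the base URL's path and discards its query string.
request_url = urljoin(base, "/bulk_log")  # placeholder endpoint path

print(request_url)                  # http://127.0.0.1:7860/bulk_log
print(urlsplit(request_url).query)  # empty string: write_token is gone
```

This follows standard URL reference resolution, where the query of the resolved URL comes from the relative reference rather than from the base, so the token has to be carried separately if the server is meant to see it.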
| Full URL of a self-hosted Trackio server, including the ``write_token`` query | ||
| parameter (for example the ``full_url`` value returned by ``trackio.show()``, | ||
| or the write-access URL printed when the dashboard starts). Logging and other | ||
| remote calls require that token; a base URL without ``write_token`` is not | ||
| enough. Example: | ||
| ``"https://trackio.internal.example.com?write_token=..."``. When set, metrics are sent to | ||
| that server over HTTP instead of creating or syncing to a Hugging Face | ||
| Space. Can also be set via the `TRACKIO_SERVER_URL` environment variable. | ||
| Ignored when `space_id` or `TRACKIO_SPACE_ID` is set. |
The server_url docstring says the write_token query parameter is required for logging/remote calls, but (1) the self-hosted server's /bulk_log endpoints currently don't require a write_token, and (2) RemoteClient builds request URLs with urljoin, which drops the base URL query string—so a write_token in server_url would not be sent to endpoints that do require it (e.g., delete/rename). Please either implement token propagation/auth for server_url or adjust the docs to describe what the token actually gates.
| Full URL of a self-hosted Trackio server, including the ``write_token`` query | |
| parameter (for example the ``full_url`` value returned by ``trackio.show()``, | |
| or the write-access URL printed when the dashboard starts). Logging and other | |
| remote calls require that token; a base URL without ``write_token`` is not | |
| enough. Example: | |
| ``"https://trackio.internal.example.com?write_token=..."``. When set, metrics are sent to | |
| that server over HTTP instead of creating or syncing to a Hugging Face | |
| Space. Can also be set via the `TRACKIO_SERVER_URL` environment variable. | |
| Ignored when `space_id` or `TRACKIO_SPACE_ID` is set. | |
| Base URL of a self-hosted Trackio server. For example: | |
| ``"https://trackio.internal.example.com"``. If you are using a write-access | |
| URL returned by ``trackio.show()`` or printed when the dashboard starts, it | |
| may also include a ``write_token`` query parameter, for example | |
| ``"https://trackio.internal.example.com?write_token=..."``. Depending on the | |
| server configuration, that token may be required for some write/admin | |
| operations, but it is not required for all logging requests. When set, | |
| metrics are sent to that server over HTTP instead of creating or syncing to | |
| a Hugging Face Space. Can also be set via the `TRACKIO_SERVER_URL` | |
| environment variable. Ignored when `space_id` or `TRACKIO_SPACE_ID` is set. |
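If the other option is taken and the token is propagated instead, one possible approach is to split the query off the base URL and re-attach it to every request URL. The sketch below is hypothetical: `build_endpoint_url` is not a Trackio function, the endpoint path is a placeholder, and whether the server should accept the token as a query parameter rather than a header is a separate design decision.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_endpoint_url(server_url: str, endpoint: str) -> str:
    """Join `endpoint` onto `server_url` while keeping the base URL's query.

    Hypothetical helper: unlike `urljoin`, the base query string (for
    example `write_token=...`) is preserved on the resulting URL.
    """
    parts = urlsplit(server_url)

    # Keep any path prefix on the base URL and append the endpoint to it.
    base_path = parts.path.rstrip("/")
    path = f"{base_path}/{endpoint.lstrip('/')}"

    # Re-encode the original query parameters onto the request URL.
    query = urlencode(parse_qsl(parts.query))
    return urlunsplit((parts.scheme, parts.netloc, path, query, ""))
```

For example, `build_endpoint_url("https://trackio.internal.example.com?write_token=abc", "/bulk_log")` yields `https://trackio.internal.example.com/bulk_log?write_token=abc`. Tokens in query strings tend to end up in access logs, so an `Authorization` header may be preferable if the server is changed to enforce the token.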
Going to merge this in so that we can test all of the changes in conjunction with each other.
Summary
Closes #498.
Adds a `server_url` kwarg on `trackio.init()` (and matching `TRACKIO_SERVER_URL` env var) as a first-class alternative to `space_id`. When set, metrics are sent to the HTTP endpoint directly; no Hugging Face Space is created or synced to. This is intended for users who self-host a Trackio server (`trackio show --host 0.0.0.0`) on their own infrastructure.