Skip to content

Traces in Trackio#518

Merged
abidlabs merged 19 commits into
mainfrom
trace-proposal
Apr 22, 2026
Merged

Traces in Trackio#518
abidlabs merged 19 commits into
mainfrom
trace-proposal

Conversation

@abidlabs
Copy link
Copy Markdown
Member

@abidlabs abidlabs commented Apr 18, 2026

Edit: see below: #518 (comment)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@gradio-pr-bot
Copy link
Copy Markdown
Contributor

gradio-pr-bot commented Apr 18, 2026

🦄 change detected

This Pull Request includes changes to the following packages.

Package Version
trackio minor

  • Traces in Trackio

‼️ Changeset not approved. Ensure the version bump is appropriate for all packages before approving.

  • Maintainers can approve the changeset by checking this checkbox.

Something isn't right?

  • Maintainers can change the version label to modify the version bump.
  • If the bot has failed to detect any changes, or if this pull request needs to update multiple packages to different versions or requires a more comprehensive changelog entry, maintainers can update the changelog file directly.

@gradio-pr-bot
Copy link
Copy Markdown
Contributor

gradio-pr-bot commented Apr 18, 2026

🪼 branch checks and previews

Name Status URL
🦄 Changes detected! Details

@HuggingFaceDocBuilderDev
Copy link
Copy Markdown

HuggingFaceDocBuilderDev commented Apr 18, 2026

🪼 branch checks and previews

Name Status URL
Spaces ready! Spaces preview

Install Trackio from this PR (includes built frontend)

pip install "https://huggingface.co/buckets/trackio/trackio-wheels/resolve/aa0f89bd16f476ebf7b7ea8e56544a89e3f148f5/trackio-0.24.2-py3-none-any.whl"

@HuggingFaceDocBuilderDev
Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Copy link
Copy Markdown
Collaborator

@sergiopaniego sergiopaniego left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the proposal!!
One question: would it support rendering images?
possible use case: you're training with GRPO + an env (e.g., OpenEnv), and the env returns a list of images (e.g., a browser env returning screenshots). It'd be nice to render them inline with the messages

@qgallouedec
Copy link
Copy Markdown
Collaborator

Very cool! looking forward to integrate this!

@abidlabs abidlabs changed the title Proposal: trackio.Trace for GRPO rollout logging Traces in Trackio Apr 20, 2026
@abidlabs
Copy link
Copy Markdown
Member Author

Thanks for the proposal!!
One question: would it support rendering images?
possible use case: you're training with GRPO + an env (e.g., OpenEnv), and the env returns a list of images (e.g., a browser env returning screenshots). It'd be nice to render them inline with the messages

yep can do! We already support images in tables, so we should be able to do the same here

@abidlabs
Copy link
Copy Markdown
Member Author

Ok based on great feedback from everyone, have updated this PR.

Here's a basic example: python examples/traces/basic-trace.py

Screen.Recording.2026-04-20.at.2.00.32.PM.mov

(I've removed many of the earliers to make the UI less opinionated, thanks @adithya-s-k for the suggestion)

A more complex example including images and tool calls: python examples/traces/complex-trace.py

Screen.Recording.2026-04-20.at.2.03.29.PM.mov

cc @sergiopaniego @AmineDiro

And a potential example of how to use it with TRL: python examples/traces/trl-trace-integration.py (cc @qgallouedec)

Any other suggestions/improvements are welcome!

@abidlabs abidlabs marked this pull request as ready for review April 20, 2026 21:05
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds first-class “trace” logging and a UI for browsing conversational/agent traces in Trackio, integrating with the existing metrics/log storage and dashboard routing.

Changes:

  • Introduce Trace payload type that serializes nested Trackio media inside messages/metadata.
  • Add SQLiteStorage.get_traces() + server API /get_traces to extract/search/sort traces from metric logs.
  • Add a new Svelte “Traces” page and navigation wiring (dynamic + static modes).

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
trackio/trace.py New Trace object with nested media serialization.
trackio/run.py Logs Trace instances and recursively queues nested media uploads.
trackio/sqlite_storage.py Extracts trace records from metric logs; supports search/sort/limit/offset.
trackio/server.py Exposes get_traces via the server API registry.
trackio/frontend/src/pages/Traces.svelte New UI page to list/search/sort and expand trace conversations.
trackio/frontend/src/lib/api.js Adds getTraces() client wrapper (static + server modes).
trackio/frontend/src/lib/staticApi.js Implements static-mode trace extraction/search/sort from exported logs.
trackio/frontend/src/lib/router.js Adds /traces route mapping.
trackio/frontend/src/components/Navbar.svelte Adds “Traces” nav link.
trackio/frontend/src/App.svelte Renders the Traces page and includes it in sidebar-enabled pages.
trackio/init.py Exports Trace from the top-level package API.
tests/unit/test_trace.py Unit coverage for trace serialization + storage search/sort.
tests/e2e-local/test_trace_e2e.py E2E round-trip test for logging and reading traces.
examples/traces/basic-trace.py Example: minimal trace logging.
examples/traces/complex-trace.py Example: rich trace with tool calls + images.
examples/traces/trl-trace-integration.py Example: TRL callback logging traces during training.
.changeset/easy-apes-hammer.md Changeset marking a minor feature release.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +242 to +248
const trace = {
id: `${normalizeRun(run).id || normalizeRun(run).name || "run"}:${log.step}:${key}${traceIndex !== null ? `:${traceIndex}` : ""}`,
key,
index: traceIndex,
run: normalizeRun(run).name,
run_id: normalizeRun(run).id,
step: log.step,
Comment thread trackio/run.py Outdated
Comment on lines +837 to +841
elif isinstance(value, Trace):
metrics[key] = value._to_dict(
project=self.project, run=self.name, step=step
)
self._scan_and_queue_media_uploads(metrics[key], step)
Comment thread trackio/sqlite_storage.py
offset: int = 0,
run_id: str | None = None,
) -> list[dict[str, Any]]:
logs = SQLiteStorage.get_logs(project, run, max_points=None, run_id=run_id)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point — this is a real concern for very large runs, but filtering server-side is non-trivial because trace payloads are stored inline inside metric rows (no separate trace index), so SQLite has no cheap way to skip non-trace rows without a schema change. The input normalization is addressed in the follow-up commit; I'd like to defer the scan-reduction work to a dedicated PR that introduces a lightweight trace index table so pagination/sort can be pushed down to SQL.

Comment thread trackio/sqlite_storage.py
Comment on lines +1992 to +1996
if offset > 0:
traces = traces[offset:]
if limit is not None:
traces = traces[:limit]

Comment thread trackio/server.py
Comment on lines +824 to +841
def get_traces(
project: str,
run: str | None = None,
run_id: str | None = None,
search: str | None = None,
sort: str | None = None,
limit: int | None = None,
offset: int | None = 0,
) -> list[dict[str, Any]]:
return SQLiteStorage.get_traces(
project,
run,
search=search,
sort=sort,
limit=limit,
offset=offset or 0,
run_id=run_id,
)
Comment on lines +252 to +257
{#each visibleTraces as trace}
<tr class="trace-row" onclick={() => toggleTrace(trace.id)}>
<td class="trace-id-cell">
<span class="trace-id">{trace.id}</span>
</td>
<td class="request-cell">
Comment on lines +46 to +70
async function loadTraces() {
if (!project || selectedRuns.length === 0) {
traces = [];
expandedTraceId = null;
return;
}

loading = true;
try {
const batches = await Promise.all(
selectedRuns.map(async (run) => {
const runTraces = await getTraces(project, run);
return runTraces.map((trace) => normalizeTrace(trace, run.name));
}),
);
traces = batches.flat();
if (!traces.find((trace) => trace.id === expandedTraceId)) {
expandedTraceId = null;
}
} catch (error) {
console.error("Failed to load traces:", error);
traces = [];
} finally {
loading = false;
}
- Cache normalizeRun result in staticApi getTraces
- Normalize step (None -> _next_step) before queuing trace/table media
- Validate offset/limit/sort inputs in server.get_traces and storage
- Make trace rows keyboard-accessible (role/tabindex/keydown)
- Guard Traces.svelte loadTraces against stale responses via request id
Copy link
Copy Markdown
Collaborator

@znation znation left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good (I only skimmed, Claude reviewed more thoroughly). Left some optional comments for issues that Claude found.

Comment thread trackio/server.py
normalized_offset = max(0, int(offset)) if offset is not None else 0
except (TypeError, ValueError):
normalized_offset = 0
normalized_limit: int | None
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Double-sanitization of offset/limit

trackio/server.py:843-856 normalizes offset and limit, then passes them to trackio/sqlite_storage.py:1968-1974 which normalizes them again with identical
logic. One layer should own this.

Recommendation: Remove sanitization from sqlite_storage.py and let the API layer (server.py) be the sole validator. The storage layer can trust its internal
callers.

Comment thread trackio/run.py
self._queue_upload(absolute_path, step)
return
for nested in value.values():
self._scan_and_queue_media_uploads(nested, step)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recursive _scan_and_queue_media_uploads has no depth limit

trackio/run.py:767-786 — The refactored _scan_and_queue_media_uploads now recurses into arbitrary dicts/lists. A deeply nested trace payload (or even an
accidental circular reference via a custom dict) could blow the stack. The old version was bounded to exactly 2 levels of nesting (table rows → values →
list items).

Recommendation: Add a max_depth parameter (e.g., 10) and stop recursing beyond it. This matches the practical ceiling for trace messages.

Comment thread trackio/sqlite_storage.py
continue

trace_index = index if isinstance(value, list) else None
trace_id_parts = [run_id or run or "run", str(step), key]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trace ID collisions across runs

trackio/sqlite_storage.py:1934-1937 — Trace IDs are constructed as run_id_or_name:step:key[:index]. When run_id is None and run is None, the fallback is the
string "run". If two different runs are both queried with run=None, run_id=None, they'll produce identical trace IDs, causing collisions in the frontend (the
expand/collapse toggle uses trace.id).

The frontend in Traces.svelte:57-61 fetches traces for multiple selectedRuns, flattening them into one array. If two runs share a step number + key, the IDs
will collide.

Recommendation: Include the actual run name or run ID in the trace ID unconditionally (the caller always has it from the selectedRuns list), or generate a
unique ID (e.g., hash).

}
}

let visibleTraces = $derived.by(() => {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Client-side search duplicates server-side search

Traces.svelte:79-96 — visibleTraces does full client-side filtering/sorting on the loaded traces. But getTraces in api.js also passes search/sort options to
the server. Currently loadTraces() at line 59 calls getTraces(project, run) with no options — so the server-side search/sort/pagination is never used from
the UI. The toolbar controls only drive the client-side $derived block.

This means the server endpoint accepts search/sort/limit/offset parameters that the frontend never sends. The two code paths (server-side in
sqlite_storage.py and client-side in Traces.svelte) are duplicated logic that can drift.

Recommendation: Either remove the unused server-side filtering (YAGNI) or wire it up in the frontend and remove the client-side duplicate. Given Issue 1,
moving filtering server-side would also be the path to fixing the performance problem.

<p>Try a different search query or model filter.</p>
</div>
{:else}
<div class="toolbar">
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Toolbar duplication in Traces.svelte

Traces.svelte:175-189 and Traces.svelte:198-213 — The toolbar markup (search input, sort dropdown, count display) is duplicated verbatim in both the "no
matching traces" and "has traces" branches. If you change one, you'll need to change the other.

Recommendation: Extract the toolbar into its own {#snippet} or move it above the conditional so it renders once regardless of whether traces match.

@abidlabs
Copy link
Copy Markdown
Member Author

Thanks so much for the review @znation! Cleaned up the frontend based on your comments, will merge this in once CI is green

@abidlabs abidlabs enabled auto-merge (squash) April 22, 2026 17:10
@abidlabs abidlabs merged commit e7ed176 into main Apr 22, 2026
8 of 9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants