Conversation
Avoid duplicate column names when flattening JSON fields for parquet export and use concat to prevent DataFrame fragmentation during export. Made-with: Cursor
🦄 change detected. This Pull Request includes changes to the following packages.
🪼 branch checks and previews
Install Trackio from this PR (includes built frontend): `pip install "https://huggingface.co/buckets/trackio/trackio-wheels/resolve/363b7d90358d308c3dea7d03245949a1a5cfa8a7/trackio-0.21.2-py3-none-any.whl"`
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Pull request overview
Fixes Parquet export failures caused by duplicate columns when flattening JSON payloads (e.g., JSON keys colliding with structural table columns), and reduces DataFrame fragmentation in the flattening path.
Changes:
- Drop flattened JSON keys that would duplicate existing DataFrame columns before export.
- Use `pd.concat(..., axis=1)` to append expanded columns in one operation (avoids per-column assignment fragmentation).
- Add a Changesets entry to release/version the change.
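The first two bullets can be sketched as a small, self-contained function. This is a hypothetical stand-in for trackio's `_flatten_json_column()`; the real signature, column names, and surrounding logic may differ:

```python
import json

import pandas as pd


def flatten_json_column(df: pd.DataFrame, col: str) -> pd.DataFrame:
    """Hypothetical sketch of the flattening fix described above."""
    expanded = pd.json_normalize(df[col].map(json.loads).tolist())
    expanded.index = df.index
    # Drop flattened keys that collide with existing structural columns,
    # so e.g. a JSON key "run_name" cannot duplicate the table's run_name.
    expanded = expanded.loc[:, ~expanded.columns.isin(df.columns)]
    # Append all expanded columns in one concat instead of per-column
    # assignment, which fragments the DataFrame's internal blocks.
    return pd.concat([df, expanded], axis=1)
```

With unique columns guaranteed, a subsequent `to_parquet()` call no longer hits duplicate-column errors.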
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `trackio/sqlite_storage.py` | Updates `_flatten_json_column()` to filter duplicate keys and concatenate expanded columns efficiently before Parquet export. |
| `.changeset/funny-files-burn.md` | Adds a Changesets release note/version bump entry for the fix. |
```diff
-        df[c] = expanded[c]
-    return df
+    expanded = expanded.loc[:, ~expanded.columns.isin(df.columns)]
+    return pd.concat([df, expanded], axis=1)
```
The new duplicate-column drop behavior isn't covered by existing tests. Please add or extend a unit test that exercises `_flatten_json_column()` when the JSON payload contains a key that collides with an existing structural column (e.g., `run_name`), and assert that (1) the structural column is not overwritten and (2) the resulting DataFrame has unique columns so that `to_parquet()` succeeds.
```diff
-    return pd.concat([df, expanded], axis=1)
+    flattened = pd.concat([df, expanded], axis=1)
+    return flattened.loc[:, ~flattened.columns.duplicated()]
```
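The suggestion deduplicates after the concat rather than filtering before it. `Index.duplicated()` keeps the first occurrence by default, so the structural column, which comes first in the concat, survives either way. A quick illustration with invented values:

```python
import pandas as pd

df = pd.DataFrame([["run-a", 0.5]], columns=["run_name", "loss"])
extra = pd.DataFrame([["sneaky"]], columns=["run_name"])  # colliding key

# Concatenating produces duplicate labels: run_name, loss, run_name.
merged = pd.concat([df, extra], axis=1)

# duplicated() marks later occurrences, so the mask keeps the first one.
deduped = merged.loc[:, ~merged.columns.duplicated()]
print(list(deduped.columns))           # ['run_name', 'loss']
print(deduped["run_name"].iloc[0])     # 'run-a'
```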
```md
---
"trackio": minor
---

feat:Fix duplicate columns in parquet export
```
This changeset reads like a bug fix (the PR title/description says "Fix duplicate columns …"), but it's marked as a `minor` release and categorized as `feat:`. That will both bump the version more than necessary and place the entry under "Features" in the changelog (see the `^(feat|fix|highlight)` parsing in `.changeset/changeset.cjs`). Consider changing the bump to `patch` and the summary prefix to `fix:` (with a space after the colon).
znation left a comment:
Looks good, thanks for fixing!
Using `pd.concat(..., axis=1)` to avoid DataFrame fragmentation in the flattening path also silences the pandas `PerformanceWarning` about a highly fragmented DataFrame that otherwise appears in the terminal.
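A minimal sketch contrasting the two append styles (column names and counts invented). Per-column insertion is what can trigger pandas' fragmentation `PerformanceWarning` once enough blocks accumulate, while a single concat builds the result in one operation:

```python
import warnings

import pandas as pd
from pandas.errors import PerformanceWarning

base = pd.DataFrame({"step": range(8)})
expanded = pd.DataFrame({f"metric_{i}": range(8) for i in range(200)})

# Per-column assignment inserts one block at a time, fragmenting the frame;
# pandas may warn once the frame becomes highly fragmented.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    frag = base.copy()
    for c in expanded.columns:
        frag[c] = expanded[c]
per_column_warned = any(
    issubclass(w.category, PerformanceWarning) for w in caught
)

# A single concat appends all columns at once and stays warning-free.
merged = pd.concat([base, expanded], axis=1)

# Both approaches produce the same data; only the block layout differs.
assert frag.equals(merged)
```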