Add support for numpy array and other types to gr.Dataframe() initial value #2804

Merged
abidlabs merged 10 commits into main from pd-orjson on Dec 13, 2022

abidlabs (Member) commented Dec 13, 2022

As I was writing the BigQuery guide, I noticed that we would run into an error if we passed in a pd.DataFrame that contained numpy arrays or any non-builtin Python types as the value of a gr.DataFrame. In other words, this would error out:

import pandas as pd
import numpy as np
import gradio as gr

df = pd.DataFrame(
    {
        "date_1": pd.date_range("2021-01-01", periods=2),
        "date_2": pd.date_range("2022-02-15", periods=2).strftime("%B %d, %Y, %r"),
        "number": np.array([0.2233, 0.57281]),
        "number_2": np.array([84, 23]).astype(np.int64),
        "bool": [True, False],
        "markdown": ["# Hello", "# Goodbye"],
    }
)

with gr.Blocks() as demo:
    gr.DataFrame(df)
demo.launch()

Interestingly, it would not error out if we returned such a pandas DataFrame from a function. The reason is that we were serializing data in different ways before sending it to the frontend, depending on whether it was part of the config or not. This PR fixes that by using the same method (orjson.dumps) to serialize both. It should make gr.DataFrame a lot more robust to different types. I also added a test and fixed some other tests that were incorrectly marked as async.
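To illustrate the idea of routing both paths through one serializer, here is a minimal standard-library sketch. It is not Gradio's actual code: `json.dumps` with a `default` hook stands in for `orjson.dumps`, and the names `to_builtin` and `serialize` are hypothetical, chosen only for this example.

```python
import datetime
import json


def to_builtin(obj):
    """Fallback hook: convert values the plain stdlib encoder rejects."""
    if isinstance(obj, (datetime.datetime, datetime.date)):
        return obj.isoformat()
    if hasattr(obj, "tolist"):  # numpy arrays and scalars expose tolist()
        return obj.tolist()
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


def serialize(payload):
    """Hypothetical single serializer shared by both code paths."""
    return json.dumps(payload, default=to_builtin)


row = {"date_1": datetime.datetime(2021, 1, 1), "bool": True}

# A plain encoder (stand-in for the stricter of the two old paths)
# rejects the datetime value.
try:
    json.dumps(row)
    plain_ok = True
except TypeError:
    plain_ok = False

# With one shared serializer, the initial value in the config and a
# value returned from a function are encoded identically.
config_json = serialize({"value": row})
output_json = serialize({"data": row})
```

Because both payloads go through the same function, a type that works as a function output also works as an initial value, which is the property the fix is after.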

Closes: #2658

@abidlabs abidlabs changed the base branch from main to bigquery-guide December 13, 2022 06:51
github-actions (Contributor) commented

All the demos for this PR have been deployed at https://huggingface.co/spaces/gradio-pr-deploys/pr-2804-all-demos

freddyaboulton (Collaborator) left a comment

Looks good @abidlabs! Thanks for the fix. This fixes #2658, no? I would add pd.DataFrame to the value type hint for dataframe since that's missing right now.

I wonder if we can simplify the datatype argument for 4.0. It supports "datetime", which you'd think would take care of dates, but the argument doesn't do anything except trigger markdown conversion when the datatype is markdown. Just a random thought while reviewing.

abidlabs (Member, Author) commented Dec 13, 2022

Thanks for the review @freddyaboulton! I had missed that issue, so I will add that in. I'll change the base branch to main so that we can merge this in.

@abidlabs abidlabs changed the base branch from bigquery-guide to main December 13, 2022 22:51
@abidlabs abidlabs merged commit c126e62 into main Dec 13, 2022
@abidlabs abidlabs deleted the pd-orjson branch December 13, 2022 23:01

Development

Successfully merging this pull request may close these issues.

Allow gr.DataFrame() to take all pandas dataframe types