Merged
Changes from 8 commits
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -4,7 +4,7 @@
No changes to highlight.

## Bug Fixes:
No changes to highlight.
* Allows `gr.Dataframe()` to take a `pandas.DataFrame` that includes numpy array and other types as its initial value, by [@abidlabs](https://github.com/abidlabs) in [PR 2804](https://github.com/gradio-app/gradio/pull/2804)

## Documentation Changes:
No changes to highlight.
2 changes: 1 addition & 1 deletion gradio/components.py
@@ -2455,7 +2455,7 @@ def postprocess(
"""
if y is None:
return self.postprocess(self.test_input)
if isinstance(y, Dict):
if isinstance(y, dict):
return y
if isinstance(y, str):
y = pd.read_csv(y)
20 changes: 19 additions & 1 deletion gradio/routes.py
@@ -31,6 +31,7 @@
from fastapi.security import OAuth2PasswordRequestForm
from fastapi.templating import Jinja2Templates
from jinja2.exceptions import TemplateNotFound
from jinja2.utils import htmlsafe_json_dumps
from starlette.responses import RedirectResponse
from starlette.websockets import WebSocketState

@@ -55,11 +56,28 @@
class ORJSONResponse(JSONResponse):
    media_type = "application/json"

    @staticmethod
    def _render(content: Any) -> bytes:
        return orjson.dumps(
            content,
            option=orjson.OPT_SERIALIZE_NUMPY | orjson.OPT_PASSTHROUGH_DATETIME,
            default=str,
        )

    def render(self, content: Any) -> bytes:
        return orjson.dumps(content, option=orjson.OPT_SERIALIZE_NUMPY)
        return ORJSONResponse._render(content)

    @staticmethod
    def _render_str(content: Any) -> str:
        return ORJSONResponse._render(content).decode("utf-8")


def toorjson(value):
    return htmlsafe_json_dumps(value, dumps=ORJSONResponse._render_str)


templates = Jinja2Templates(directory=STATIC_TEMPLATE_LIB)
templates.env.filters["toorjson"] = toorjson


###########
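
The `toorjson` filter above combines two ideas: a serializer that falls back to `str` for types it cannot encode, and HTML-safe escaping so the JSON can sit inside a `<script>` tag. The sketch below illustrates both using only the standard library `json` module in place of `orjson` and Jinja2; the function names `render_str` and `html_safe_json` are hypothetical and are not part of this PR.

```python
import json
from datetime import datetime


def render_str(content):
    """Serialize to a JSON string, falling back to str() for unsupported types."""
    # Stand-in for orjson.dumps(..., default=str); stdlib json is used here
    # purely for illustration.
    return json.dumps(content, default=str)


def html_safe_json(value):
    """Escape characters that could break out of a <script> tag or HTML attribute."""
    return (
        render_str(value)
        .replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
        .replace("'", "\\u0027")
    )


# Hypothetical config containing a datetime and a string that would
# otherwise terminate the surrounding <script> element.
config = {"title": "</script>", "updated": datetime(2022, 12, 14)}
print(html_safe_json(config))
```

The escaped output is still valid JSON: a parser decodes `\u003c` back to `<`, so the browser sees the original values while the HTML parser never sees a literal closing tag.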
102 changes: 102 additions & 0 deletions guides/05_tabular_data_science_and_plots/using_gradio_with_bigquery.md
@@ -0,0 +1,102 @@
# Using Gradio with Google BigQuery

Google BigQuery is a cloud-based big data analytics web service for processing very large data sets. It is a serverless and highly scalable data warehousing solution that enables users to analyze data using SQL-like queries.

In this tutorial, we will show you how to query a BigQuery dataset in Python and display the data in *real time* in a dashboard using `gradio`.

We'll be working with the New York Times' COVID dataset, which is available on BigQuery. The dataset, named `covid19_nyt.us_counties`, contains the latest information about the number of confirmed cases and deaths from COVID across US counties.

**Prerequisites**: This Guide uses [Gradio Blocks](../01_getting_started/01_quickstart.md#blocks-more-flexibility-and-control), so please familiarize yourself with the Blocks class.

## Setting up your BigQuery Credentials

To use Gradio with BigQuery, you will need to obtain your BigQuery credentials and use them with the BigQuery Python client. If you already have BigQuery credentials, you can skip this section. If not, you can obtain them for free in a few steps:

1. First, log in to your Google Cloud account and go to the Google Cloud Console (https://console.cloud.google.com/).

1. In the Cloud Console, click on the hamburger menu in the top-left corner and select "APIs & Services" from the menu. If you do not have an existing project, you will need to create one.

1. Then, click the "+ Enabled APIs & services" button, which allows you to enable specific services for your project. Search for "BigQuery API", click on it, and click the "Enable" button. If you see the "Manage" button instead, then BigQuery is already enabled, and you're all set.

1. In the APIs & Services menu, click on the "Credentials" tab and then click on the "Create credentials" button.

1. In the "Create credentials" dialog, select "Service account key" as the type of credentials to create, and give it a name. Also grant the service account permissions by giving it a role such as "BigQuery User", which will allow you to run queries.

1. After selecting the service account, select the "JSON" key type and then click on the "Create" button. This will download the JSON key file containing your credentials to your computer. It will look something like this:

```json
{
"type": "service_account",
"project_id": "your project",
"private_key_id": "your private key id",
"private_key": "private key",
"client_email": "email",
"client_id": "client id",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://accounts.google.com/o/oauth2/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/email_id"
}
```
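
Before handing the key file to the BigQuery client, it can help to confirm that the download is complete. The following is a minimal sketch using only the standard library; the helper name `missing_key_fields` is hypothetical, and the list of expected fields is an assumption taken from the example file above rather than an official schema.

```python
import json

# Field names copied from the example key file above; this list is an
# assumption for illustration, not an official schema.
EXPECTED_FIELDS = ("type", "project_id", "private_key", "client_email", "token_uri")


def missing_key_fields(path):
    """Return the expected fields that are absent from the JSON key file."""
    with open(path, encoding="utf-8") as f:
        key = json.load(f)
    return [field for field in EXPECTED_FIELDS if field not in key]
```

An empty list means every expected field is present; anything else suggests the file was truncated or is not a service account key.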

## Using the BigQuery Client

Once you have the credentials, you can use them to authenticate with the BigQuery Python client. First, install the client by running the following command in the terminal:

```
pip install "google-cloud-bigquery[pandas]"
```

You'll notice that we've installed the pandas add-on, which will be helpful for processing the BigQuery dataset as a pandas dataframe. Once the client is installed, you can authenticate using your credentials by running the following code:

```py
from google.cloud import bigquery

client = bigquery.Client.from_service_account_json("path/to/key.json")
```

With your credentials authenticated, you can now use the BigQuery Python client to interact with your BigQuery datasets.

Here is an example of a function which queries the `covid19_nyt.us_counties` dataset in BigQuery to show the top 20 counties with the most confirmed cases as of the current day:

```py
QUERY = (
    'SELECT * FROM `bigquery-public-data.covid19_nyt.us_counties` '
    'ORDER BY date DESC,confirmed_cases DESC '
    'LIMIT 20')


def run_query():
    # Submit the query, wait for it to finish, and load the rows into pandas
    query_job = client.query(QUERY)
    query_result = query_job.result()
    df = query_result.to_dataframe()
    return df
```
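
To see why the query above returns the counties with the most cases on the latest date, here is a small sketch that mimics `ORDER BY date DESC, confirmed_cases DESC LIMIT n` in plain Python. The rows are made-up sample values, not real figures from the dataset, and `top_rows` is a hypothetical helper.

```python
# Made-up sample rows for illustration only; not real figures from the dataset.
rows = [
    {"date": "2022-12-13", "county": "A", "confirmed_cases": 500},
    {"date": "2022-12-14", "county": "A", "confirmed_cases": 510},
    {"date": "2022-12-14", "county": "B", "confirmed_cases": 900},
    {"date": "2022-12-13", "county": "B", "confirmed_cases": 880},
]


def top_rows(rows, limit):
    """Mimic ORDER BY date DESC, confirmed_cases DESC LIMIT limit."""
    # ISO-formatted date strings sort chronologically, so sorting the
    # (date, cases) tuples in reverse puts the newest, largest rows first.
    ordered = sorted(
        rows, key=lambda r: (r["date"], r["confirmed_cases"]), reverse=True
    )
    return ordered[:limit]


latest = top_rows(rows, 2)
print([(r["date"], r["county"]) for r in latest])
# [('2022-12-14', 'B'), ('2022-12-14', 'A')]
```

Note that the limit only yields a single day's ranking when the most recent date has at least that many rows; otherwise rows from earlier dates fill the remainder.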

## Building a Dashboard for Real-Time Data Analysis

Once you have a function to query the data, you can use the `gr.DataFrame` component from the Gradio library to display the results in a tabular format. This is a useful way to inspect the data and make sure that it has been queried correctly.

Here is an example of how to use the `gr.DataFrame` component to display the results. By passing the `run_query` function to `gr.DataFrame`, we instruct Gradio to run the function as soon as the page loads and show the results. The `every` keyword argument tells the dashboard to rerun the function and refresh every hour (60*60 seconds).

```py
import gradio as gr

with gr.Blocks() as demo:
    df = gr.DataFrame(run_query, every=60*60)

demo.queue().launch()  # Run the demo using queuing
```

Perhaps you'd like to add a visualization to the dashboard. You can use the `gr.ScatterPlot()` component to visualize the data in a scatter plot. This lets you see the relationship between variables in the dataset, such as confirmed cases and deaths, and can be useful for exploring the data and gaining insights.

Here is a complete example showing how to use `gr.ScatterPlot` to visualize the data alongside the `gr.DataFrame` table:

```py
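import gradio as gr

with gr.Blocks() as demo:
    with gr.Row():
        # Table of the latest query results, refreshed every hour
        gr.DataFrame(run_query, every=60*60)
        # Scatter plot of deaths against confirmed cases; the column names
        # are assumed to match the dataset's schema
        gr.ScatterPlot(run_query, every=60*60, x="confirmed_cases", y="deaths")

demo.queue().launch()  # Run the demo using queuing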
```
32 changes: 11 additions & 21 deletions test/test_components.py
@@ -551,8 +551,7 @@ async def test_in_interface(self):


class TestImage:
@pytest.mark.asyncio
async def test_component_functions(self):
def test_component_functions(self):
"""
Preprocess, postprocess, serialize, generate_sample, get_config, _segment_by_slic
type: pil, file, filepath, numpy
@@ -618,8 +617,7 @@ async def test_component_functions(self):
image_output = gr.Image(type="numpy")
assert image_output.postprocess(y_img).startswith("data:image/png;base64,")

@pytest.mark.asyncio
async def test_in_interface_as_input(self):
def test_in_interface_as_input(self):
"""
Interface, process, interpret
type: file
@@ -638,8 +636,7 @@ async def test_in_interface_as_input(self):
lambda x: np.sum(x), image_input, "number", interpretation="default"
)

@pytest.mark.asyncio
async def test_in_interface_as_output(self):
def test_in_interface_as_output(self):
"""
Interface, process
"""
@@ -789,8 +786,7 @@ def test_tokenize(self):
similarity = SequenceMatcher(a=x_wav["data"], b=x_new).ratio()
assert similarity > 0.9

@pytest.mark.asyncio
async def test_in_interface(self):
def test_in_interface(self):
def reverse_audio(audio):
sr, data = audio
return (sr, np.flipud(data))
@@ -806,8 +802,7 @@ def reverse_audio(audio):
).ratio()
assert similarity > 0.99

@pytest.mark.asyncio
async def test_in_interface_as_output(self):
def test_in_interface_as_output(self):
"""
Interface, process
"""
@@ -1007,7 +1002,7 @@ def test_dataframe_postprocess_all_types(self):
"%B %d, %Y, %r"
),
"number": np.array([0.2233, 0.57281]),
"number_2": np.array([84, 23]).astype(np.int),
"number_2": np.array([84, 23]).astype(np.int64),
"bool": [True, False],
"markdown": ["# Hello", "# Goodbye"],
}
@@ -1167,8 +1162,7 @@ def test_component_functions(self):
}
).endswith(".mp4")

@pytest.mark.asyncio
async def test_in_interface(self):
def test_in_interface(self):
"""
Interface, process
"""
@@ -1396,8 +1390,7 @@ def test_color_argument(self):
)
assert update_5["color"] == "transparent"

@pytest.mark.asyncio
async def test_in_interface(self):
def test_in_interface(self):
"""
Interface, process
"""
@@ -1640,8 +1633,7 @@ def test_component_functions(self):
"root_url": None,
} == html_component.get_config()

@pytest.mark.asyncio
async def test_in_interface(self):
def test_in_interface(self):
"""
Interface, process
"""
@@ -1660,8 +1652,7 @@ def test_component_functions(self):
"""<h1>Let\'s learn about <span class="math inline"><span style=\'font-size: 0px\'>x</span><svg xmlns:xlink="http://www.w3.org/1999/xlink" width="11.6pt" height="19.35625pt" viewBox="0 0 11.6 19.35625" xmlns="http://www.w3.org/2000/svg" version="1.1">\n \n <defs>\n <style type="text/css">*{stroke-linejoin: round; stroke-linecap: butt}</style>\n </defs>\n <g id="figure_1">\n <g id="patch_1">\n <path d="M 0 19.35625"""
)

@pytest.mark.asyncio
async def test_in_interface(self):
def test_in_interface(self):
"""
Interface, process
"""
@@ -1693,8 +1684,7 @@ def test_component_functions(self):
"style": {},
} == component.get_config()

@pytest.mark.asyncio
async def test_in_interface(self):
def test_in_interface(self):
"""
Interface, process
"""
23 changes: 23 additions & 0 deletions test/test_routes.py
@@ -4,6 +4,8 @@
import sys
from unittest.mock import patch

import numpy as np
import pandas as pd
import pytest
import starlette.routing
import websockets
@@ -398,3 +400,24 @@ def test_show_api_queue_not_enabled():
io.close()
io.launch(prevent_thread_lock=True, show_api=False)
assert not io.show_api


def test_orjson_serialization():
    df = pd.DataFrame(
        {
            "date_1": pd.date_range("2021-01-01", periods=2),
            "date_2": pd.date_range("2022-02-15", periods=2).strftime("%B %d, %Y, %r"),
            "number": np.array([0.2233, 0.57281]),
            "number_2": np.array([84, 23]).astype(np.int64),
            "bool": [True, False],
            "markdown": ["# Hello", "# Goodbye"],
        }
    )

    with gr.Blocks() as demo:
        gr.DataFrame(df)
    app, _, _ = demo.launch(prevent_thread_lock=True)
    test_client = TestClient(app)
    response = test_client.get("/")
    assert response.status_code == 200
    demo.close()
2 changes: 1 addition & 1 deletion ui/packages/app/build_plugins.ts
@@ -11,7 +11,7 @@ export function inject_ejs(): Plugin {
transformIndexHtml: (html) => {
return html.replace(
/%gradio_config%/,
`<script>window.gradio_config = {{ config | tojson }};</script>`
`<script>window.gradio_config = {{ config | toorjson }};</script>`
);
}
};