Respect Upstream Queue when loading interfaces/blocks from Spaces #2294
Conversation
All the demos for this PR have been deployed at https://huggingface.co/spaces/gradio-pr-deploys/pr-2294-all-demos
Thanks @freddyaboulton, this looks great! Tested both with the downstream app enabling the queue explicitly and with the downstream app not enabling queuing, like this:

```python
import gradio as gr

io = gr.Interface.load("spaces/freddyaboulton/saymyname")
print(io("foo"))
io.launch()
```

In both cases, the upstream queue is respected. Also tested with some upstream apps that don't have a queue. The one wrinkle that we should address is that in a Blocks demo, the upstream app may enable queuing for some functions but not all. This can happen by enabling the queue by default and then disabling it for specific functions, or vice versa. The current implementation only looks at the default queuing value in the upstream app. Instead, when we iterate through the dependencies, it would be good to check whether queuing is enabled for that specific function.
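A rough sketch of what that per-dependency check could look like; the field names (`enable_queue` at the config root, `queue` on each dependency) are assumptions about the upstream config format, not something quoted from this PR:

```python
# Sketch only: decide per dependency whether the upstream Space queues that
# function, falling back to the app-level default when the dependency does
# not override it. Field names are assumptions, not taken from this PR.
def upstream_uses_queue(dependency: dict, config: dict) -> bool:
    app_default = bool(config.get("enable_queue", False))
    per_fn = dependency.get("queue")  # expected to be True, False, or None
    return app_default if per_fn is None else bool(per_fn)
```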
I believe older versions of Gradio did not include the version in the config, so it would be good to check for that first so that a KeyError is not thrown
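For illustration, a minimal sketch of such a guard, assuming the config is a plain dict; the "3.2" cutoff is hypothetical and only stands in for whatever version actually matters here:

```python
from packaging import version

def upstream_supports_queue_config(config: dict) -> bool:
    # Sketch only: older Gradio versions may not write a "version" key into
    # the config, so read it defensively instead of indexing directly.
    upstream_version = config.get("version")  # None for older configs
    if upstream_version is None:
        return False
    # "3.2" is a hypothetical cutoff used purely for illustration.
    return version.parse(upstream_version) >= version.parse("3.2")
```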
Good point!
I'm confused, as there's an AsyncMock in the previous tests, which aren't skipped?
The problem is that websockets.connect returns an async context manager, and mock doesn't really work with async context managers in Python 3.7. I can make the skip reason a bit more precise.
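For context, a rough sketch of how this can be mocked on newer Python versions; the message payload and the code-under-test placeholder are made up for illustration:

```python
from unittest import mock

# Sketch only (Python 3.8+): websockets.connect is used as
# `async with websockets.connect(url) as ws:`, so the patch has to return
# something that behaves like an async context manager.
mock_ws = mock.AsyncMock()
mock_ws.recv.return_value = '{"msg": "send_data"}'  # made-up payload

connect_cm = mock.MagicMock()
connect_cm.__aenter__ = mock.AsyncMock(return_value=mock_ws)
connect_cm.__aexit__ = mock.AsyncMock(return_value=None)

with mock.patch("websockets.connect", return_value=connect_cm):
    pass  # call the code under test that opens the websocket here
```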
Got it, thanks for clarifying! Good to know.
Good stuff! Left some more comments, mostly nits. The main thing is to respect the upstream queue per function rather than for the entire app, as I mentioned above.
Force-pushed from e23941c to a9831c5
Thanks for the review @abidlabs and good catch about honoring the queue per function. Made that change - this should be good for another look!
Tested and LGTM @freddyaboulton! This is awesome.
Force-pushed from 49bc4f2 to f6b8756
@freddyaboulton this looks pretty good! I think adding some tests for the queue, using a similar approach, would be great.
Description
Closes: #1316
The approach taken is to open up a websocket connection to the space for each prediction request sent to the loaded space. The main limitation of this approach is that the loaded app doesn't display information about the request's position in the original app's queue. However, there are a couple of reasons why I went with this approach:
gr.Interface.load().launch workflows but not more complex uses of gr.Interface.load.

How to test:
Launch the following app:
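The original snippet is not reproduced above; a minimal sketch of an app that should exercise the same path, assuming the same Space as in the review comment:

```python
import gradio as gr

# Sketch only: load the Space mentioned above and serve it locally so that
# predictions are forwarded through the upstream queue.
io = gr.Interface.load("spaces/freddyaboulton/saymyname")
io.launch()
```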
Then go to this space: https://huggingface.co/spaces/freddyaboulton/saymyname
Launch three simultaneous requests, but make sure the first two are on the HF space. On the app running locally, it should take around 15 seconds to complete the request.
Checklist: