Commit b3a8eb5

Wauplin, claude, and hanouticelina authored
[CLI] Add hf spaces search command with semantic search (#4094)
* [CLI] Add `hf spaces search` command with semantic search

  Add `search_spaces()` method to `HfApi` and `hf spaces search <query>` CLI command that calls the Hub's semantic search API (`/api/spaces/semantic-search`).

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* add search_spaces first class citizen
* update SpaceSearchResult creation
* allow multiple sdks
* Update docs/source/en/guides/manage-spaces.md

  Co-authored-by: célina <hanouticelina@gmail.com>

* Update docs/source/en/guides/manage-spaces.md

  Co-authored-by: célina <hanouticelina@gmail.com>

* better examples
* beter examples

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: célina <hanouticelina@gmail.com>
1 parent d6f252c commit b3a8eb5

9 files changed

Lines changed: 272 additions & 1 deletion

docs/source/en/guides/manage-spaces.md

Lines changed: 24 additions & 0 deletions
@@ -8,6 +8,30 @@ In this guide, we will see how to manage your Space runtime
 ([secrets](https://huggingface.co/docs/hub/spaces-overview#managing-secrets),
 [hardware](https://huggingface.co/docs/hub/spaces-gpus), and volumes) using `huggingface_hub`.
 
+## Search for Spaces
+
+You can search for Spaces on the Hub using semantic search with [`search_spaces`]. This uses embedding-based search for multi-word queries and full-text search for single-word queries.
+
+```py
+>>> from huggingface_hub import search_spaces
+>>> results = list(search_spaces("generate image"))
+>>> results[0]
+SpaceSearchResult(id='mrfakename/Z-Image-Turbo', title='Z Image Turbo', sdk='gradio', likes=2867, ...)
+```
+
+You can filter results by SDK or tags:
+
+```py
+>>> results = search_spaces("chatbot", sdk="gradio", filter="mcp-server")
+```
+
+Or via CLI:
+
+```bash
+>>> hf spaces search "generate image"
+>>> hf spaces search "chatbot" --sdk gradio --limit 5
+```
+
 ## A simple example: configure secrets and hardware.
 
 Here is an end-to-end example to create and set up a Space on the Hub.
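
The `search_spaces` signature in this commit exposes `filter`, `sdk`, and `include_non_running`, but no sort option, so any re-ranking would have to happen on the client. The following is a minimal sketch of one way to do that, assuming results expose the `likes` and `semantic_relevancy_score` attributes documented on `SpaceSearchResult`. `FakeResult`, `top_by_likes`, and the sample values are hypothetical stand-ins so the snippet runs offline.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class FakeResult:
    """Hypothetical stand-in exposing a few SpaceSearchResult attributes."""

    id: str
    sdk: Optional[str]
    likes: int
    semantic_relevancy_score: Optional[float]


def top_by_likes(results: Iterable[FakeResult], min_score: float = 0.5, n: int = 3) -> list:
    """Keep results scoring at least min_score, then return the n most-liked."""
    relevant = [
        r
        for r in results
        if r.semantic_relevancy_score is not None and r.semantic_relevancy_score >= min_score
    ]
    return sorted(relevant, key=lambda r: r.likes, reverse=True)[:n]


# Made-up sample data, not real search output.
sample = [
    FakeResult("user/a", "gradio", 10, 0.9),
    FakeResult("user/b", "docker", 500, 0.3),
    FakeResult("user/c", "gradio", 42, 0.7),
    FakeResult("user/d", "static", 7, None),
]
print([r.id for r in top_by_likes(sample)])  # ['user/c', 'user/a']
```

With the real API, the same helper could presumably be fed `search_spaces("generate image")` directly, since it only relies on attribute access.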

docs/source/en/guides/repository.md

Lines changed: 13 additions & 0 deletions
@@ -121,6 +121,19 @@ RepoUrl('https://huggingface.co/spaces/nateraw/dreambooth-training',...)
 RepoUrl('https://huggingface.co/datasets/nateraw/gdpval',...)
 ```
 
+## Search for Spaces
+
+The Hub provides a semantic search API for discovering Spaces. You can search using natural language queries with [`search_spaces`]:
+
+```py
+>>> from huggingface_hub import search_spaces
+>>> results = list(search_spaces("generate image"))
+>>> results[0].id
+'mrfakename/Z-Image-Turbo'
+```
+
+For more details and filtering options, see the [Manage your Spaces](./manage-spaces#search-for-spaces) guide.
+
 ## Upload and download files
 
 Now that you have created your repository, you are interested in pushing changes to it and downloading files from it.
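
The documentation snippets wrap the call in `list(...)`. The `search_spaces` implementation in this commit contains a `yield`, which makes it a generator function, so its body (including the HTTP request) should only run once iteration starts. A minimal sketch of that behaviour, using a hypothetical `fake_search` function and a `calls` list in place of the real network request:

```python
from typing import Iterator

calls: list = []  # records when the simulated request actually happens


def fake_search(query: str) -> Iterator[str]:
    """Hypothetical stand-in for a generator-based search function."""
    calls.append(query)  # stands in for the HTTP request
    for space_id in ("user/space-a", "user/space-b"):
        yield space_id


gen = fake_search("generate image")
made_before = len(calls)  # 0: creating the generator runs none of the body
results = list(gen)       # iterating executes the body
made_after = len(calls)   # 1
print(made_before, made_after, results)
```

This is also why indexing requires materialising the results first: a generator does not support `results[0]`.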

docs/source/en/package_reference/cli.md

Lines changed: 36 additions & 0 deletions
@@ -3403,6 +3403,7 @@ $ hf spaces [OPTIONS] COMMAND [ARGS]...
 * `hot-reload`: Hot-reload any Python file of a Space...
 * `info`: Get info about a space on the Hub.
 * `list`: List spaces on the Hub. [alias: ls]
+* `search`: Search spaces on the Hub using semantic...
 
 ### `hf spaces dev-mode`
 
@@ -3546,6 +3547,41 @@ Learn more
 Read the documentation at https://huggingface.co/docs/huggingface_hub/en/guides/cli
 
 
+### `hf spaces search`
+
+Search spaces on the Hub using semantic search.
+
+**Usage**:
+
+```console
+$ hf spaces search [OPTIONS] QUERY
+```
+
+**Arguments**:
+
+* `QUERY`: Search query. [required]
+
+**Options**:
+
+* `--filter TEXT`: Filter by tags (e.g. 'text-classification'). Can be used multiple times.
+* `--sdk TEXT`: Filter by SDK (e.g. gradio, docker, static).
+* `--include-non-running / --no-include-non-running`: Include non-running spaces in results. [default: no-include-non-running]
+* `--description / --no-description`: Show AI-generated descriptions. [default: no-description]
+* `--limit INTEGER`: Limit the number of results. [default: 10]
+* `--format [agent|auto|human|json|quiet]`: Output format. [default: auto]
+* `--token TEXT`: A User Access Token generated from https://huggingface.co/settings/tokens.
+* `--help`: Show this message and exit.
+
+Examples
+$ hf spaces search "generate image"
+$ hf spaces search "identify objects in pictures" --sdk gradio --limit 5
+$ hf spaces search "remove background from photo" --description --json
+
+Learn more
+Use `hf <command> --help` for more information about a command.
+Read the documentation at https://huggingface.co/docs/huggingface_hub/en/guides/cli
+
+
 ## `hf sync`
 
 Sync files between local directory and a bucket.

docs/source/en/package_reference/hf_api.md

Lines changed: 4 additions & 0 deletions
@@ -133,6 +133,10 @@ models = hf_api.list_models()
 
 [[autodoc]] huggingface_hub.hf_api.SpaceInfo
 
+### SpaceSearchResult
+
+[[autodoc]] huggingface_hub._space_api.SpaceSearchResult
+
 ### TensorInfo
 
 [[autodoc]] huggingface_hub.utils.TensorInfo

src/huggingface_hub/__init__.py

Lines changed: 6 additions & 0 deletions
@@ -189,6 +189,7 @@
         "RepoFolder",
         "RepoUrl",
         "SpaceInfo",
+        "SpaceSearchResult",
         "User",
         "UserLikes",
         "WebhookInfo",
@@ -322,6 +323,7 @@
         "run_job",
         "run_uv_job",
         "scale_to_zero_inference_endpoint",
+        "search_spaces",
         "set_space_sleep_time",
         "set_space_volumes",
         "space_info",
@@ -794,6 +796,7 @@
     "SpaceHardware",
     "SpaceInfo",
     "SpaceRuntime",
+    "SpaceSearchResult",
     "SpaceStage",
     "SpaceStorage",
     "SpaceVariable",
@@ -1059,6 +1062,7 @@
     "save_torch_state_dict",
     "scale_to_zero_inference_endpoint",
     "scan_cache_dir",
+    "search_spaces",
     "set_async_client_factory",
     "set_client_factory",
     "set_space_sleep_time",
@@ -1319,6 +1323,7 @@ def __dir__():
         RepoFolder,  # noqa: F401
         RepoUrl,  # noqa: F401
         SpaceInfo,  # noqa: F401
+        SpaceSearchResult,  # noqa: F401
         User,  # noqa: F401
         UserLikes,  # noqa: F401
         WebhookInfo,  # noqa: F401
@@ -1452,6 +1457,7 @@ def __dir__():
         run_job,  # noqa: F401
         run_uv_job,  # noqa: F401
         scale_to_zero_inference_endpoint,  # noqa: F401
+        search_spaces,  # noqa: F401
         set_space_sleep_time,  # noqa: F401
         set_space_volumes,  # noqa: F401
         space_info,  # noqa: F401

src/huggingface_hub/_space_api.py

Lines changed: 66 additions & 0 deletions
@@ -249,3 +249,69 @@ def __init__(self, key: str, values: dict) -> None:
         self.description = values.get("description")
         updated_at = values.get("updatedAt")
         self.updated_at = parse_datetime(updated_at) if updated_at is not None else None
+
+
+@dataclass
+class SpaceSearchResult:
+    """A single result from the Spaces semantic search API.
+
+    Returned by [`HfApi.search_spaces`].
+
+    Attributes:
+        id (`str`):
+            ID of the Space (e.g. `"username/repo-name"`).
+        author (`str`):
+            Author of the Space.
+        title (`str`):
+            Display title of the Space.
+        emoji (`str` or `None`):
+            Emoji icon of the Space.
+        sdk (`str` or `None`):
+            SDK used by the Space (e.g. `"gradio"`, `"docker"`, `"static"`).
+        likes (`int`):
+            Number of likes.
+        private (`bool`):
+            Whether the Space is private.
+        tags (`list[str]` or `None`):
+            List of tags.
+        runtime ([`SpaceRuntime`] or `None`):
+            Runtime information (stage, hardware, etc.).
+        ai_short_description (`str` or `None`):
+            AI-generated short description.
+        ai_category (`str` or `None`):
+            AI-generated category (e.g. `"Image Generation"`).
+        semantic_relevancy_score (`float` or `None`):
+            Semantic relevancy score (0-1) relative to the search query.
+        trending_score (`int` or `None`):
+            Trending score.
+    """
+
+    id: str
+    author: str
+    title: str
+    emoji: str | None
+    sdk: str | None
+    likes: int
+    private: bool
+    tags: list[str] | None
+    runtime: SpaceRuntime | None
+    ai_short_description: str | None
+    ai_category: str | None
+    semantic_relevancy_score: float | None
+    trending_score: int | None
+
+    def __init__(self, data: dict) -> None:
+        runtime = data.get("runtime")
+        self.id = data["id"]
+        self.author = data.get("author", "")
+        self.title = data.get("title", "")
+        self.emoji = data.get("emoji")
+        self.sdk = data.get("sdk")
+        self.likes = data.get("likes", 0)
+        self.private = data.get("private", False)
+        self.tags = data.get("tags")
+        self.runtime = SpaceRuntime(runtime) if runtime else None
+        self.ai_short_description = data.get("ai_short_description")
+        self.ai_category = data.get("ai_category")
+        self.semantic_relevancy_score = data.get("semanticRelevancyScore")
+        self.trending_score = data.get("trendingScore")
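
The `__init__` added in this file maps a JSON payload onto attributes: `id` is read with `data["id"]` and is therefore required, other fields fall back to defaults via `.get`, and the two scores are read from camelCase keys. A minimal sketch of that parsing pattern, using a hypothetical trimmed-down `MiniResult` class and a made-up payload (not real API output):

```python
from typing import Optional


class MiniResult:
    """Hypothetical, trimmed-down analogue of the parsing pattern."""

    def __init__(self, data: dict) -> None:
        self.id: str = data["id"]                  # required: KeyError if absent
        self.title: str = data.get("title", "")    # optional, defaults to ""
        self.likes: int = data.get("likes", 0)     # optional, defaults to 0
        self.sdk: Optional[str] = data.get("sdk")  # optional, None if absent
        # camelCase key in the payload, snake_case attribute on the object
        self.semantic_relevancy_score: Optional[float] = data.get("semanticRelevancyScore")


payload = {"id": "user/demo", "likes": 3, "semanticRelevancyScore": 0.87}  # made-up data
result = MiniResult(payload)
print(result.id, repr(result.title), result.likes, result.sdk, result.semantic_relevancy_score)

missing = None
try:
    MiniResult({"title": "no id"})
except KeyError as err:
    missing = err.args[0]  # name of the key that was not found
print("missing key:", missing)
```

The asymmetry is a design choice worth noting: a payload without `id` fails loudly, while missing optional fields degrade to neutral defaults.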

src/huggingface_hub/cli/spaces.py

Lines changed: 47 additions & 0 deletions
@@ -26,6 +26,7 @@
 
 import enum
 import functools
+import itertools
 import os
 import shlex
 import shutil
@@ -144,6 +145,52 @@ def spaces_info(
     out.dict(info)
 
 
+@spaces_cli.command(
+    "search",
+    examples=[
+        'hf spaces search "generate image"',
+        'hf spaces search "identify objects in pictures" --sdk gradio --limit 5',
+        'hf spaces search "remove background from photo" --description --json',
+    ],
+)
+def spaces_search(
+    query: Annotated[str, typer.Argument(help="Search query.")],
+    filter: FilterOpt = None,
+    sdk: Annotated[list[str] | None, typer.Option(help="Filter by SDK (e.g. gradio, docker, static).")] = None,
+    include_non_running: Annotated[bool, typer.Option(help="Include non-running spaces in results.")] = False,
+    description: Annotated[bool, typer.Option(help="Show AI-generated descriptions.")] = False,
+    limit: LimitOpt = 10,
+    format: FormatWithAutoOpt = OutputFormatWithAuto.auto,
+    token: TokenOpt = None,
+) -> None:
+    """Search spaces on the Hub using semantic search."""
+    api = get_hf_api(token=token)
+    results = api.search_spaces(
+        query=query,
+        filter=filter,
+        sdk=sdk,
+        include_non_running=include_non_running,
+        token=token,
+    )
+    items = []
+    for r in itertools.islice(results, limit):
+        item: dict = {
+            "id": r.id,
+            "title": r.title,
+            "sdk": r.sdk,
+            "likes": r.likes,
+            "stage": r.runtime.stage if r.runtime else None,
+            "category": r.ai_category,
+            "score": round(r.semantic_relevancy_score, 2) if r.semantic_relevancy_score is not None else None,
+        }
+        if description:
+            item["description"] = r.ai_short_description
+        items.append(item)
+    out.table(items)
+    if not description:
+        out.hint("Use --description to show AI-generated descriptions.")
+
+
 @spaces_cli.command(
     "dev-mode",
     examples=[
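
In `spaces_search`, `limit` is not forwarded to `api.search_spaces`. Instead, `itertools.islice` truncates the result iterable on the client, and each score is rounded to two decimals unless it is `None`. A minimal sketch of that row-building step, where `fake_results` and `build_rows` are hypothetical names and the scores are made-up values:

```python
import itertools
from typing import Iterator, Optional


def fake_results() -> Iterator[dict]:
    """Hypothetical result stream; scores are made-up values."""
    yield {"id": "user/a", "score": 0.91234}
    yield {"id": "user/b", "score": None}
    yield {"id": "user/c", "score": 0.5}


def build_rows(results: Iterator[dict], limit: int) -> list:
    """Take at most `limit` results and round their scores for display."""
    rows = []
    for r in itertools.islice(results, limit):  # stops after `limit` items
        score: Optional[float] = r["score"]
        rows.append({"id": r["id"], "score": round(score, 2) if score is not None else None})
    return rows


rows = build_rows(fake_results(), limit=2)
print(rows)  # [{'id': 'user/a', 'score': 0.91}, {'id': 'user/b', 'score': None}]
```

Because `islice` works on any iterable, the same loop handles a stream shorter than `limit` without extra checks.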

src/huggingface_hub/hf_api.py

Lines changed: 61 additions & 1 deletion
@@ -72,7 +72,7 @@
 from ._eval_results import EvalResultEntry, parse_eval_result_entries
 from ._inference_endpoints import InferenceEndpoint, InferenceEndpointScalingMetric, InferenceEndpointType
 from ._jobs_api import JobHardware, JobInfo, JobSpec, ScheduledJobInfo, _create_job_spec
-from ._space_api import SpaceHardware, SpaceRuntime, SpaceStorage, SpaceVariable, Volume
+from ._space_api import SpaceHardware, SpaceRuntime, SpaceSearchResult, SpaceStorage, SpaceVariable, Volume
 from ._upload_large_folder import upload_large_folder_internal
 from .community import (
     Discussion,
@@ -2905,6 +2905,65 @@ def list_spaces(
                 item["siblings"] = None
             yield SpaceInfo(**item)
 
+    @validate_hf_hub_args
+    def search_spaces(
+        self,
+        query: str,
+        *,
+        filter: str | Iterable[str] | None = None,
+        sdk: str | list[str] | None = None,
+        include_non_running: bool = False,
+        token: bool | str | None = None,
+    ) -> Iterable[SpaceSearchResult]:
+        """Search Spaces on the Hub using semantic search.
+
+        This endpoint uses semantic search (embedding-based) for multi-word queries
+        and full-text search for single-word queries.
+
+        Args:
+            query (`str`):
+                The search query string.
+            filter (`str` or `Iterable[str]`, *optional*):
+                A string tag or list of tags to filter by.
+            sdk (`str` or `list[str]`, *optional*):
+                Filter by SDK (e.g. `"gradio"`, `"docker"`, `"static"`).
+            include_non_running (`bool`, *optional*):
+                Whether to include non-running Spaces in results. Defaults to `False`.
+            token (`bool` or `str`, *optional*):
+                A valid user access token (string). Defaults to the locally saved
+                token, which is the recommended method for authentication (see
+                https://huggingface.co/docs/huggingface_hub/quick-start#authentication).
+                To disable authentication, pass `False`.
+
+        Returns:
+            `Iterable[SpaceSearchResult]`: an iterable of [`SpaceSearchResult`] objects.
+
+        Example:
+        ```python
+        >>> from huggingface_hub import HfApi
+        >>> api = HfApi()
+        >>> results = list(api.search_spaces("generate image"))
+        >>> results[0].id
+        'mrfakename/Z-Image-Turbo'
+        >>> results[0].ai_category
+        'Image Generation'
+        ```
+        """
+        path = f"{self.endpoint}/api/spaces/semantic-search"
+        headers = self._build_hf_headers(token=token)
+        params: dict[str, Any] = {"q": query}
+        if filter is not None:
+            params["filter"] = filter
+        if sdk is not None:
+            params["sdk"] = sdk
+        if include_non_running:
+            params["includeNonRunning"] = True
+
+        r = get_session().get(path, headers=headers, params=params)
+        hf_raise_for_status(r)
+        for item in r.json():
+            yield SpaceSearchResult(item)
+
     @validate_hf_hub_args
     def unlike(
         self,
@@ -13581,6 +13640,7 @@ def get_local_safetensors_metadata(path: str | Path) -> SafetensorsRepoMetadata:
 get_dataset_leaderboard = api.get_dataset_leaderboard
 
 list_spaces = api.list_spaces
+search_spaces = api.search_spaces
 space_info = api.space_info
 
 kernel_info = api.kernel_info
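
`search_spaces` only adds a query parameter when the corresponding argument is set, and passes `sdk` through unchanged, whether it is a string or a list. The sketch below mirrors that conditional logic in a hypothetical standalone `build_params` helper. How list values end up on the wire depends on the HTTP client; the repeated-key encoding shown here is an assumption, illustrated with the standard library's `urlencode(..., doseq=True)` rather than with the client that `huggingface_hub` actually uses.

```python
from typing import Iterable, Union
from urllib.parse import urlencode


def build_params(
    query: str,
    filter: Union[str, Iterable[str], None] = None,
    sdk: Union[str, list, None] = None,
    include_non_running: bool = False,
) -> dict:
    """Mirror the conditional parameter building shown in the diff."""
    params: dict = {"q": query}
    if filter is not None:
        params["filter"] = filter
    if sdk is not None:
        params["sdk"] = sdk
    if include_non_running:
        params["includeNonRunning"] = True  # only sent when explicitly enabled
    return params


params = build_params("chatbot", sdk=["gradio", "docker"])
# Assumption: list values are serialised as repeated keys.
encoded = urlencode(params, doseq=True)
print(encoded)  # q=chatbot&sdk=gradio&sdk=docker
```

Note that `includeNonRunning` is omitted entirely when the flag is `False`, so the endpoint's own default applies in that case.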
