# Open WebUI BOLA: `search_knowledge_files` Builtin Tool Knowledge Base Access Without Authorization Check

Contributors: huangweigang

## 1. Impact Scope

Open WebUI (latest)

`https://github.com/open-webui/open-webui`

## 2. Vulnerable Endpoint

Builtin tool function `search_knowledge_files(knowledge_id=...)`, invoked through the LLM's native function calling when `params.function_calling` is set to `"native"` and the model has no attached knowledge (`__model_knowledge__` is empty).

## 3. Code Analysis

File: `backend/open_webui/tools/builtin.py`

Function signature:

```python
async def search_knowledge_files(
    query: str,
    knowledge_id: Optional[str] = None,
    count: int = 5,
    skip: int = 0,
    __request__: Request = None,
    __user__: dict = None,
    __model_knowledge__: Optional[list[dict]] = None,
) -> str:
```

Key code (lines 1690–1698, "No attached knowledge" branch):

```python
# No attached knowledge - search all accessible KBs
if knowledge_id:
    result = await Knowledges.search_files_by_id(
        knowledge_id=knowledge_id,
        user_id=user_id,
        filter={'query': query},
        skip=skip,
        limit=count,
    )
```

Contrast with the secure implementation in the same file — `query_knowledge_files` (lines 2155–2170):

```python
elif knowledge_ids:
    for knowledge_id in knowledge_ids:
        knowledge = await Knowledges.get_knowledge_by_id(knowledge_id)
        if knowledge and (
            user_role == 'admin'
            or knowledge.user_id == user_id
            or await AccessGrants.has_access(
                user_id=user_id,
                resource_type='knowledge',
                resource_id=knowledge.id,
                permission='read',
                user_group_ids=set(user_group_ids),
            )
        ):
            collection_names.append(knowledge_id)
```

And the secure implementation in the same function's "has attached knowledge" branch (lines 1640–1651):

```python
if not (
    user_role == 'admin'
    or knowledge.user_id == user_id
    or await AccessGrants.has_access(
        user_id=user_id,
        resource_type='knowledge',
        resource_id=knowledge.id,
        permission='read',
        user_group_ids=set(user_group_ids),
    )
):
    continue
```

Problem points:

- The "No attached knowledge" branch at L1691–1698 calls `Knowledges.search_files_by_id(knowledge_id=knowledge_id, ...)` directly, without verifying that the current user has read access to the specified knowledge base.
- `Knowledges.search_files_by_id` (in `backend/open_webui/models/knowledge.py` L451–526) accepts `user_id` but uses it only for `view_option` filtering (L484–486), **not for authorization**. It queries all files belonging to the given `knowledge_id` regardless of the caller's permissions.
- The same function's "has attached knowledge" branch (L1640–1651) correctly performs `AccessGrants.has_access()` checks before accessing each KB.
- The sibling function `query_knowledge_files` (L2155–2170) also correctly validates access before using user-specified `knowledge_ids`.
- As a result, this branch of `search_knowledge_files` is the only code path that accesses a knowledge base by ID without authorization, which is a Broken Object Level Authorization (BOLA) vulnerability.
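The pattern described in the second point can be illustrated with a small in-memory model. This is only a sketch and not Open WebUI's code: `FileRecord`, `FILES`, and `search_files_unchecked` are hypothetical stand-ins, and the `"created"` view option is an assumed simplification of what `view_option` filtering does.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileRecord:
    id: str
    filename: str
    knowledge_id: str
    user_id: str  # uploader of the file


# Hypothetical data: one private knowledge base whose files belong to "alice".
FILES = [
    FileRecord("f1", "Q3-financial-report.pdf", "kb-private", "alice"),
    FileRecord("f2", "employee-salaries.xlsx", "kb-private", "alice"),
]


def search_files_unchecked(
    knowledge_id: str, user_id: str, view_option: Optional[str] = None
) -> list[str]:
    """Stand-in for a lookup that accepts user_id but never authorizes with it."""
    rows = [f for f in FILES if f.knowledge_id == knowledge_id]
    # user_id only narrows the result when a view option is requested ...
    if view_option == "created":
        rows = [f for f in rows if f.user_id == user_id]
    # ... so without a view option, any caller sees every file in the KB.
    return [f.filename for f in rows]


# "mallory" has no relationship to kb-private but still gets the listing.
leaked = search_files_unchecked("kb-private", user_id="mallory")
print(leaked)
```

Passing `user_id` makes the call look access-aware at the call site, which is why the missing check is easy to overlook in review.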

## 4. Reproduction

### Prerequisites

- Attacker has a valid authenticated session on Open WebUI
- The target model has no attached knowledge bases (i.e., `model.info.meta.knowledge` is empty or absent), so that the "No attached knowledge" branch is entered
- The model's `builtin_tools` capability is enabled (default: `true`)
- The `knowledge` builtin tool category is enabled in the system configuration
- Attacker knows the `knowledge_id` (UUID) of a target knowledge base they do not have access to (for example, retained from before their access was revoked, leaked through another vulnerability, or seen in shared context)

### Steps (Unauthorized Knowledge Base File Listing)

1. Send a chat completion request with `params.function_calling` set to `"native"` and a prompt instructing the LLM to call `search_knowledge_files` with the target `knowledge_id`:

```json
{
  "stream": true,
  "model": "gpt-4o-mini",
  "params": {
    "function_calling": "native"
  },
  "messages": [
    {
      "role": "user",
      "content": "Please use the search_knowledge_files tool with knowledge_id \"c0c84752-2e9d-42bf-bc3c-c0f272aa61c1\" to search all files"
    }
  ]
}
```

2. The server processes the request through the following chain:
   - `middleware.py:L2837` — `params.function_calling == "native"` is true → `get_builtin_tools()` is called
   - `utils/tools.py:L455-466` — model has no attached knowledge → `search_knowledge_files` is registered as an available builtin tool
   - LLM generates a tool call: `search_knowledge_files(knowledge_id="c0c84752-2e9d-42bf-bc3c-c0f272aa61c1", query="")`
   - `middleware.py:L1382` — `tool_result = await tool_function(**tool_function_params)` invokes the function
   - `builtin.py:L1691-1698` — "No attached knowledge" branch is entered, `Knowledges.search_files_by_id()` is called **without authorization check**

3. Observation: The function returns the file list of the target knowledge base, including:
   - File ID (`id`)
   - File name (`filename`)
   - Knowledge base ID (`knowledge_id`)
   - Knowledge base name (`knowledge_name`)
   - Update timestamp (`updated_at`)

4. Verification:
   - Use the knowledge base owner's account to confirm the returned files belong to their private KB
   - Check `AccessGrants` table to confirm the attacker has no `read` permission on the target knowledge base
   - Compare with `query_knowledge_files` which correctly denies access to the same `knowledge_id`
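To repeat step 1 against several knowledge base IDs, the request body can be generated rather than hand-edited. The sketch below only builds the JSON body shown in step 1. It does not send anything, because this report does not state the endpoint URL or authentication header. `build_repro_payload` is a hypothetical helper name.

```python
import json


def build_repro_payload(knowledge_id: str, model: str = "gpt-4o-mini") -> dict:
    """Build the chat completion body from step 1 for a given knowledge_id."""
    prompt = (
        "Please use the search_knowledge_files tool with "
        f'knowledge_id "{knowledge_id}" to search all files'
    )
    return {
        "stream": True,
        "model": model,
        # Native function calling is what exposes the builtin tools to the LLM.
        "params": {"function_calling": "native"},
        "messages": [{"role": "user", "content": prompt}],
    }


payload = build_repro_payload("c0c84752-2e9d-42bf-bc3c-c0f272aa61c1")
print(json.dumps(payload, indent=2))
```

Because the tool call is produced by the LLM, the model may not always follow the prompt. A run that returns no tool call is inconclusive rather than evidence that access was denied.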

## 5. Impact

- **Unauthorized knowledge base metadata disclosure**
  Attackers can enumerate file names, IDs, and metadata of knowledge bases they have no access to, even if the KB is private or restricted to specific user groups.

- **Information leakage enabling further attacks**
  File names may reveal sensitive business information (e.g., "Q3-financial-report.pdf", "employee-salaries.xlsx"). File IDs can be used as inputs to other tools (e.g., `view_knowledge_file`) to attempt content extraction.

- **Broken access control consistency**
  The vulnerability undermines the `AccessGrants` permission model. All other knowledge-base access paths (the "has attached knowledge" branch in the same function, `query_knowledge_files`, the `/api/v1/knowledge/` REST endpoints, and the `filter_accessible_collections` function in RAG retrieval) correctly enforce authorization. This single missing check leaves one unguarded path around an otherwise consistently enforced model, so a permission decision made anywhere else can be bypassed here.

- **Post-revocation access**
  If a user's access to a knowledge base is revoked, they can still enumerate its files via `search_knowledge_files` if they remember the `knowledge_id`, bypassing the intended access revocation.

## 6. Remediation

- **Add authorization check in the "No attached knowledge" branch**

  Validate the user's access to the specified `knowledge_id` before calling `search_files_by_id`, consistent with the pattern used in `query_knowledge_files` (L2155–2170) and the "has attached knowledge" branch (L1640–1651):

```python
# No attached knowledge - search all accessible KBs
if knowledge_id:
    knowledge = await Knowledges.get_knowledge_by_id(knowledge_id)
    if not knowledge or not (
        user_role == 'admin'
        or knowledge.user_id == user_id
        or await AccessGrants.has_access(
            user_id=user_id,
            resource_type='knowledge',
            resource_id=knowledge.id,
            permission='read',
            user_group_ids=set(user_group_ids),
        )
    ):
        return json.dumps({'error': f'Access denied to knowledge base {knowledge_id}'})
    result = await Knowledges.search_files_by_id(
        knowledge_id=knowledge_id,
        user_id=user_id,
        filter={'query': query},
        skip=skip,
        limit=count,
    )
```

- **Add authorization check in `Knowledges.search_files_by_id` model method**

  As defense in depth, add an optional access verification in the data layer (`backend/open_webui/models/knowledge.py` L451) so that even if a caller forgets to check, the model method itself enforces access control.
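One possible shape for such a data-layer check is sketched below. It is an in-memory illustration under stated assumptions, not the real model code: `KnowledgeStore`, the `readers` set (standing in for `AccessGrants` rows), and the `user_role` and `enforce_access` parameters are all hypothetical.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Knowledge:
    id: str
    user_id: str  # owner
    readers: set[str] = field(default_factory=set)  # stand-in for AccessGrants
    files: list[str] = field(default_factory=list)


class KnowledgeStore:
    """Hypothetical in-memory stand-in for the Knowledges model class."""

    def __init__(self, items: list[Knowledge]):
        self._items = {k.id: k for k in items}

    async def _can_read(self, kb: Knowledge, user_id: str, user_role: str) -> bool:
        # Same three conditions as the checks quoted in the Code Analysis section.
        return user_role == "admin" or kb.user_id == user_id or user_id in kb.readers

    async def search_files_by_id(
        self, knowledge_id: str, user_id: str,
        user_role: str = "user", enforce_access: bool = True,
    ) -> list[str]:
        kb = self._items.get(knowledge_id)
        # Treat "missing" and "denied" alike so callers cannot probe for valid IDs.
        if enforce_access and (
            kb is None or not await self._can_read(kb, user_id, user_role)
        ):
            raise PermissionError(f"no read access to knowledge base {knowledge_id}")
        return list(kb.files) if kb else []


async def demo():
    store = KnowledgeStore([Knowledge("kb-1", "alice", files=["notes.md"])])
    owner_files = await store.search_files_by_id("kb-1", "alice")
    try:
        await store.search_files_by_id("kb-1", "mallory")
        denied = False
    except PermissionError:
        denied = True
    return owner_files, denied


owner_files, denied = asyncio.run(demo())
print(owner_files, denied)
```

Defaulting `enforce_access` to `True` means a caller has to opt out explicitly, so a forgotten check fails closed instead of open.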

- **Audit all builtin tool functions for similar BOLA patterns**

  Systematically review all builtin tools in `backend/open_webui/tools/builtin.py` that accept resource IDs as parameters to ensure consistent authorization checks, particularly:
  - `view_file` (L1733)
  - `view_knowledge_file` (L1849)
  - `search_chats` (L1928)
  - `view_chat` (L1980)
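As a starting point for that audit, candidate functions can be listed mechanically by inspecting signatures. The sketch below flags any function with a parameter named `id` or ending in `_id` / `_ids`. The helper name and the two demo tools are hypothetical, and the output is only a review checklist: it cannot tell whether a flagged function already performs an authorization check.

```python
import inspect
from types import SimpleNamespace


def find_id_taking_functions(namespace) -> dict[str, list[str]]:
    """Map each function in `namespace` to its ID-like parameter names."""
    flagged = {}
    for name, fn in inspect.getmembers(namespace, inspect.isfunction):
        params = [
            p for p in inspect.signature(fn).parameters
            if p == "id" or p.endswith("_id") or p.endswith("_ids")
        ]
        if params:
            flagged[name] = params
    return flagged


# Hypothetical tools standing in for a real module's contents.
async def tool_a(resource_id: str, __user__: dict = None) -> str: ...
async def tool_b(query: str, count: int = 5) -> str: ...


demo_module = SimpleNamespace(tool_a=tool_a, tool_b=tool_b)
report = find_id_taking_functions(demo_module)
print(report)
```

Passing the imported builtin tools module instead of `demo_module` should yield the list to review by hand, assuming the module can be imported in the audit environment. Functions imported into that module from elsewhere would be listed as well.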
