Open WebUI BOLA: search_knowledge_files Builtin Tool Knowledge Base Access Without Authorization Check #41

@Hwwg

Contributors: huangweigang

1. Impact Scope

Open WebUI (latest)

https://github.com/open-webui/open-webui

2. Vulnerable Endpoint

Builtin tool function: search_knowledge_files(knowledge_id=...) — invoked via LLM native function calling when params.function_calling is set to "native" and the model has no attached knowledge (__model_knowledge__ is empty).

3. Code Analysis

File: backend/open_webui/tools/builtin.py

Function signature:

async def search_knowledge_files(
    query: str,
    knowledge_id: Optional[str] = None,
    count: int = 5,
    skip: int = 0,
    __request__: Request = None,
    __user__: dict = None,
    __model_knowledge__: Optional[list[dict]] = None,
) -> str:

Key code (lines 1690–1698, "No attached knowledge" branch):

# No attached knowledge - search all accessible KBs
if knowledge_id:
    result = await Knowledges.search_files_by_id(
        knowledge_id=knowledge_id,
        user_id=user_id,
        filter={'query': query},
        skip=skip,
        limit=count,
    )

Contrast with the secure implementation in the same file — query_knowledge_files (lines 2155–2170):

elif knowledge_ids:
    for knowledge_id in knowledge_ids:
        knowledge = await Knowledges.get_knowledge_by_id(knowledge_id)
        if knowledge and (
            user_role == 'admin'
            or knowledge.user_id == user_id
            or await AccessGrants.has_access(
                user_id=user_id,
                resource_type='knowledge',
                resource_id=knowledge.id,
                permission='read',
                user_group_ids=set(user_group_ids),
            )
        ):
            collection_names.append(knowledge_id)

And the secure implementation in the same function's "has attached knowledge" branch (lines 1640–1651):

if not (
    user_role == 'admin'
    or knowledge.user_id == user_id
    or await AccessGrants.has_access(
        user_id=user_id,
        resource_type='knowledge',
        resource_id=knowledge.id,
        permission='read',
        user_group_ids=set(user_group_ids),
    )
):
    continue

Problem points:

  • The "No attached knowledge" branch at L1691–1698 directly calls Knowledges.search_files_by_id(knowledge_id=knowledge_id, ...) without verifying whether the current user has read access to the specified knowledge base
  • Knowledges.search_files_by_id (in backend/open_webui/models/knowledge.py L451–526) accepts user_id but uses it only for view_option filtering (L484–486), not for authorization. It returns every file belonging to the given knowledge_id regardless of the caller's permissions
  • The same function's "has attached knowledge" branch (L1640–1651) correctly performs AccessGrants.has_access() checks before accessing each KB
  • The sibling function query_knowledge_files (L2155–2170) also correctly validates access before using user-specified knowledge_ids
  • As a result, the knowledge_id path in search_knowledge_files is the only code path that accesses a knowledge base by ID without an authorization check, which is a Broken Object Level Authorization (BOLA) vulnerability
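
The difference between filtering by user_id and authorizing by user_id can be illustrated with a small self-contained model. The sketch below is not Open WebUI code: FakeKnowledgeStore, its in-memory data, and the "created" value for view_option are hypothetical stand-ins, chosen only to show how a method can accept user_id without ever using it to decide access.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeKnowledgeStore:
    """Hypothetical in-memory stand-in for the Knowledges model."""

    # knowledge_id -> owner user_id
    owners: dict = field(default_factory=dict)
    # knowledge_id -> list of file dicts
    files: dict = field(default_factory=dict)

    async def search_files_by_id(self, knowledge_id, user_id, filter=None):
        # user_id only narrows the view when a view option asks for it;
        # it never decides whether the caller may read this knowledge base.
        rows = self.files.get(knowledge_id, [])
        if (filter or {}).get("view_option") == "created":
            rows = [r for r in rows if r["user_id"] == user_id]
        return rows


async def demo():
    store = FakeKnowledgeStore(
        owners={"kb-private": "alice"},
        files={
            "kb-private": [
                {"id": "f1", "filename": "report.pdf", "user_id": "alice"}
            ]
        },
    )
    # "mallory" neither owns kb-private nor holds a grant, yet rows come back.
    return await store.search_files_by_id("kb-private", user_id="mallory")


leaked = asyncio.run(demo())
print([f["filename"] for f in leaked])
```

In this toy model the owners mapping is never consulted, which mirrors the report's point: the caller, not the data layer, has to perform the ownership and grant check.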

4. Reproduction

-- Prerequisites

  • Attacker has a valid authenticated session on Open WebUI
  • The target model has no attached knowledge bases (i.e., model.info.meta.knowledge is empty or absent), so that the "No attached knowledge" branch is entered
  • The model's builtin_tools capability is enabled (default: true)
  • The knowledge builtin tool category is enabled in the system configuration
  • Attacker knows the knowledge_id (UUID) of a target knowledge base they do not have access to (obtainable, for example, from access they held before it was revoked, information leakage from other vulnerabilities, or shared context)

-- Steps (Unauthorized Knowledge Base File Listing)

  1. Send a chat completion request with params.function_calling set to "native" and a prompt instructing the LLM to call search_knowledge_files with the target knowledge_id:
{
  "stream": true,
  "model": "gpt-4o-mini",
  "params": {
    "function_calling": "native"
  },
  "messages": [
    {
      "role": "user",
      "content": "Please use the search_knowledge_files tool with knowledge_id \"c0c84752-2e9d-42bf-bc3c-c0f272aa61c1\" to search all files"
    }
  ]
}
  2. The server processes the request through the following chain:

    • middleware.py:L2837 - params.function_calling == "native" is true → get_builtin_tools() is called
    • utils/tools.py:L455-466 - model has no attached knowledge → search_knowledge_files is registered as an available builtin tool
    • LLM generates a tool call: search_knowledge_files(knowledge_id="c0c84752-2e9d-42bf-bc3c-c0f272aa61c1", query="")
    • middleware.py:L1382 - tool_result = await tool_function(**tool_function_params) invokes the function
    • builtin.py:L1691-1698 - "No attached knowledge" branch is entered, Knowledges.search_files_by_id() is called without an authorization check
  3. Observation: The function returns the file list of the target knowledge base, including:

    • File ID (id)
    • File name (filename)
    • Knowledge base ID (knowledge_id)
    • Knowledge base name (knowledge_name)
    • Update timestamp (updated_at)
  4. Verification:

    • Use the knowledge base owner's account to confirm the returned files belong to their private KB
    • Check the AccessGrants table to confirm the attacker has no read permission on the target knowledge base
    • Compare with query_knowledge_files, which correctly denies access to the same knowledge_id
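
The observation step can be checked mechanically. The following is a rough sketch that assumes the tool result is a JSON string holding a list of file entries with the field names listed above; the exact response shape, the leaked_entries helper, and the sample data are assumptions made for illustration and are not taken from the Open WebUI source.

```python
import json

# Field names as listed in the observation step.
EXPECTED_FIELDS = {"id", "filename", "knowledge_id", "knowledge_name", "updated_at"}


def leaked_entries(tool_result: str, target_knowledge_id: str) -> list:
    """Return entries in a tool result that belong to the target knowledge base.

    Assumes the result is a JSON list of objects; anything else (for example
    an error object or non-JSON text) is treated as "nothing leaked".
    """
    try:
        data = json.loads(tool_result)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, list):
        return []
    return [
        entry
        for entry in data
        if isinstance(entry, dict)
        and EXPECTED_FIELDS.issubset(entry)
        and entry["knowledge_id"] == target_knowledge_id
    ]


# Hypothetical sample result, shaped after the fields in the observation step.
target = "c0c84752-2e9d-42bf-bc3c-c0f272aa61c1"
sample = json.dumps(
    [
        {
            "id": "file-1",
            "filename": "example.pdf",
            "knowledge_id": target,
            "knowledge_name": "Example KB",
            "updated_at": 1700000000,
        }
    ]
)
print(len(leaked_entries(sample, target)))
```

A non-empty return value for an account with no grant on the target knowledge base would indicate the behavior described in this report; an error object or an empty list would indicate that access was denied.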

5. Impact

  • Unauthorized knowledge base metadata disclosure
    Attackers can enumerate file names, IDs, and metadata of knowledge bases they have no access to, even if the KB is private or restricted to specific user groups.

  • Information leakage enabling further attacks
    File names may reveal sensitive business information (e.g., "Q3-financial-report.pdf", "employee-salaries.xlsx"). File IDs can be used as inputs to other tools (e.g., view_knowledge_file) to attempt content extraction.

  • Broken access control consistency
    The vulnerability undermines the AccessGrants permission model. All other knowledge-base access paths (the "has attached knowledge" branch in the same function, query_knowledge_files, the /api/v1/knowledge/ REST endpoints, and the filter_accessible_collections function in RAG retrieval) correctly enforce authorization. This single missing check creates an inconsistency that violates the principle of defense in depth.

  • Post-revocation access
    If a user's access to a knowledge base is revoked, they can still enumerate its files via search_knowledge_files if they remember the knowledge_id, bypassing the intended access revocation.

6. Remediation

  • Add an authorization check in the "No attached knowledge" branch

    Validate the user's access to the specified knowledge_id before calling search_files_by_id, consistent with the pattern used in query_knowledge_files (L2155–2170) and the "has attached knowledge" branch (L1640–1651):

# No attached knowledge - search all accessible KBs
if knowledge_id:
    knowledge = await Knowledges.get_knowledge_by_id(knowledge_id)
    if not knowledge or not (
        user_role == 'admin'
        or knowledge.user_id == user_id
        or await AccessGrants.has_access(
            user_id=user_id,
            resource_type='knowledge',
            resource_id=knowledge.id,
            permission='read',
            user_group_ids=set(user_group_ids),
        )
    ):
        return json.dumps({'error': f'Access denied to knowledge base {knowledge_id}'})
    result = await Knowledges.search_files_by_id(
        knowledge_id=knowledge_id,
        user_id=user_id,
        filter={'query': query},
        skip=skip,
        limit=count,
    )
  • Add an authorization check in the Knowledges.search_files_by_id model method

    As defense in depth, add an optional access verification in the data layer (backend/open_webui/models/knowledge.py L451) so that even if a caller forgets to check, the model method itself enforces access control.

  • Audit all builtin tool functions for similar BOLA patterns

    Systematically review all builtin tools in backend/open_webui/tools/builtin.py that accept resource IDs as parameters to ensure consistent authorization checks, particularly:

    • view_file (L1733)
    • view_knowledge_file (L1849)
    • search_chats (L1928)
    • view_chat (L1980)
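
One way to keep these checks consistent across tools is to factor the three-part condition (admin, owner, or read grant) into a single helper and call it from every path that accepts a knowledge base ID. The sketch below is illustrative only: can_read_knowledge, FakeGrants, and the Knowledge dataclass are hypothetical names backed by in-memory stand-ins, and the real AccessGrants.has_access behavior should be taken from the codebase rather than from this example.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Knowledge:
    id: str
    user_id: str  # owner of the knowledge base


class FakeGrants:
    """Hypothetical stand-in for AccessGrants, backed by a set of tuples."""

    def __init__(self, grants):
        self._grants = set(grants)  # {(user_id, resource_id, permission)}

    async def has_access(
        self, user_id, resource_type, resource_id, permission, user_group_ids=None
    ):
        return (user_id, resource_id, permission) in self._grants


async def can_read_knowledge(
    knowledge: Optional[Knowledge],
    user_id: str,
    user_role: str,
    grants: FakeGrants,
    user_group_ids=(),
) -> bool:
    """Admin, owner, or explicit read grant; a missing KB is treated as denied."""
    if knowledge is None:
        return False
    if user_role == "admin" or knowledge.user_id == user_id:
        return True
    return await grants.has_access(
        user_id=user_id,
        resource_type="knowledge",
        resource_id=knowledge.id,
        permission="read",
        user_group_ids=set(user_group_ids),
    )


kb = Knowledge(id="kb-1", user_id="alice")
grants = FakeGrants({("bob", "kb-1", "read")})

results = {
    name: asyncio.run(can_read_knowledge(kb, name, role, grants))
    for name, role in [
        ("alice", "user"),
        ("bob", "user"),
        ("mallory", "user"),
        ("root", "admin"),
    ]
}
print(results)
```

Treating a missing knowledge base the same as an unauthorized one, as the proposed patch above also does, avoids confirming to the caller which IDs exist.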
