Open WebUI BOLA: search_knowledge_files Builtin Tool Knowledge Base Access Without Authorization Check
Contributors: huangweigang
1. Impact Scope
Open WebUI (latest)
https://github.com/open-webui/open-webui
2. Vulnerable Endpoint
Builtin tool function: search_knowledge_files(knowledge_id=...) — invoked via LLM native function calling when params.function_calling is set to "native" and the model has no attached knowledge (__model_knowledge__ is empty).
3. Code Analysis
File: backend/open_webui/tools/builtin.py
Function & route:
async def search_knowledge_files(
    query: str,
    knowledge_id: Optional[str] = None,
    count: int = 5,
    skip: int = 0,
    __request__: Request = None,
    __user__: dict = None,
    __model_knowledge__: Optional[list[dict]] = None,
) -> str:
Key code (lines 1690–1698, "No attached knowledge" branch):
# No attached knowledge - search all accessible KBs
if knowledge_id:
    result = await Knowledges.search_files_by_id(
        knowledge_id=knowledge_id,
        user_id=user_id,
        filter={'query': query},
        skip=skip,
        limit=count,
    )
Contrast with the secure implementation in the same file — query_knowledge_files (lines 2155–2170):
elif knowledge_ids:
    for knowledge_id in knowledge_ids:
        knowledge = await Knowledges.get_knowledge_by_id(knowledge_id)
        if knowledge and (
            user_role == 'admin'
            or knowledge.user_id == user_id
            or await AccessGrants.has_access(
                user_id=user_id,
                resource_type='knowledge',
                resource_id=knowledge.id,
                permission='read',
                user_group_ids=set(user_group_ids),
            )
        ):
            collection_names.append(knowledge_id)
And the secure implementation in the same function's "has attached knowledge" branch (lines 1640–1651):
if not (
    user_role == 'admin'
    or knowledge.user_id == user_id
    or await AccessGrants.has_access(
        user_id=user_id,
        resource_type='knowledge',
        resource_id=knowledge.id,
        permission='read',
        user_group_ids=set(user_group_ids),
    )
):
    continue
Problem points:
- The "No attached knowledge" branch at L1691–1698 directly calls Knowledges.search_files_by_id(knowledge_id=knowledge_id, ...) without verifying whether the current user has read access to the specified knowledge base.
- Knowledges.search_files_by_id (in backend/open_webui/models/knowledge.py L451–526) accepts user_id but only uses it for view_option filtering (L484–486), not for authorization; it queries all files belonging to the given knowledge_id regardless of the caller's permissions.
- The same function's "has attached knowledge" branch (L1640–1651) correctly performs AccessGrants.has_access() checks before accessing each KB.
- The sibling function query_knowledge_files (L2155–2170) also correctly validates access before using user-specified knowledge_ids.
- This inconsistency means search_knowledge_files is the only code path that accesses a knowledge base by ID without authorization, creating a BOLA vulnerability.
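The contrast described in the problem points can be illustrated with a small, self-contained toy model. This is only a sketch: KnowledgeBase, KBS, can_read, search_files_unchecked and search_files_checked are hypothetical stand-ins invented for illustration, and the readers set is a simplified replacement for the AccessGrants lookup. None of it is Open WebUI code.

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    """Toy stand-in for a knowledge base row (not the real Open WebUI model)."""
    id: str
    user_id: str                                    # owner of the knowledge base
    files: list = field(default_factory=list)
    readers: set = field(default_factory=set)       # toy replacement for AccessGrants


# Hypothetical in-memory data: one private KB and one KB shared with "bob".
KBS = {
    "kb-private": KnowledgeBase("kb-private", "alice", ["Q3-financial-report.pdf"]),
    "kb-shared": KnowledgeBase("kb-shared", "alice", ["handbook.pdf"], {"bob"}),
}


def can_read(kb: KnowledgeBase, user_id: str, user_role: str) -> bool:
    """Same three-way rule as the secure branches: admin, owner, or read grant."""
    return user_role == "admin" or kb.user_id == user_id or user_id in kb.readers


def search_files_unchecked(knowledge_id: str, user_id: str, user_role: str) -> list:
    """Mirrors the vulnerable branch: the caller identity is accepted but never used to authorize."""
    kb = KBS.get(knowledge_id)
    return list(kb.files) if kb else []


def search_files_checked(knowledge_id: str, user_id: str, user_role: str) -> list:
    """Mirrors the secure branches: resolve the KB first, then gate on can_read()."""
    kb = KBS.get(knowledge_id)
    if kb is None or not can_read(kb, user_id, user_role):
        return []
    return list(kb.files)
```

In this model, a non-owner without a grant gets the private file list from search_files_unchecked and an empty list from search_files_checked, which is the behavioral gap the report describes.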
4. Reproduction
-- Prerequisites
- Attacker has a valid authenticated session on Open WebUI
- The target model has no attached knowledge bases (i.e., model.info.meta.knowledge is empty or absent), so that the "No attached knowledge" branch is entered
- The model's builtin_tools capability is enabled (default: true)
- The knowledge builtin tool category is enabled in the system configuration
- Attacker knows the knowledge_id (UUID) of a target knowledge base they do not have access to (obtainable through prior access revocation, information leakage from other vulnerabilities, or shared context)
-- Steps (Unauthorized Knowledge Base File Listing)
- Send a chat completion request with params.function_calling set to "native" and a prompt instructing the LLM to call search_knowledge_files with the target knowledge_id:
{
  "stream": true,
  "model": "gpt-4o-mini",
  "params": {
    "function_calling": "native"
  },
  "messages": [
    {
      "role": "user",
      "content": "Please use the search_knowledge_files tool with knowledge_id \"c0c84752-2e9d-42bf-bc3c-c0f272aa61c1\" to search all files"
    }
  ]
}
- The server processes the request through the following chain:
  - middleware.py:L2837: params.function_calling == "native" is true → get_builtin_tools() is called
  - utils/tools.py:L455-466: model has no attached knowledge → search_knowledge_files is registered as an available builtin tool
  - LLM generates a tool call: search_knowledge_files(knowledge_id="c0c84752-2e9d-42bf-bc3c-c0f272aa61c1", query="")
  - middleware.py:L1382: tool_result = await tool_function(**tool_function_params) invokes the function
  - builtin.py:L1691-1698: the "No attached knowledge" branch is entered and Knowledges.search_files_by_id() is called without an authorization check
- Observation: The function returns the file list of the target knowledge base, including:
  - File ID (id)
  - File name (filename)
  - Knowledge base ID (knowledge_id)
  - Knowledge base name (knowledge_name)
  - Update timestamp (updated_at)
- Verification:
  - Use the knowledge base owner's account to confirm the returned files belong to their private KB
  - Check the AccessGrants table to confirm the attacker has no read permission on the target knowledge base
  - Compare with query_knowledge_files, which correctly denies access to the same knowledge_id
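For repeatable testing against an instance you are authorized to assess, the request from the steps above can be assembled programmatically. The following is a minimal sketch using only the Python standard library. The route /api/chat/completions and the Bearer token header are assumptions that are not stated in this report, and build_repro_request and CHAT_PATH are hypothetical names; adjust them to the deployment under test.

```python
import json
import urllib.request

# Assumption: the chat completion route and Bearer authentication are not
# confirmed by this report; change both to match the deployment under test.
CHAT_PATH = "/api/chat/completions"


def build_repro_request(
    base_url: str, token: str, model: str, knowledge_id: str
) -> urllib.request.Request:
    """Build (but do not send) the chat completion request from the reproduction steps."""
    prompt = (
        "Please use the search_knowledge_files tool with knowledge_id "
        f'"{knowledge_id}" to search all files'
    )
    payload = {
        "stream": True,
        "model": model,
        # Native function calling is what makes the builtin tools available.
        "params": {"function_calling": "native"},
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        base_url.rstrip("/") + CHAT_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to urllib.request.urlopen() to send it. Whether the model actually emits the tool call depends on the model and prompt, so the response stream should still be inspected by hand.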
5. Impact
- Unauthorized knowledge base metadata disclosure
  Attackers can enumerate file names, IDs, and metadata of knowledge bases they have no access to, even if the KB is private or restricted to specific user groups.
- Information leakage enabling further attacks
  File names may reveal sensitive business information (e.g., "Q3-financial-report.pdf", "employee-salaries.xlsx"). File IDs can be used as inputs to other tools (e.g., view_knowledge_file) to attempt content extraction.
- Broken access control consistency
  The vulnerability undermines the AccessGrants permission model. All other knowledge-base access paths (the "has attached knowledge" branch in the same function, query_knowledge_files, the /api/v1/knowledge/ REST endpoints, and the filter_accessible_collections function in RAG retrieval) correctly enforce authorization. This single missing check creates an inconsistency that violates the principle of defense in depth.
- Post-revocation access
  If a user's access to a knowledge base is revoked, they can still enumerate its files via search_knowledge_files if they remember the knowledge_id, bypassing the intended access revocation.
6. Remediation
- Add authorization check in the "No attached knowledge" branch
  Validate the user's access to the specified knowledge_id before calling search_files_by_id, consistent with the pattern used in query_knowledge_files (L2155–2170) and the "has attached knowledge" branch (L1640–1651):
# No attached knowledge - search all accessible KBs
if knowledge_id:
    knowledge = await Knowledges.get_knowledge_by_id(knowledge_id)
    if not knowledge or not (
        user_role == 'admin'
        or knowledge.user_id == user_id
        or await AccessGrants.has_access(
            user_id=user_id,
            resource_type='knowledge',
            resource_id=knowledge.id,
            permission='read',
            user_group_ids=set(user_group_ids),
        )
    ):
        return json.dumps({'error': f'Access denied to knowledge base {knowledge_id}'})
    result = await Knowledges.search_files_by_id(
        knowledge_id=knowledge_id,
        user_id=user_id,
        filter={'query': query},
        skip=skip,
        limit=count,
    )
- Add authorization check in Knowledges.search_files_by_id model method
  As defense in depth, add an optional access verification in the data layer (backend/open_webui/models/knowledge.py L451) so that even if a caller forgets to check, the model method itself enforces access control.
- Audit all builtin tool functions for similar BOLA patterns
  Systematically review all builtin tools in backend/open_webui/tools/builtin.py that accept resource IDs as parameters to ensure consistent authorization checks, particularly:
  - view_file (L1733)
  - view_knowledge_file (L1849)
  - search_chats (L1928)
  - view_chat (L1980)