Description
1. Summary
The Agentic Assistant feature in Langflow executes LLM-generated Python code during its validation phase. Although this phase appears intended to validate generated component code, the implementation reaches dynamic execution sinks and instantiates the generated class server-side.
In deployments where an attacker can access the Agentic Assistant feature and influence the model output, this can result in arbitrary server-side Python execution.
2. Description
2.1 Intended Functionality
The Agentic Assistant endpoints are designed to help users generate and validate components for a flow. Users can submit requests to the assistant, which returns candidate component code for further processing.
A reasonable security expectation is that validation should treat model output as untrusted text and perform only static or side-effect-free checks.
The externally reachable endpoints are (langflow/src/backend/base/langflow/agentic/api/router.py, lines 252 to 297 in f7f4d1e):

    @router.post("/assist")
    async def assist(
        request: AssistantRequest,
        current_user: CurrentActiveUser,
        session: DbSession,
    ) -> dict:
        """Chat with the Langflow Assistant."""
        ctx = await _resolve_assistant_context(request, current_user.id, session)

        logger.info(f"Executing {LANGFLOW_ASSISTANT_FLOW} with {ctx.provider}/{ctx.model_name}")

        return await execute_flow_with_validation(
            flow_filename=LANGFLOW_ASSISTANT_FLOW,
            input_value=request.input_value or "",
            global_variables=ctx.global_vars,
            max_retries=ctx.max_retries,
            user_id=str(current_user.id),
            session_id=ctx.session_id,
            provider=ctx.provider,
            model_name=ctx.model_name,
            api_key_var=ctx.api_key_name,
        )


    @router.post("/assist/stream")
    async def assist_stream(
        request: AssistantRequest,
        current_user: CurrentActiveUser,
        session: DbSession,
    ) -> StreamingResponse:
        """Chat with the Langflow Assistant with streaming progress updates."""
        ctx = await _resolve_assistant_context(request, current_user.id, session)

        return StreamingResponse(
            execute_flow_with_validation_streaming(
                flow_filename=LANGFLOW_ASSISTANT_FLOW,
                input_value=request.input_value or "",
                global_variables=ctx.global_vars,
                max_retries=ctx.max_retries,
                user_id=str(current_user.id),
                session_id=ctx.session_id,
                provider=ctx.provider,
                model_name=ctx.model_name,
                api_key_var=ctx.api_key_name,
            ),
            media_type="text/event-stream",

The request model accepts attacker-influenceable fields such as input_value, flow_id, provider, model_name, session_id, and max_retries (langflow/src/backend/base/langflow/agentic/api/schemas.py, lines 20 to 31 in f7f4d1e):

    class AssistantRequest(BaseModel):
        """Request model for assistant interactions."""

        flow_id: str
        component_id: str | None = None
        field_name: str | None = None
        input_value: str | None = None
        max_retries: int | None = None
        model_name: str | None = None
        provider: str | None = None
        session_id: str | None = None

2.2 Root Cause
In the affected code path, Langflow processes model output through the following chain:
/assist
→ execute_flow_with_validation()
→ execute_flow_file()
→ LLM returns component code
→ extract_component_code()
→ validate_component_code()
→ create_class()
→ generated class is instantiated
The assistant service reaches the validation path here (langflow/src/backend/base/langflow/agentic/services/assistant_service.py, lines 58 to 79 in f7f4d1e):

    result = await execute_flow_file(
        flow_filename=flow_filename,
        input_value=current_input,
        global_variables=global_variables,
        verbose=True,
        user_id=user_id,
        session_id=session_id,
        provider=provider,
        model_name=model_name,
        api_key_var=api_key_var,
    )

    response_text = extract_response_text(result)
    code = extract_component_code(response_text)

    if not code:
        logger.debug("No Python code found in response, returning as-is")
        return result

    logger.info("Validating generated component code...")
    validation = validate_component_code(code)

The code extraction step occurs here (langflow/src/backend/base/langflow/agentic/helpers/code_extraction.py, lines 11 to 53 in f7f4d1e):

    def extract_python_code(text: str) -> str | None:
        """Extract Python code from markdown code blocks.

        Handles both closed (```python ... ```) and unclosed blocks.
        Returns the first code block that appears to be a Langflow component.
        """
        matches = _find_code_blocks(text)
        if not matches:
            return None

        return _find_component_code(matches) or matches[0].strip()


    def _find_code_blocks(text: str) -> list[str]:
        """Find all code blocks in text, handling both closed and unclosed blocks."""
        matches = re.findall(PYTHON_CODE_BLOCK_PATTERN, text, re.IGNORECASE)
        if matches:
            return matches

        matches = re.findall(GENERIC_CODE_BLOCK_PATTERN, text)
        if matches:
            return matches

        return _find_unclosed_code_block(text)


    def _find_unclosed_code_block(text: str) -> list[str]:
        """Handle LLM responses that don't close the code block with ```."""
        for pattern in [UNCLOSED_PYTHON_BLOCK_PATTERN, UNCLOSED_GENERIC_BLOCK_PATTERN]:
            match = re.search(pattern, text, re.IGNORECASE)
            if match:
                code = match.group(1).rstrip("`").strip()
                return [code] if code else []

        return []


    def _find_component_code(matches: list[str]) -> str | None:
        """Find the first match that looks like a Langflow component."""
        for match in matches:
            if "class " in match and "Component" in match:
                return match.strip()
        return None

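The final heuristic inspects only two substrings, so it does not establish that a block is a real Langflow component. The sketch below restates that check in isolation; `looks_like_component` and the sample strings are illustrative names that do not appear in Langflow.

```python
def looks_like_component(block: str) -> bool:
    """Restates the substring test used by _find_component_code()."""
    return "class " in block and "Component" in block


# Hypothetical sample blocks, for illustration only.
benign_component = (
    "class HelloComponent(Component):\n"
    "    display_name = 'Hello'\n"
)

# Not a component at all: the class merely has "Component" in its name,
# and the body contains an arbitrary statement.
unrelated_class = (
    "import os\n"
    "class NotAComponentAtAll:\n"
    "    marker = os.getcwd()\n"
)

plain_script = "print('hello')\n"

results = {
    "benign_component": looks_like_component(benign_component),
    "unrelated_class": looks_like_component(unrelated_class),
    "plain_script": looks_like_component(plain_script),
}
print(results)
```

Because extract_python_code() falls back to matches[0].strip() when no block passes this test, the heuristic only chooses among blocks; it never rejects a response.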
The validation entry point is here (langflow/src/backend/base/langflow/agentic/helpers/validation.py, lines 27 to 47 in f7f4d1e):

    def validate_component_code(code: str) -> ValidationResult:
        """Validate component code by attempting to create and instantiate the class.

        This instantiates the class to trigger __init__ validation checks,
        such as overlapping input/output names.
        """
        class_name = _safe_extract_class_name(code)

        try:
            if class_name is None:
                msg = "Could not extract class name from code"
                raise ValueError(msg)

            # create_class returns the class (not an instance)
            component_class = create_class(code, class_name)

            # Instantiate the class to trigger __init__ validation
            # This catches errors like overlapping input/output names
            component_class()

            return ValidationResult(is_valid=True, code=code, class_name=class_name)

The issue is that this validation path is not purely static. It ultimately invokes create_class() in lfx.custom.validate, where Python code is dynamically executed via exec(...), including both global-scope preparation and class construction. The relevant excerpts are from langflow/src/lfx/src/lfx/custom/validate.py in f7f4d1e.
Lines 241 to 272:

    def create_class(code, class_name):
        """Dynamically create a class from a string of code and a specified class name.

        Args:
            code: String containing the Python code defining the class
            class_name: Name of the class to be created

        Returns:
            A function that, when called, returns an instance of the created class

        Raises:
            ValueError: If the code contains syntax errors or the class definition is invalid
        """
        if not hasattr(ast, "TypeIgnore"):
            ast.TypeIgnore = create_type_ignore_class()

        code = code.replace("from langflow import CustomComponent", "from langflow.custom import CustomComponent")
        code = code.replace(
            "from langflow.interface.custom.custom_component import CustomComponent",
            "from langflow.custom import CustomComponent",
        )

        code = DEFAULT_IMPORT_STRING + "\n" + code
        try:
            module = ast.parse(code)
            exec_globals = prepare_global_scope(module)

            class_code = extract_class_code(module, class_name)
            compiled_class = compile_class_code(class_code)

            return build_class_constructor(compiled_class, exec_globals, class_name)

Lines 394 to 399:

    if definitions:
        combined_module = ast.Module(body=definitions, type_ignores=[])
        compiled_code = compile(combined_module, "<string>", "exec")
        exec(compiled_code, exec_globals)

    return exec_globals

Lines 441 to 443:

    exec_locals = dict(locals())
    exec(compiled_class, exec_globals, exec_locals)
    exec_globals[class_name] = exec_locals[class_name]

As a result, LLM-generated code is treated as executable Python rather than inert data. This means the “validation” step crosses a trust boundary and becomes an execution sink.
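The difference between parsing and executing can be illustrated with a harmless stand-in for generated code. This is a minimal sketch, not Langflow's implementation: `GeneratedComponent`, `side_effects`, and the source string are hypothetical, and the only side effect is a list append.

```python
import ast

side_effects: list[str] = []

# Hypothetical stand-in for model output. The class body and __init__
# record when they run instead of doing anything harmful.
untrusted_source = '''
class GeneratedComponent:
    side_effects.append("class body executed")

    def __init__(self):
        side_effects.append("__init__ executed")
'''

# Parsing builds a syntax tree and runs nothing.
tree = ast.parse(untrusted_source)
assert side_effects == []

# Compiling to a code object also runs nothing.
compiled = compile(tree, "<generated>", "exec")
assert side_effects == []

# exec() runs the class statement, and with it every statement in the class body.
namespace = {"side_effects": side_effects}
exec(compiled, namespace)
after_exec = list(side_effects)

# Instantiating the class then runs __init__.
namespace["GeneratedComponent"]()

print(after_exec)
print(side_effects)
```

In this sketch the class body has already run by the time exec() returns, before any instance exists, so skipping the instantiation step alone would not make such a path inert.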
The streaming path can also reach this sink when the request is classified into the component-generation branch. The relevant excerpts are from langflow/src/backend/base/langflow/agentic/services/assistant_service.py in f7f4d1e.
Lines 142 to 156:

    # Classify intent using LLM (handles multi-language support)
    # This translates the input and determines if user wants to generate a component or ask a question
    intent_result = await classify_intent(
        text=input_value,
        global_variables=global_variables,
        user_id=user_id,
        session_id=session_id,
        provider=provider,
        model_name=model_name,
        api_key_var=api_key_var,
    )

    # Check if this is a component generation request based on LLM classification
    is_component_request = intent_result.intent == "generate_component"
    logger.info(f"Intent classification: {intent_result.intent} (is_component_request={is_component_request})")

Lines 259 to 300:

    # Only extract and validate code for component generation requests
    response_text = extract_response_text(result)
    code = extract_component_code(response_text)

    if not code:
        # No code found even though user asked for component generation
        # Return as plain text response
        yield format_complete_event(result)
        return

    # Check for cancellation before extraction
    if await check_cancelled():
        logger.info("Client disconnected before code extraction, cancelling")
        yield format_cancelled_event()
        return

    # Step 3: Extracting code (only shown when code is found)
    yield format_progress_event(
        "extracting_code",
        attempt,
        max_retries,
        message="Extracting Python code from response...",
    )
    await asyncio.sleep(VALIDATION_UI_DELAY_SECONDS)

    # Check for cancellation before validation
    if await check_cancelled():
        logger.info("Client disconnected before validation, cancelling")
        yield format_cancelled_event()
        return

    # Step 4: Validating
    yield format_progress_event(
        "validating",
        attempt,
        max_retries,
        message="Validating component code...",
    )
    await asyncio.sleep(VALIDATION_UI_DELAY_SECONDS)

    validation = validate_component_code(code)

3. Proof of Concept (PoC)
- Send a request to the Agentic Assistant endpoint.
- Provide input that causes the model to return malicious component code.
- The returned code reaches the validation path.
- During validation, the server dynamically executes the generated Python.
- Arbitrary server-side code execution occurs.
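4. Impact
Attackers who can access the Agentic Assistant feature and influence model output may execute arbitrary Python code on the server.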
5. Exploitability Notes
This issue is most accurately described as an authenticated or feature-reachable code execution vulnerability, rather than an unconditional unauthenticated remote attack.
Severity depends on deployment model:
- In local-only, single-user development setups, the issue may be limited to self-exposure by the operator.
- In shared, team, or internet-exposed deployments, it may be exploitable by other users or attackers who can reach the assistant feature.
The assistant feature depends on an active user context (langflow/src/backend/base/langflow/api/utils/core.py, line 38 in f7f4d1e):

    CurrentActiveUser = Annotated[User, Depends(get_current_active_user)]

Authentication sources include bearer token, cookie, or API key. The relevant excerpts are from langflow/src/backend/base/langflow/services/auth/utils.py in f7f4d1e.
Lines 39 to 53:

    async def __call__(self, request: Request) -> str | None:
        # First, check for explicit Authorization header (for backward compatibility and testing)
        authorization = request.headers.get("Authorization")
        scheme, param = get_authorization_scheme_param(authorization)
        if scheme.lower() == "bearer" and param:
            return param

        # Fall back to cookie (for HttpOnly cookie support in browser-based clients)
        token = request.cookies.get("access_token_lf")
        if token:
            return token

        # If auto_error is True, this would raise an exception
        # Since we set auto_error=False, return None
        return None

Lines 156 to 163:

    async def get_current_user(
        token: Annotated[str | None, Security(oauth2_login)],
        query_param: Annotated[str | None, Security(api_key_query)],
        header_param: Annotated[str | None, Security(api_key_header)],
        db: AsyncSession = Depends(injectable_session_scope),
    ) -> User:
        try:
            return await _auth_service().get_current_user(token, query_param, header_param, db)

Default deployment settings may widen exposure, including AUTO_LOGIN=true and the /api/v1/auto_login endpoint.
langflow/src/lfx/src/lfx/services/settings/auth.py, lines 71 to 87 in f7f4d1e:

    AUTO_LOGIN: bool = Field(
        default=True,  # TODO: Set to False in v2.0
        description=(
            "Enable automatic login with default credentials. "
            "SECURITY WARNING: This bypasses authentication and should only be used in development environments. "
            "Set to False in production. This will default to False in v2.0."
        ),
    )
    """If True, the application will attempt to log in automatically as a super user."""
    skip_auth_auto_login: bool = False
    """If True, the application will skip authentication when AUTO_LOGIN is enabled.
    This will be removed in v2.0"""

    WEBHOOK_AUTH_ENABLE: bool = False
    """If True, webhook endpoints will require API key authentication.
    If False, webhooks run as flow owner without authentication."""

langflow/src/backend/base/langflow/api/v1/login.py, lines 96 to 135 in f7f4d1e:

    @router.get("/auto_login", include_in_schema=False)
    async def auto_login(response: Response, db: DbSession):
        auth_settings = get_settings_service().auth_settings

        if auth_settings.AUTO_LOGIN:
            auth = get_auth_service()
            user_id, tokens = await auth.create_user_longterm_token(db)
            response.set_cookie(
                "access_token_lf",
                tokens["access_token"],
                httponly=auth_settings.ACCESS_HTTPONLY,
                samesite=auth_settings.ACCESS_SAME_SITE,
                secure=auth_settings.ACCESS_SECURE,
                expires=None,  # Set to None to make it a session cookie
                domain=auth_settings.COOKIE_DOMAIN,
            )

            user = await get_user_by_id(db, user_id)

            if user:
                if user.store_api_key is None:
                    user.store_api_key = ""

                response.set_cookie(
                    "apikey_tkn_lflw",
                    str(user.store_api_key),  # Ensure it's a string
                    httponly=auth_settings.ACCESS_HTTPONLY,
                    samesite=auth_settings.ACCESS_SAME_SITE,
                    secure=auth_settings.ACCESS_SECURE,
                    expires=None,  # Set to None to make it a session cookie
                    domain=auth_settings.COOKIE_DOMAIN,
                )

                if get_settings_service().settings.agentic_experience:
                    from langflow.api.utils.mcp.agentic_mcp import initialize_agentic_user_variables

                    await initialize_agentic_user_variables(user.id, db)

            return tokens

6. Patch Recommendation
- Remove all dynamic execution from the validation path.
- Ensure validation is strictly static and side-effect-free.
- Treat all LLM output as untrusted input.
- If code generation must be supported, require explicit approval and run it in a hardened sandbox isolated from the main server process.
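As one possible shape for a static check, the sketch below parses candidate code with `ast` and inspects the tree without compiling or running it. It is an illustration under stated assumptions rather than a proposed patch: `StaticCheckResult` and `static_validate_component` are hypothetical names, and the rule that a base class name ends in `Component` is a simplification chosen for the example.

```python
from __future__ import annotations

import ast
from dataclasses import dataclass


@dataclass
class StaticCheckResult:
    """Hypothetical result type for this sketch."""

    is_valid: bool
    class_name: str | None = None
    error: str | None = None


def static_validate_component(code: str) -> StaticCheckResult:
    """Inspect candidate code as data; nothing in it is compiled or executed."""
    try:
        tree = ast.parse(code)
    except SyntaxError as exc:
        return StaticCheckResult(False, error=f"syntax error: {exc.msg}")

    # Look only at top-level class definitions.
    for node in tree.body:
        if not isinstance(node, ast.ClassDef):
            continue
        for base in node.bases:
            # Handles `Component` (ast.Name) and `pkg.Component` (ast.Attribute).
            if isinstance(base, ast.Name):
                base_name = base.id
            else:
                base_name = getattr(base, "attr", "")
            if base_name.endswith("Component"):
                return StaticCheckResult(True, class_name=node.name)

    return StaticCheckResult(False, error="no class with a *Component base found")


# Hypothetical candidate with a module-level statement that is never run here.
candidate = (
    "import os\n"
    "os.environ['STATIC_CHECK_MARKER'] = '1'\n"
    "class Demo(Component):\n"
    "    pass\n"
)
print(static_validate_component(candidate))
```

A check like this cannot detect problems that only surface at instantiation, such as the overlapping input/output names mentioned in the validate_component_code() docstring; those would need their own static rules or an isolated execution environment. Passing the check also says nothing about whether the code is safe to run later.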
Discovered by: @kexinoh (https://github.com/kexinoh, works at Tencent Zhuque Lab)