
Issue: security concerns with exec() via multiple agents and Shell tool #5294

@juppytt


Issue you'd like to raise.

TL;DR: The use of exec() in agents can lead to remote code execution vulnerabilities. Some Hugging Face projects use such agents, despite the potential harm of LLM-generated Python code.

#1026 and #814 discuss the security concerns around the use of exec() in the llm_math chain. The comments in #1026 proposed ways to sandbox the code execution, but due to environment constraints, the code was instead patched to replace exec() with numexpr.evaluate() (#2943), restricting execution to mathematical functionality only. This bug was assigned CVE-2023-29374.
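For illustration, here is a minimal stdlib sketch of the idea behind that patch: only walk arithmetic AST nodes and reject everything else. The actual fix delegates to numexpr.evaluate(); safe_eval below is a hypothetical helper written for this issue, not LangChain code.

```python
import ast
import operator

# Hypothetical sketch: restrict evaluation to arithmetic, the same idea
# as replacing exec() with numexpr.evaluate() in the llm_math chain.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def safe_eval(expr: str) -> float:
    """Evaluate a pure arithmetic expression; reject any other syntax."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"disallowed syntax: {type(node).__name__}")
    return walk(ast.parse(expr, mode="eval"))

print(safe_eval("2 ** 10 + 4"))  # 1028
try:
    safe_eval("__import__('os').system('id')")
except ValueError:
    print("rejected")  # function calls are not arithmetic, so this is blocked
```

Unlike exec(), an allowlist of AST node types cannot reach imports, attribute access, or function calls, which is why the math-only restriction closed the RCE.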

As the above issues show, the use of exec() in a chain can pose a significant security risk, especially when the chain runs on a remote machine. This appears to be a common scenario for projects on Hugging Face.

However, in the latest LangChain, exec() is still used in PythonREPLTool and PythonAstREPLTool.
https://github.com/hwchase17/langchain/blob/aec642febb3daa7dbb6a19996aac2efa92bbf1bd/langchain/tools/python/tool.py#L55

https://github.com/hwchase17/langchain/blob/aec642febb3daa7dbb6a19996aac2efa92bbf1bd/langchain/tools/python/tool.py#L102

These tools are called by the Pandas DataFrame Agent, Spark DataFrame Agent, and CSV Agent. They appear to be intentionally designed to pass the LLM output to PythonREPLTool or PythonAstREPLTool, which execute the LLM-generated code on the machine.
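To illustrate the risk pattern, here is a simplified model of such a tool (not the actual PythonAstREPLTool source): exec() the model's output and capture stdout. Whatever code the LLM emits, benign or prompt-injected, runs with the full privileges of the host process.

```python
import contextlib
import io

def run_llm_code(code: str) -> str:
    """Simplified model of a Python REPL tool: exec() the LLM output
    and return captured stdout. This mirrors the pattern, not the
    exact LangChain implementation."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code)  # arbitrary code, full process privileges
    return buf.getvalue()

# An expected agent step is harmless:
print(run_llm_code("print(1 + 1)"))  # prints "2"

# But an injected step can import anything the process can:
print(run_llm_code("import os; print(type(os).__name__)"))  # prints "module"
```

Nothing distinguishes the two calls at the tool level; the sandboxing decision has to happen before exec() is reached.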

The documentation for these agents explicitly states that they should be used with caution since LLM-generated Python code can be potentially harmful. For instance:
https://github.com/hwchase17/langchain/blob/aec642febb3daa7dbb6a19996aac2efa92bbf1bd/docs/modules/agents/toolkits/examples/pandas.ipynb#L12

Despite this, I have observed several projects on Hugging Face using create_pandas_dataframe_agent and create_csv_agent.

Suggestion:

Fixing this issue the way the llm_math chain was fixed seems challenging.
Simply restricting the LLM-generated code to Pandas and Spark execution might not be sufficient, because numerous malicious tasks can still be performed through those APIs alone. For instance, Pandas can read and write files.
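As a concrete demo of that point (writing a throwaway temp file rather than anything sensitive): code limited strictly to the pandas API can still perform arbitrary file I/O, so an "only pandas allowed" sandbox does not prevent reading or exfiltrating files the process can access.

```python
import os
import tempfile

import pandas as pd

# Hypothetical demo: the file path is arbitrary; the same calls could
# target any path the process has permission to touch.
path = os.path.join(tempfile.gettempdir(), "exfil_demo.csv")

pd.DataFrame({"token": ["s3cr3t"]}).to_csv(path, index=False)  # arbitrary write
leaked = pd.read_csv(path)                                     # arbitrary read
print(leaked["token"].iloc[0])  # prints "s3cr3t"
```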

Meanwhile, it seems crucial to emphasize the security concerns related to LLM-generated code for the overall security of LLM apps. Merely limiting execution to specific frameworks or APIs may not fully address the underlying security risks.
