
Issue: security concerns with exec() via multiple agents and Shell tool #5294

@juppytt


Issue you'd like to raise.

TL;DR: The use of exec() in agents can lead to remote code execution vulnerabilities. Some Hugging Face projects use such agents, despite the potential harm of LLM-generated Python code.

#1026 and #814 discuss the security concerns around the use of exec() in the llm_math chain. The comments in #1026 proposed ways to sandbox the code execution, but due to environment constraints, the code was instead patched to replace exec() with numexpr.evaluate() (#2943), restricting execution to mathematical functionality only. This bug was assigned CVE-2023-29374.
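For illustration, here is a minimal stdlib sketch of the idea behind that patch: only walk arithmetic AST nodes and reject everything else. The actual fix delegates to numexpr.evaluate(); safe_eval below is a hypothetical helper written for this issue, not LangChain code.

```python
import ast
import operator

# Hypothetical sketch: restrict evaluation to arithmetic, the same idea
# as replacing exec() with numexpr.evaluate() in the llm_math chain.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def safe_eval(expr: str) -> float:
    """Evaluate a pure arithmetic expression; reject any other syntax."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"disallowed syntax: {type(node).__name__}")
    return walk(ast.parse(expr, mode="eval"))

print(safe_eval("2 ** 10 + 4"))  # 1028
try:
    safe_eval("__import__('os').system('id')")
except ValueError:
    print("rejected")  # function calls are not arithmetic, so this is blocked
```

Unlike exec(), an allowlist of AST node types cannot reach imports, attribute access, or function calls, which is why the math-only restriction closed the RCE.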

As the above issues show, the use of exec() in a chain can pose a significant security risk, especially when the chain runs on a remote machine. This appears to be a common scenario for projects on Hugging Face.

However, in the latest LangChain, exec() is still used in PythonREPLTool and PythonAstREPLTool.
https://github.com/hwchase17/langchain/blob/aec642febb3daa7dbb6a19996aac2efa92bbf1bd/langchain/tools/python/tool.py#L55

https://github.com/hwchase17/langchain/blob/aec642febb3daa7dbb6a19996aac2efa92bbf1bd/langchain/tools/python/tool.py#L102

These tools are called by the Pandas DataFrame Agent, Spark DataFrame Agent, and CSV Agent. They appear to be intentionally designed to pass the LLM output to PythonREPLTool or PythonAstREPLTool, which execute the LLM-generated code on the machine.
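To illustrate the risk pattern, here is a simplified model of such a tool (not the actual PythonAstREPLTool source): exec() the model's output and capture stdout. Whatever code the LLM emits, benign or prompt-injected, runs with the full privileges of the host process.

```python
import contextlib
import io

def run_llm_code(code: str) -> str:
    """Simplified model of a Python REPL tool: exec() the LLM output
    and return captured stdout. This mirrors the pattern, not the
    exact LangChain implementation."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(code)  # arbitrary code, full process privileges
    return buf.getvalue()

# An expected agent step is harmless:
print(run_llm_code("print(1 + 1)"))  # prints "2"

# But an injected step can import anything the process can:
print(run_llm_code("import os; print(type(os).__name__)"))  # prints "module"
```

Nothing distinguishes the two calls at the tool level; the sandboxing decision has to happen before exec() is reached.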

The documentation for these agents explicitly states that they should be used with caution since LLM-generated Python code can be potentially harmful. For instance:
https://github.com/hwchase17/langchain/blob/aec642febb3daa7dbb6a19996aac2efa92bbf1bd/docs/modules/agents/toolkits/examples/pandas.ipynb#L12

Despite this, I have observed several projects on Hugging Face using create_pandas_dataframe_agent and create_csv_agent.

Suggestion:

Fixing this issue the way the llm_math chain was fixed seems challenging.
Simply restricting the LLM-generated code to Pandas and Spark execution might not be sufficient, because numerous malicious tasks can still be performed through those APIs alone. For instance, Pandas can read and write files.
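As a concrete demo of that point (writing a throwaway temp file rather than anything sensitive): code limited strictly to the pandas API can still perform arbitrary file I/O, so an "only pandas allowed" sandbox does not prevent reading or exfiltrating files the process can access.

```python
import os
import tempfile

import pandas as pd

# Hypothetical demo: the file path is arbitrary; the same calls could
# target any path the process has permission to touch.
path = os.path.join(tempfile.gettempdir(), "exfil_demo.csv")

pd.DataFrame({"token": ["s3cr3t"]}).to_csv(path, index=False)  # arbitrary write
leaked = pd.read_csv(path)                                     # arbitrary read
print(leaked["token"].iloc[0])  # prints "s3cr3t"
```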

Meanwhile, it seems crucial to emphasize the security concerns related to LLM-generated code for the overall security of LLM apps. Merely limiting execution to specific frameworks or APIs may not fully address the underlying security risks.
