The GenAI Engine (formerly known as Arthur Shield) is a tool for evaluating and benchmarking large language models (LLMs) and generative AI workflows. It allows users to measure and monitor response relevance, hallucination rates, token counts, latency, and more. The engine also provides a simple way to add guardrails to your LLM applications and generative AI workflows, with configurable metrics for real-time detection of PII or sensitive-data leakage, hallucination, prompt-injection attempts, toxic language, and other quality issues. It can prevent these risks from degrading the user experience in production and damaging your organization's reputation.
There are several ways to run the GenAI Engine:
- Docker Compose
- Cloudformation for AWS deployment with Elastic Container Service (ECS)
- Helm Chart for Kubernetes
Note: The GenAI Engine is currently limited to the guardrail features. The rest of the features are coming soon!
- Follow the Docker Compose instructions to deploy the engine on your local machine
- Once your `genai-engine` is up and running, navigate to its interactive API documentation at `/docs` via a browser
- Create an API key by referring to the API Authentication Guide. Your admin key is the `GENAI_ENGINE_ADMIN_KEY` in the docker-compose.yml file. In the Docker Compose deployment, the admin key is also enabled to interact with all the API endpoints so you can quickly start exploring the capabilities.
- Give `/docs` access to the API endpoints by entering your new API key via the "Authorize" button at the top right of the page
- Create a new task (use case/LLM application) by expanding the `POST /api/v2/task` endpoint on the `/docs` page. Click "Try it out", provide a task name, and click "Execute".
- Configure evaluation rules for the newly created task with the `POST /api/v2/tasks/{task_id}/rules` endpoint
- Run LLM prompt and generated-response evaluations using the "Task Based Validation" endpoints. For the response validation endpoint, "context" must be provided if the hallucination rule is enabled. Hallucinations are generated responses that are incorrect or unfaithful given a user input and source knowledge (context). The context is often the Retrieval-Augmented Generation (RAG) data from your LLM application.
- Try the default rules, which are global rules that are automatically applied to every task
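The quickstart steps above can be sketched as a short client script. This is a minimal illustration, not official client code: the endpoint paths come from the steps above, but the payload and response field names (e.g. `name`, `id`, `type`) are assumptions — check the `/docs` page on your instance for the authoritative schemas.

```python
"""Minimal sketch of the quickstart flow against a local GenAI Engine.

Endpoint paths follow the quickstart above; payload and response field
names are assumptions -- verify them against /docs on your instance.
"""
import json
import urllib.request

BASE_URL = "http://localhost:3030"
API_KEY = "changeme123"  # your GENAI_ENGINE_ADMIN_KEY or a key you created


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build an authenticated JSON POST request for the engine API."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def post(path: str, payload: dict) -> dict:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(build_request(path, payload)) as resp:
        return json.loads(resp.read())


def run_quickstart() -> str:
    """Create a task, then attach a rule to it (requires a running engine)."""
    task = post("/api/v2/task", {"name": "my-rag-app"})
    task_id = task["id"]  # assumed response field
    # The rule body here is a placeholder -- see POST /api/v2/tasks/{task_id}/rules
    # in /docs for the real rule types and parameters.
    post(f"/api/v2/tasks/{task_id}/rules", {"type": "hallucination"})
    return task_id
```

From there, prompts and responses are validated through the "Task Based Validation" endpoints listed in `/docs`; remember to include `context` in the response-validation payload when the hallucination rule is enabled.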
For more information, refer to the User Guide.
- GenAI Engine client example notebooks
- An example of protecting an Agentic Application with GenAI Engine
- User Guide
- API Documentation (`/docs` on your GenAI Engine instance)
- Git clone the repo
- Install Poetry: Poetry is a Python dependency management framework; `pyproject.toml` is the descriptor.

  ```shell
  pip install poetry
  ```
- Set the proper Python version: currently developed and tested with `3.12.8`

  ```shell
  cd genai-engine
  poetry self add poetry-plugin-shell
  poetry shell && poetry env use 3.12
  ```
- Install dependencies/packages

  ```shell
  poetry install
  ```

  To add (or upgrade) a dependency, use the following command:

  ```shell
  poetry add <package_name>==<package_version>
  ```

  To add (or upgrade) a dev dependency, use the following command:

  ```shell
  poetry add --group dev <package_name>==<package_version>
  ```
A Postgres database is required to run the GenAI Engine. The easiest way to get started is to run Postgres using Docker.
- Install and run Docker for Mac
- `cd` to the `genai-engine` folder
- Run `docker compose up`
- Login with `postgres`/`changeme_pg_password`
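If you want to point other tools (e.g. `psql`) at this database, the connection details can be expressed as a standard libpq-style URL. The helper below is purely illustrative and not part of the repo; it reuses the `POSTGRES_*` variable names from the migration step that follows, with the Docker Compose defaults as fallbacks.

```python
import os

# Defaults matching the POSTGRES_* values used in the migration step;
# override any of them via environment variables of the same name.
DEFAULTS = {
    "POSTGRES_USER": "postgres",
    "POSTGRES_PASSWORD": "changeme_pg_password",
    "POSTGRES_URL": "localhost",
    "POSTGRES_PORT": "5435",
    "POSTGRES_DB": "arthur_genai_engine",
    "POSTGRES_USE_SSL": "false",
}


def env(name: str) -> str:
    """Read a POSTGRES_* variable, falling back to the defaults above."""
    return os.environ.get(name, DEFAULTS[name])


def postgres_dsn() -> str:
    """Compose a standard postgresql:// connection URL (libpq format)."""
    sslmode = "require" if env("POSTGRES_USE_SSL") == "true" else "disable"
    return (
        f"postgresql://{env('POSTGRES_USER')}:{env('POSTGRES_PASSWORD')}"
        f"@{env('POSTGRES_URL')}:{env('POSTGRES_PORT')}"
        f"/{env('POSTGRES_DB')}?sslmode={sslmode}"
    )
```

`psql` accepts such a URL directly, which makes it a quick way to confirm the container is reachable with the expected credentials.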
The Alembic database migration tool needs to be run the first time you set up the engine and every time a new database schema change is added. Make sure the Poetry install is complete and you have a running Postgres instance first.
`cd` to the `genai-engine` folder and run the commands below:

```shell
export POSTGRES_USER=postgres
export POSTGRES_PASSWORD=changeme_pg_password
export POSTGRES_URL=localhost
export POSTGRES_PORT=5435
export POSTGRES_DB=arthur_genai_engine
export POSTGRES_USE_SSL=false
export PYTHONPATH="src:$PYTHONPATH"
poetry run alembic upgrade head
```

- Install the IDE
- Install the recommended extensions
- Python
- Docker
- CloudFormation
- Kubernetes
- Markdown All in One
- Open a new window and select the `genai-engine` folder
- Find the path to the interpreter used by the Poetry environment

  ```shell
  poetry env info --path
  ```

- Open a Python file (e.g. `src/server.py`) and make sure the Python interpreter found in the previous step is selected
- Create a new launch configuration: `Run` -> `Add Configurations` -> `Python Debugger` -> `Python File`. Add the configuration below and adjust the values according to your environment. Please reference the `.env` file.

  ```json
  {
      "name": "GenAI Engine",
      "type": "python",
      "request": "launch",
      "module": "uvicorn",
      "args": [
          "src.server:get_app",
          "--reload"
      ],
      "jinja": true,
      "justMyCode": false,
      "env": {
          "PYTHONPATH": "src",
          "POSTGRES_USER": "postgres",
          "POSTGRES_PASSWORD": "changeme_pg_password",
          "POSTGRES_URL": "localhost",
          "POSTGRES_PORT": "5435",
          "POSTGRES_DB": "arthur_genai_engine",
          "POSTGRES_USE_SSL": "false",
          "GENAI_ENGINE_ENABLE_PERSISTENCE": "enabled",
          "GENAI_ENGINE_ENVIRONMENT": "local",
          "GENAI_ENGINE_ADMIN_KEY": "changeme123",
          "GENAI_ENGINE_INGRESS_URI": "http://localhost:3030",
          "GENAI_ENGINE_OPENAI_PROVIDER": "Azure",
          "GENAI_ENGINE_OPENAI_GPT_NAMES_ENDPOINTS_KEYS": "model_name::https://my_service.openai.azure.com/::my_api_key"
      }
  }
  ```

- `Run` -> `Run Without Debugging`/`Start Debugging`
- Open http://localhost:3030/docs in your web browser and start building!
- Load a dedicated Python environment with a compatible Python version (i.e. `3.12`)
- Install the Python dependencies with Poetry
- Set the following environment variables:

  ```shell
  export POSTGRES_USER=postgres
  export POSTGRES_PASSWORD=changeme_pg_password
  export POSTGRES_URL=localhost
  export POSTGRES_PORT=5435
  export POSTGRES_DB=arthur_genai_engine
  export POSTGRES_USE_SSL=false
  export GENAI_ENGINE_ENABLE_PERSISTENCE=enabled
  export GENAI_ENGINE_ENVIRONMENT=local
  export GENAI_ENGINE_ADMIN_KEY=changeme123
  export GENAI_ENGINE_INGRESS_URI=http://localhost:3030
  export GENAI_ENGINE_OPENAI_PROVIDER=Azure
  export OPENAI_API_VERSION=2023-07-01-preview
  export GENAI_ENGINE_OPENAI_GPT_NAMES_ENDPOINTS_KEYS=model_name::https://my_service.openai.azure.com/::my_api_key
  ```

- Run the server

  ```shell
  export PYTHONPATH="src:$PYTHONPATH"
  poetry run serve
  ```
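`GENAI_ENGINE_OPENAI_GPT_NAMES_ENDPOINTS_KEYS` packs a deployment name, an endpoint, and an API key into one `::`-delimited string. As a rough illustration of the format (the engine's actual parsing, including any support for multiple entries, may differ), it can be unpacked like this:

```python
from typing import NamedTuple


class OpenAIDeployment(NamedTuple):
    """One model_name::endpoint::api_key triple."""
    name: str
    endpoint: str
    api_key: str


def parse_gpt_config(value: str) -> OpenAIDeployment:
    """Unpack a 'name::endpoint::key' string.

    Illustrative only -- the engine's own parser may handle
    edge cases and multiple deployments differently.
    """
    name, endpoint, api_key = value.split("::")
    return OpenAIDeployment(name, endpoint, api_key)
```

Note that the single colon in `https://` does not collide with the `::` delimiter, which is presumably why that separator was chosen.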
Review the CONTRIBUTE.MD document carefully. Make sure the git pre-commit hooks are installed properly.
As part of the pre-commit hook, Pytest unit tests are executed. You can disable them with the following command when making a commit that's not ready for testing:

```shell
SKIP=genai-engine-pytest-check git commit -m "<your message>"
```

The pre-commit hook also runs a check to make sure that all endpoints have been evaluated for access control, using the script below.

```shell
poetry run python routes_security_check.py
```

The script accepts the following arguments:

- `--log-level`: Set the logging level. The default is `INFO`.
- `--short`: Print only the summary
- `--files-summary`: Print the summary of each file
Run the unit tests with the following command:

```shell
poetry run pytest -m "unit_tests"
```

Run the unit tests with coverage:

```shell
poetry run pytest -m "unit_tests" --cov=src --cov-fail-under=79
```

- Make sure you have a running instance of `genai-engine` on your local machine
- Set the environment variables below:

  ```shell
  export REMOTE_TEST_URL=http://localhost:3030
  export REMOTE_TEST_KEY=changeme123
  ```

- Run the shell script below from the `genai-engine` directory:

  ```shell
  ./tests/test_remote.sh
  ```
For running performance tests, we use Locust.
Follow the steps below to run performance tests:
- Install Locust

  ```shell
  poetry install --only performance
  ```
- Run performance tests by referring to the Locust README
