The GenAI Engine (formerly known as Arthur Shield) is a tool for evaluating and benchmarking large language models (LLMs) and generative AI workflows. It allows users to measure and monitor response relevance, hallucination rates, token counts, latency, and more. The engine also provides a simple way to add guardrails to your LLM applications and generative AI workflows, with configurable metrics for real-time detection of PII or sensitive-data leakage, hallucination, prompt-injection attempts, toxic language, and other quality issues. It can prevent these risks from degrading the user experience in production and damaging your organization's reputation.
There are several ways to run the GenAI Engine:
- Docker Compose
- Cloudformation for AWS deployment with Elastic Container Service (ECS)
- Helm Chart for Kubernetes
Note: The GenAI Engine is currently limited to the guardrail features. The rest of the features are coming soon!
- Follow the Docker Compose instructions to deploy the engine on your local machine
- Once your `genai-engine` is up and running, navigate to its interactive API documentation at `/docs` via a browser
- Create an API key by referring to the API Authentication Guide. Your admin key is the `GENAI_ENGINE_ADMIN_KEY` in the docker-compose.yml file. In the Docker Compose deployment, the admin key is also enabled to interact with all the API endpoints so you can quickly start exploring the capabilities.
- Give `/docs` access to the API endpoints by entering your new API key via the "Authorize" button at the top right of the page
- Create a new task (use case/LLM application) by expanding the `POST /api/v2/task` endpoint on the `/docs` page. Click "Try it out", provide a task name, and click "Execute".
- Configure evaluation rules for the newly created task with the `POST /api/v2/tasks/{task_id}/rules` endpoint
- Run LLM prompt and generated-response evaluations using the "Task Based Validation" endpoints. For the response validation endpoint, "context" must be provided if the hallucination rule is enabled. Hallucinations are generated responses that are incorrect or unfaithful given a user input and source knowledge (context). The context is often the Retrieval-Augmented Generation (RAG) data from your LLM application.
- Try the default rules, which are global rules that are automatically applied to every task
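The quickstart steps above can be sketched as a short client script. This is a minimal illustration, not official client code: the endpoint paths come from the steps above, but the payload and response field names (e.g. `name`, `id`, `type`) are assumptions — check the `/docs` page on your instance for the authoritative schemas.

```python
"""Minimal sketch of the quickstart flow against a local GenAI Engine.

Endpoint paths follow the quickstart above; payload and response field
names are assumptions -- verify them against /docs on your instance.
"""
import json
import urllib.request

BASE_URL = "http://localhost:3030"
API_KEY = "changeme123"  # your GENAI_ENGINE_ADMIN_KEY or a key you created


def build_request(path: str, payload: dict) -> urllib.request.Request:
    """Build an authenticated JSON POST request for the engine API."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def post(path: str, payload: dict) -> dict:
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(build_request(path, payload)) as resp:
        return json.loads(resp.read())


def run_quickstart() -> str:
    """Create a task, then attach a rule to it (requires a running engine)."""
    task = post("/api/v2/task", {"name": "my-rag-app"})
    task_id = task["id"]  # assumed response field
    # The rule body here is a placeholder -- see POST /api/v2/tasks/{task_id}/rules
    # in /docs for the real rule types and parameters.
    post(f"/api/v2/tasks/{task_id}/rules", {"type": "hallucination"})
    return task_id
```

From there, prompts and responses are validated through the "Task Based Validation" endpoints listed in `/docs`; remember to include `context` in the response-validation payload when the hallucination rule is enabled.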
For more information, refer to the User Guide.
- GenAI Engine client example notebooks
- An example of protecting an Agentic Application with GenAI Engine
- User Guide
- API Documentation (`/docs` on your GenAI Engine instance)
- Git clone the repo
- Install Poetry: Poetry is a Python dependency management framework; `pyproject.toml` is the descriptor.

  ```shell
  pip install poetry
  ```
- Set the proper Python version: currently developed and tested with `3.12.8`

  ```shell
  cd genai-engine
  poetry self add poetry-plugin-shell
  poetry shell && poetry env use 3.12
  ```
- Install dependencies/packages

  ```shell
  poetry install
  ```

  To add (or upgrade) a dependency, use the following command:

  ```shell
  poetry add <package_name>==<package_version>
  ```

  To add (or upgrade) a dev dependency, use the following command:

  ```shell
  poetry add --group dev <package_name>==<package_version>
  ```
A Postgres database is required to run the GenAI Engine. The easiest way to get started is to run Postgres using Docker.
- Install and run Docker for Mac
- `cd` to the `genai-engine` folder
- Run `docker compose up`
- Login with `postgres`/`changeme_pg_password`
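If you want to point other tools (e.g. `psql`) at this database, the connection details can be expressed as a standard libpq-style URL. The helper below is purely illustrative and not part of the repo; it reuses the `POSTGRES_*` variable names from the migration step that follows, with the Docker Compose defaults as fallbacks.

```python
import os

# Defaults matching the POSTGRES_* values used in the migration step;
# override any of them via environment variables of the same name.
DEFAULTS = {
    "POSTGRES_USER": "postgres",
    "POSTGRES_PASSWORD": "changeme_pg_password",
    "POSTGRES_URL": "localhost",
    "POSTGRES_PORT": "5435",
    "POSTGRES_DB": "arthur_genai_engine",
    "POSTGRES_USE_SSL": "false",
}


def env(name: str) -> str:
    """Read a POSTGRES_* variable, falling back to the defaults above."""
    return os.environ.get(name, DEFAULTS[name])


def postgres_dsn() -> str:
    """Compose a standard postgresql:// connection URL (libpq format)."""
    sslmode = "require" if env("POSTGRES_USE_SSL") == "true" else "disable"
    return (
        f"postgresql://{env('POSTGRES_USER')}:{env('POSTGRES_PASSWORD')}"
        f"@{env('POSTGRES_URL')}:{env('POSTGRES_PORT')}"
        f"/{env('POSTGRES_DB')}?sslmode={sslmode}"
    )
```

`psql` accepts such a URL directly, which makes it a quick way to confirm the container is reachable with the expected credentials.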
The Alembic database migration tool needs to be run the first time you set up the engine and every time a new database schema change is added. Make sure the Poetry install is complete and you have a running Postgres instance first.
`cd` to the `genai-engine` folder and run the commands below:

```shell
export POSTGRES_USER=postgres
export POSTGRES_PASSWORD=changeme_pg_password
export POSTGRES_URL=localhost
export POSTGRES_PORT=5435
export POSTGRES_DB=arthur_genai_engine
export POSTGRES_USE_SSL=false
export PYTHONPATH="src:$PYTHONPATH"
poetry run alembic upgrade head
```

- Install the IDE
- Install the recommended extensions
- Python
- Docker
- CloudFormation
- Kubernetes
- Markdown All in One
- Open a new window and select the `genai-engine` folder
- Find the path to the interpreter used by the Poetry environment

  ```shell
  poetry env info --path
  ```

- Open a Python file (e.g. `src/server.py`) and make sure the Python interpreter found in the previous step is selected
- Create a new launch configuration: `Run` -> `Add Configurations` -> `Python Debugger` -> `Python File`. Add the configuration below and adjust the values according to your environment. Please reference the `.env` file.

  ```json
  {
      "name": "GenAI Engine",
      "type": "python",
      "request": "launch",
      "module": "uvicorn",
      "args": [
          "src.server:get_app",
          "--reload"
      ],
      "jinja": true,
      "justMyCode": false,
      "env": {
          "PYTHONPATH": "src",
          "POSTGRES_USER": "postgres",
          "POSTGRES_PASSWORD": "changeme_pg_password",
          "POSTGRES_URL": "localhost",
          "POSTGRES_PORT": "5435",
          "POSTGRES_DB": "arthur_genai_engine",
          "POSTGRES_USE_SSL": "false",
          "GENAI_ENGINE_ENABLE_PERSISTENCE": "enabled",
          "GENAI_ENGINE_ENVIRONMENT": "local",
          "GENAI_ENGINE_ADMIN_KEY": "changeme123",
          "GENAI_ENGINE_INGRESS_URI": "http://localhost:3030",
          "GENAI_ENGINE_OPENAI_PROVIDER": "Azure",
          "GENAI_ENGINE_OPENAI_GPT_NAMES_ENDPOINTS_KEYS": "model_name::https://my_service.openai.azure.com/::my_api_key"
      }
  }
  ```

- `Run` -> `Run Without Debugging`/`Start Debugging`
- Open http://localhost:3030/docs in your web browser and start building!
- Load a dedicated Python environment with a compatible Python version (i.e. `3.12`)
- Install the Python dependencies with Poetry
- Set the following environment variables:

  ```shell
  export POSTGRES_USER=postgres
  export POSTGRES_PASSWORD=changeme_pg_password
  export POSTGRES_URL=localhost
  export POSTGRES_PORT=5435
  export POSTGRES_DB=arthur_genai_engine
  export POSTGRES_USE_SSL=false
  export GENAI_ENGINE_ENABLE_PERSISTENCE=enabled
  export GENAI_ENGINE_ENVIRONMENT=local
  export GENAI_ENGINE_ADMIN_KEY=changeme123
  export GENAI_ENGINE_INGRESS_URI=http://localhost:3030
  export GENAI_ENGINE_OPENAI_PROVIDER=Azure
  export OPENAI_API_VERSION=2023-07-01-preview
  export GENAI_ENGINE_OPENAI_GPT_NAMES_ENDPOINTS_KEYS=model_name::https://my_service.openai.azure.com/::my_api_key
  ```

- Run the server

  ```shell
  export PYTHONPATH="src:$PYTHONPATH"
  poetry run serve
  ```
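`GENAI_ENGINE_OPENAI_GPT_NAMES_ENDPOINTS_KEYS` packs a deployment name, an endpoint, and an API key into one `::`-delimited string. As a rough illustration of the format (the engine's actual parsing, including any support for multiple entries, may differ), it can be unpacked like this:

```python
from typing import NamedTuple


class OpenAIDeployment(NamedTuple):
    """One model_name::endpoint::api_key triple."""
    name: str
    endpoint: str
    api_key: str


def parse_gpt_config(value: str) -> OpenAIDeployment:
    """Unpack a 'name::endpoint::key' string.

    Illustrative only -- the engine's own parser may handle
    edge cases and multiple deployments differently.
    """
    name, endpoint, api_key = value.split("::")
    return OpenAIDeployment(name, endpoint, api_key)
```

Note that the single colon in `https://` does not collide with the `::` delimiter, which is presumably why that separator was chosen.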
Review the CONTRIBUTE.MD document carefully. Make sure the git pre-commit hooks are installed properly.
As part of the pre-commit hook, Pytest unit tests are executed. You can disable them with the following command when making a commit that's not ready for testing:

```shell
SKIP=genai-engine-pytest-check git commit -m "<your message>"
```

The pre-commit hook also runs a check to make sure that all endpoints have been evaluated for access control, using the script below.

```shell
poetry run python routes_security_check.py
```

The script accepts the following arguments:

- `--log-level`: Set the logging level. The default is `INFO`.
- `--short`: Print only the summary
- `--files-summary`: Print the summary of each file
Run the unit tests with the following command:

```shell
poetry run pytest -m "unit_tests"
```

Run the unit tests with coverage:

```shell
poetry run pytest -m "unit_tests" --cov=src --cov-fail-under=79
```

- Make sure you have a running instance of `genai-engine` on your local machine
- Set the environment variables below:

  ```shell
  export REMOTE_TEST_URL=http://localhost:3030
  export REMOTE_TEST_KEY=changeme123
  ```

- Run the shell script below from the `genai-engine` directory:

  ```shell
  ./tests/test_remote.sh
  ```
For running performance tests, we use Locust.
Follow the steps below to run performance tests:
- Install Locust

  ```shell
  poetry install --only performance
  ```
- Run performance tests by referring to the Locust README
