A small example project demonstrating DeepEval, an open-source LLM evaluation framework for Python.
- Running built-in metrics (answer relevancy, faithfulness, hallucination) against LLM outputs
- Writing evaluation cases with
LLMTestCase - Running evaluations as
pytest-style tests
-
Create and activate a virtual environment:
python -m venv .venv .\.venv\Scripts\Activate.ps1 -
Install dependencies:
pip install -r requirements.txt -
Copy
.env.exampleto.envand add your OpenAI API key (DeepEval uses an LLM judge by default):Copy-Item .env.example .env
deepeval test run tests/test_example.pyexample.py— minimal standalone evaluation usingevaluate()test_example.py— pytest-style evaluation usingassert_testrequirements.txt— pinned dependencies.env.example— template for required environment variables