Deployment Guide

Prerequisites

If you do not have an Azure Subscription, you can sign up for a free Azure account and create an Azure Subscription.

To deploy this Azure environment successfully, your Azure account (the account you authenticate with) must have the following permissions and prerequisites on the targeted Azure Subscription:

You can view the permissions for your account and subscription in the Azure portal: click 'Subscriptions' under 'Navigation' and choose your subscription from the list. If you cannot find the subscription, make sure no filters are selected. After selecting your subscription, select 'Access control (IAM)' to see the roles assigned to your account for this subscription. For more information about a role, go to the 'Role assignments' tab, search by your account name, and click the role you want to inspect.

Check the Azure Products by Region page and select a region where the following services are available:

Here are some examples of the regions where the services are available: East US, East US2, Japan East, UK South, Sweden Central.

Important Note for PowerShell Users

If PowerShell refuses to run the scripts because they are not digitally signed, you can temporarily adjust the ExecutionPolicy by running the following command in an elevated PowerShell session:

Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass

This will allow the scripts to run for the current session without permanently changing your system's policy.

Deployment Options & Steps

Deployment Steps

Pick from the options below to see step-by-step instructions for GitHub Codespaces, VS Code Dev Containers, Local Environments, and Bicep deployments.

Deploy in GitHub Codespaces

GitHub Codespaces

You can run this template virtually by using GitHub Codespaces. The button will open a web-based VS Code instance in your browser:

  1. Open the template (this may take several minutes):

    Open in GitHub Codespaces

  2. Open a terminal window.

  3. Continue with the steps in Deploying with AZD.

Deploy in VS Code

VS Code Dev Containers

A related option is VS Code Dev Containers, which will open the project in your local VS Code using the Dev Containers extension:

  1. Start Docker Desktop (install it first if it is not already installed).

  2. Open the project:

    Open in Dev Containers

  3. In the VS Code window that opens, once the project files show up (this may take several minutes), open a terminal window.

  4. Continue with the steps in Deploying with AZD.

Deploy in your local Environment

Local Environment

If you're not using one of the above options for opening the project, then you'll need to:

  1. Make sure the following tools are installed:

    • Azure Developer CLI (azd): install or update to the latest version. Instructions can be found on the linked page.
    • Python 3.9+
    • Git
    • [Windows only] The latest version of PowerShell, needed only for local application development on the Windows operating system. Make sure the path to the PowerShell executable pwsh.exe is added to the PATH variable.
  2. Clone the repository or download the project code via command-line:

    azd init -t get-started-with-ai-chat
  3. Open the project folder in your terminal or editor.

  4. Continue with the steps in Deploying with AZD.
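
Before continuing, it can help to confirm that the tools listed in step 1 are actually reachable. The following is a minimal sketch, not part of the template: the helper names `missing_tools` and `python_is_supported` are hypothetical, and the check only looks for executables on PATH rather than verifying their versions.

```python
import shutil
import sys

# Command-line tools named in the list above; "pwsh" is only needed on Windows.
REQUIRED_TOOLS = ["azd", "git"]


def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that are not found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def python_is_supported(version_info=sys.version_info, minimum=(3, 9)):
    """Check an interpreter version against the Python 3.9+ requirement."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    tools = REQUIRED_TOOLS + (["pwsh"] if sys.platform == "win32" else [])
    print("Missing tools:", missing_tools(tools) or "none")
    print("Python 3.9+:", python_is_supported())
```

An empty "Missing tools" result only means the executables were found; updating azd to the latest version is still a separate step.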

Develop with Local Development Server

You can optionally use a local development server to test app changes locally. Make sure you have deployed the app to Azure before running the development server.

  1. Create a Python virtual environment and activate it.

    On Windows:

    python -m venv .venv
    .venv\scripts\activate

    On Linux or macOS:

    python3 -m venv .venv
    source .venv/bin/activate
  2. Navigate to the src directory:

    cd src
  3. Install required Python packages:

    python -m pip install -r requirements.txt
  4. Install Node.js (v20 or later).

  5. Install pnpm

  6. From the src directory, navigate to the frontend directory and set up the React UI:

    cd frontend
    pnpm run setup
  7. Fill in the environment variables in .env.

  8. (Optional) If you have changes in src/frontend, execute:

    pnpm build

    The build output will be placed in the ../api/static/react directory, where the backend can serve it.

  9. (Optional) If you have changes in gunicorn.conf.py, execute:

    python gunicorn.conf.py    
  10. Return to the src directory and run the local server:

    python -m uvicorn "api.main:create_app" --factory --reload
  11. Click 'http://127.0.0.1:8000' in the terminal, which should open a new tab in the browser.

  12. Enter your message in the box.
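
Step 7 is easy to leave half done, and a missing value typically only surfaces once the server starts. As a hedged sketch, the helper below lists keys in a .env-style file whose values are still empty. The function name `empty_env_keys` and the sample key names are hypothetical; the guide does not list the actual variables, and the parsing here covers only simple `KEY=value` lines.

```python
def empty_env_keys(text):
    """Return the keys in .env-style text whose value is empty."""
    empty = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Treat a bare or quoted-empty value ("" or '') as unset.
        if not value.strip().strip("\"'"):
            empty.append(key.strip())
    return empty


# Hypothetical sample content; real key names come from the project's .env.
sample = 'EXAMPLE_ENDPOINT=https://example.invalid\n# comment\nEXAMPLE_KEY=\nEXAMPLE_NAME=""\n'
print(empty_env_keys(sample))
```

To check a real file, pass its contents instead of the sample, for example `empty_env_keys(open(".env").read())`, assuming .env is in the current directory.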


Consider the following settings during deployment:

Configurable Deployment Settings

When you start a deployment, most parameters will have default values. You can change the following default settings:

Setting | Description | Default value
--- | --- | ---
Existing Project Resource ID | Specify an existing project resource ID to be used instead of provisioning a new Microsoft Foundry project and Foundry Tools. |
Azure Region | Select a region with quota which supports your selected model. |
Model | Choose from the list of models for your selected region. | gpt-4o-mini
Model Format | Choose from OpenAI or Microsoft, depending on your model. | OpenAI
Model Deployment Capacity | Configure capacity for your model. | 80k
Embedding Model | Choose from text-embedding-3-large, text-embedding-3-small, and text-embedding-ada-002. This may only be deployed if Azure AI Search is enabled. | text-embedding-3-small
Embedding Model Capacity | Configure capacity for your embedding model. | 50k
Knowledge Retrieval | Choose OpenAI's file search or Azure AI Search Index. | OpenAI's file search

For a detailed description of customizable fields and instructions, view the deployment customization guide.

Configurable Knowledge Retrieval

By default, the template deploys OpenAI's file search for the agent's knowledge retrieval. An agent can also search an index deployed in an Azure AI Search resource. The semantic index search is a hybrid search: it uses an LLM to find relevant context in the provided index and also performs embedding similarity search. The index is built from the embeddings.csv file, which contains the embedding vectors followed by the contexts. To use index search, set the environment variable USE_AZURE_AI_SEARCH_SERVICE to true before running azd up; the Azure AI Search resource will then be deployed and used. For more information on Azure AI Search, see the Azure AI Search Setup Guide.

To specify the model to be deployed (e.g. gpt-4o-mini, gpt-4o) when azd up is called, set the following environment variables:

azd env set AZURE_AI_CHAT_MODEL_NAME <MODEL_NAME>
azd env set AZURE_AI_CHAT_MODEL_VERSION <MODEL_VERSION>

Configure Tracing and Azure Monitor

To enable tracing to Azure Monitor, set the following environment variable:

azd env set ENABLE_AZURE_MONITOR_TRACING true

To enable message contents to be included in the traces, set the following environment variable. Note that the messages may contain personally identifiable information.

azd env set AZURE_TRACING_GEN_AI_CONTENT_RECORDING_ENABLED true

You can view the App Insights tracing in Microsoft Foundry. Select your project on the Microsoft Foundry page and then click 'Tracing'.
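
The model and tracing settings above are each a separate `azd env set KEY VALUE` call. As a rough sketch, they can be collected in one place and turned into commands. The helper names `azd_env_set_commands` and `apply_settings` are hypothetical, and the model version is left as the same placeholder the guide uses; by default nothing is executed, the commands are only printed.

```python
import subprocess


def azd_env_set_commands(settings):
    """Build one `azd env set KEY VALUE` argument list per setting."""
    commands = []
    for key, value in settings.items():
        # azd stores strings, so booleans are written as true/false.
        if isinstance(value, bool):
            value = "true" if value else "false"
        commands.append(["azd", "env", "set", key, str(value)])
    return commands


def apply_settings(settings, dry_run=True):
    """Print the commands, or run them when dry_run is False."""
    for command in azd_env_set_commands(settings):
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)


apply_settings({
    "AZURE_AI_CHAT_MODEL_NAME": "gpt-4o-mini",
    "AZURE_AI_CHAT_MODEL_VERSION": "<MODEL_VERSION>",  # placeholder, fill in
    "ENABLE_AZURE_MONITOR_TRACING": True,
})
```

Calling `apply_settings(..., dry_run=False)` would run the commands against the currently selected azd environment, so it assumes azd is installed and an environment already exists.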

Quota Recommendations

The default model capacity in deployment is 80k tokens per minute for the chat model and 50k for the embedding model used by AI Search. For optimal performance, it is recommended to increase the chat model capacity to 100k tokens per minute. You can change the capacity by following the steps in setting capacity and deployment SKU.

  • Navigate to the home screen of the Microsoft Foundry Portal
  • Select the Quota Management button at the bottom of the home screen
  • In the Quota tab, click the GlobalStandard dropdown and select the model and region you are using for this accelerator to see your available quota. Please note gpt-4o-mini and text-embedding-3-small are used as default.
  • Request more quota or delete any unused model deployments as needed.
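
The capacity values in this guide are written as "80k" or "100k" tokens, while the Deploying with AZD section sets AZURE_AI_CHAT_DEPLOYMENT_CAPACITY to 100 for the 100k recommendation. That pairing suggests the setting is expressed in units of 1,000 tokens per minute. The sketch below makes that conversion explicit; the function name `capacity_units` is hypothetical and the unit interpretation is an inference from the guide, not a documented API.

```python
def capacity_units(tokens_per_minute):
    """Convert tokens per minute to capacity units of 1,000 tokens per minute."""
    if tokens_per_minute <= 0 or tokens_per_minute % 1000:
        raise ValueError("expected a positive multiple of 1000")
    return tokens_per_minute // 1000


# Default and recommended chat model capacities named in this section.
print(capacity_units(80_000))
print(capacity_units(100_000))
```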

Deploying with AZD

Once you've opened the project in Codespaces, in Dev Containers, or locally, you can deploy it to Azure by following these steps.

  1. (Optional) If you would like to customize the deployment to disable resources, customize resource names, customize the models or increase quota, you can follow those steps now.

    ⚠️ NOTE! For optimal performance, the recommended quota is 100k tokens per minute. If you have the capacity, we recommend increasing the quota by running the following command:

    azd env set AZURE_AI_CHAT_DEPLOYMENT_CAPACITY 100

    ⚠️ If you do not increase your quota, you may encounter rate limit issues. If needed, you can increase the quota after deployment by editing your model in the Models and Endpoints tab of the Microsoft Foundry Portal.

  2. Provision resources, build a Docker image from the src folder, and deploy:

    azd up
  3. You will be prompted to provide an azd environment name (like "azureaiapp"), select a subscription from your Azure account, and select a location which has quota for all the resources. Then, it will provision the resources in your account and deploy the latest code.

    • For guidance on selecting a region with quota and model availability, follow the instructions in the quota recommendations section, and check the list of models to confirm that your model is available in your selected region.
    • This deployment will take 7-10 minutes to provision the resources in your account and set up the solution with sample data.
    • If you get an error or timeout with deployment, changing the location can help, as there may be availability constraints for the resources. You can do this by running azd down and deleting the .azure folder from your code, and then running azd up again and selecting a new region.

    NOTE! If you get authorization-failed or permission-related errors during the deployment, please refer to the Azure account requirements in the Prerequisites section. If you were recently granted these permissions, it may take a few minutes for the authorization to apply.

  4. When azd has finished deploying, you'll see an endpoint URI in the command output. Visit that URI, and you should see the app! 🎉

    • You can view information about your deployment with:

      azd show
  5. (Optional) Now that your app has deployed, you can view your resources in the Azure Portal and your deployments in Microsoft Foundry.

    • In the Azure Portal, navigate to your environment's resource group. The name will be rg-[your environment name]. Here, you should see your container app, storage account, and all of the other resources that are created in the deployment.
    • In the Microsoft Foundry Portal, select your project. If you navigate to the Playgrounds tab followed by Chat playground, you should be able to view your new deployment. If you navigate to the Models and Endpoints tab, you should see your AI Services connection with your model deployments.
  6. (Optional) You can use a local development server to test app changes locally. To do so, follow the steps in Develop with Local Development Server after your app is deployed.

  7. (Optional) Follow this tutorial to build your changes into a Docker image and deploy to Azure Container App.
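
The region-change recovery described in step 3 (run azd down, delete the .azure folder, then run azd up again) can be sketched as an ordered plan. This is an illustration under assumptions, not part of the template: the function names `redeploy_plan` and `execute_plan` are hypothetical, the script is assumed to run from the project folder, and the default is a dry run because deleting .azure discards the local azd environment state.

```python
import shutil
import subprocess
from pathlib import Path


def redeploy_plan(project_dir="."):
    """Return the ordered steps for redeploying into a different region."""
    azure_dir = Path(project_dir) / ".azure"
    return [
        ("run", ["azd", "down"]),   # tear down the failed deployment
        ("remove", azure_dir),      # drop local azd environment state
        ("run", ["azd", "up"]),     # start over and pick a new region
    ]


def execute_plan(plan, dry_run=True):
    """Print each step, or carry it out when dry_run is False."""
    for action, target in plan:
        if dry_run:
            print(action, target)
        elif action == "run":
            # azd prompts interactively; stdin/stdout are inherited.
            subprocess.run(target, check=True)
        else:
            shutil.rmtree(target, ignore_errors=True)


execute_plan(redeploy_plan())
```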

This guide provides step-by-step instructions for deploying your application using Azure Container Registry (ACR) and Azure Container Apps.

There are several ways to deploy the solution. You can deploy to run in Azure in one click, or manually, or you can deploy locally.

When deployment is complete, follow the steps in Set Up Authentication in Azure App Service to add app authentication with AAD to your web app running on Azure App Service. Alternatively, run ./scripts/setup_credential.ps1 or ./scripts/setup_credential.sh to set up basic auth with a username and password.