Chapter Navigation:
- 📚 Course Home: AZD For Beginners
- 📖 Current Chapter: Chapter 8 - Production & Enterprise Patterns
- ⬅️ Previous Chapter: Chapter 7: Troubleshooting
- ⬅️ Also Related: AI Workshop Lab
- 🎯 Course Complete: AZD For Beginners
This guide provides comprehensive best practices for deploying production-ready AI workloads using Azure Developer CLI (AZD). Based on feedback from the Microsoft Foundry Discord community and real-world customer deployments, these practices address the most common challenges in production AI systems.
Based on our community poll results, these are the top challenges developers face:
- 45% struggle with multi-service AI deployments
- 38% have issues with credential and secret management
- 35% find production readiness and scaling difficult
- 32% need better cost optimization strategies
- 29% require improved monitoring and troubleshooting
When to use: Complex AI applications with multiple capabilities
```mermaid
graph TD
    Frontend[Web Frontend] --- Gateway[API Gateway] --- LB[Load Balancer]
    Gateway --> Chat[Chat Service]
    Gateway --> Image[Image Service]
    Gateway --> Text[Text Service]
    Chat --> OpenAI[Microsoft Foundry Models]
    Image --> Vision[Computer Vision]
    Text --> DocIntel[Document Intelligence]
```
AZD Implementation:

```yaml
# azure.yaml
name: enterprise-ai-platform
services:
  web:
    project: ./web
    host: staticwebapp
  api-gateway:
    project: ./api-gateway
    host: containerapp
  chat-service:
    project: ./services/chat
    host: containerapp
  vision-service:
    project: ./services/vision
    host: containerapp
  text-service:
    project: ./services/text
    host: containerapp
```

When to use: Batch processing, document analysis, async workflows
```bicep
// Event Hub for AI processing pipeline
resource eventHub 'Microsoft.EventHub/namespaces@2023-01-01-preview' = {
  name: eventHubNamespaceName
  location: location
  sku: {
    name: 'Standard'
    tier: 'Standard'
    capacity: 1
  }
}

// Service Bus for reliable message processing
resource serviceBus 'Microsoft.ServiceBus/namespaces@2022-10-01-preview' = {
  name: serviceBusNamespaceName
  location: location
  sku: {
    name: 'Premium'
    tier: 'Premium'
    capacity: 1
  }
}

// Function App for processing
resource functionApp 'Microsoft.Web/sites@2023-01-01' = {
  name: functionAppName
  location: location
  kind: 'functionapp,linux'
  properties: {
    siteConfig: {
      appSettings: [
        {
          name: 'FUNCTIONS_EXTENSION_VERSION'
          value: '~4'
        }
        {
          name: 'AZURE_OPENAI_ENDPOINT'
          value: '@Microsoft.KeyVault(VaultName=${keyVault.name};SecretName=openai-endpoint)'
        }
      ]
    }
  }
}
```

When a traditional web app breaks, the symptoms are familiar: a page doesn't load, an API returns an error, or a deployment fails. AI-powered applications can break in all those same ways—but they can also misbehave in subtler ways that don't produce obvious error messages.
This section helps you build a mental model for monitoring AI workloads so you know where to look when things don't seem right.
A traditional app either works or it doesn't. An AI agent can appear to work but produce poor results. Think of agent health in two layers:
| Layer | What to Watch | Where to Look |
|---|---|---|
| Infrastructure health | Is the service running? Are resources provisioned? Are endpoints reachable? | azd monitor, Azure Portal resource health, container/app logs |
| Behavior health | Is the agent responding accurately? Are responses timely? Is the model being called correctly? | Application Insights traces, model call latency metrics, response quality logs |
Infrastructure health is familiar—it's the same for any azd app. Behavior health is the new layer that AI workloads introduce.
If your AI application isn't producing the results you expect, here's a conceptual checklist:
- Start with the basics. Is the app running? Can it reach its dependencies? Check `azd monitor` and resource health just as you would for any app.
- Check the model connection. Is your application successfully calling the AI model? Failed or timed-out model calls are the most common cause of AI app issues and will show up in your application logs.
- Look at what the model received. AI responses depend on the input (the prompt and any retrieved context). If the output is wrong, the input is usually wrong. Check whether your application is sending the right data to the model.
- Review response latency. AI model calls are slower than typical API calls. If your app feels sluggish, check whether model response times have increased—this can indicate throttling, capacity limits, or region-level congestion.
- Watch for cost signals. Unexpected spikes in token usage or API calls can indicate a loop, a misconfigured prompt, or excessive retries.
You don't need to master observability tooling right away. The key takeaway is that AI applications have an extra layer of behavior to monitor, and azd's built-in monitoring (azd monitor) gives you a starting point for investigating both layers.
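To make the behavior-health idea concrete, here is a minimal sketch of an in-process tracker that flags latency and token-usage spikes against a rolling baseline. The class name and thresholds are our own illustrative choices, not part of azd or Application Insights:

```python
from collections import deque


class BehaviorHealthTracker:
    """Flags AI model calls whose latency or token usage deviates
    sharply from the rolling average of recent calls."""

    def __init__(self, window: int = 50, spike_factor: float = 3.0):
        self.latencies = deque(maxlen=window)     # seconds per model call
        self.token_counts = deque(maxlen=window)  # tokens per model call
        self.spike_factor = spike_factor

    def record(self, latency_s: float, tokens: int) -> list:
        """Record one model call and return any warnings it triggers."""
        warnings = []
        if self.latencies:
            avg_latency = sum(self.latencies) / len(self.latencies)
            if latency_s > self.spike_factor * avg_latency:
                warnings.append(f"latency spike: {latency_s:.2f}s vs avg {avg_latency:.2f}s")
        if self.token_counts:
            avg_tokens = sum(self.token_counts) / len(self.token_counts)
            if tokens > self.spike_factor * avg_tokens:
                warnings.append(f"token spike: {tokens} vs avg {avg_tokens:.0f}")
        self.latencies.append(latency_s)
        self.token_counts.append(tokens)
        return warnings


tracker = BehaviorHealthTracker()
for _ in range(20):
    tracker.record(0.9, 600)       # steady baseline: no warnings
print(tracker.record(6.0, 5000))   # flags both a latency and a token spike
```

In a real service you would emit these warnings as custom metrics or traces to Application Insights rather than printing them.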
Implementation Strategy:
- No service-to-service communication without authentication
- All API calls use managed identities
- Network isolation with private endpoints
- Least privilege access controls
```bicep
// Managed identity for each service
resource chatServiceIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' = {
  name: 'chat-service-identity'
  location: location
}

// Cognitive Services OpenAI User built-in role
var openAIUserRoleDefinitionId = '5e0bd9bd-7b93-4f28-af87-19fc36ad61bd'

// Role assignment with minimal permissions
resource openAIUserRole 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  scope: openAIAccount
  name: guid(openAIAccount.id, chatServiceIdentity.id, openAIUserRoleDefinitionId)
  properties: {
    roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', openAIUserRoleDefinitionId)
    principalId: chatServiceIdentity.properties.principalId
    principalType: 'ServicePrincipal'
  }
}
```

Key Vault Integration Pattern:
```bicep
// Key Vault with RBAC authorization and purge protection
resource keyVault 'Microsoft.KeyVault/vaults@2023-02-01' = {
  name: keyVaultName
  location: location
  properties: {
    tenantId: tenant().tenantId
    sku: {
      family: 'A'
      name: 'premium' // Use premium for production
    }
    enableRbacAuthorization: true // Use RBAC instead of access policies
    enablePurgeProtection: true // Prevent accidental deletion
    enableSoftDelete: true
    softDeleteRetentionInDays: 90
  }
}

// Store all AI service credentials
resource openAIKeySecret 'Microsoft.KeyVault/vaults/secrets@2023-02-01' = {
  parent: keyVault
  name: 'openai-api-key'
  properties: {
    value: openAIAccount.listKeys().key1
    attributes: {
      enabled: true
    }
  }
}
```

Private Endpoint Configuration:
```bicep
// Virtual Network for AI services
resource virtualNetwork 'Microsoft.Network/virtualNetworks@2023-04-01' = {
  name: vnetName
  location: location
  properties: {
    addressSpace: {
      addressPrefixes: ['10.0.0.0/16']
    }
    subnets: [
      {
        name: 'ai-services-subnet'
        properties: {
          addressPrefix: '10.0.1.0/24'
          privateEndpointNetworkPolicies: 'Disabled'
        }
      }
      {
        name: 'app-services-subnet'
        properties: {
          addressPrefix: '10.0.2.0/24'
          delegations: [
            {
              name: 'Microsoft.Web/serverFarms'
              properties: {
                serviceName: 'Microsoft.Web/serverFarms'
              }
            }
          ]
        }
      }
    ]
  }
}

// Private endpoints for all AI services (OpenAI shown)
resource openAIPrivateEndpoint 'Microsoft.Network/privateEndpoints@2023-04-01' = {
  name: '${openAIAccountName}-pe'
  location: location
  properties: {
    subnet: {
      id: virtualNetwork.properties.subnets[0].id
    }
    privateLinkServiceConnections: [
      {
        name: 'openai-connection'
        properties: {
          privateLinkServiceId: openAIAccount.id
          groupIds: ['account']
        }
      }
    ]
  }
}
```

Container Apps Auto-scaling:
```bicep
resource containerApp 'Microsoft.App/containerApps@2023-05-01' = {
  name: containerAppName
  location: location
  properties: {
    configuration: {
      ingress: {
        external: true
        targetPort: 8000
        transport: 'http'
      }
    }
    template: {
      scale: {
        minReplicas: 2 // Always have 2 instances minimum
        maxReplicas: 50 // Scale up to 50 for high load
        rules: [
          {
            name: 'http-scaling'
            http: {
              metadata: {
                concurrentRequests: '20' // Scale when >20 concurrent requests
              }
            }
          }
          {
            name: 'cpu-scaling'
            custom: {
              type: 'cpu'
              metadata: {
                type: 'Utilization'
                value: '70' // Scale when CPU >70%
              }
            }
          }
        ]
      }
    }
  }
}
```

Redis Cache for AI Responses:
```bicep
// Redis Premium for production workloads
resource redisCache 'Microsoft.Cache/redis@2023-04-01' = {
  name: redisCacheName
  location: location
  properties: {
    sku: {
      name: 'Premium'
      family: 'P'
      capacity: 1
    }
    enableNonSslPort: false
    minimumTlsVersion: '1.2'
    redisConfiguration: {
      'maxmemory-policy': 'allkeys-lru'
    }
    // Enable clustering for high availability
    redisVersion: '6.0'
    shardCount: 2
  }
}

// Cache connection string consumed by the application
var cacheConnectionString = '${redisCache.properties.hostName}:6380,password=${redisCache.listKeys().primaryKey},ssl=True,abortConnect=False'
```

Application Gateway with WAF:
```bicep
// Application Gateway with Web Application Firewall
resource applicationGateway 'Microsoft.Network/applicationGateways@2023-04-01' = {
  name: appGatewayName
  location: location
  properties: {
    sku: {
      name: 'WAF_v2'
      tier: 'WAF_v2'
      capacity: 2
    }
    webApplicationFirewallConfiguration: {
      enabled: true
      firewallMode: 'Prevention'
      ruleSetType: 'OWASP'
      ruleSetVersion: '3.2'
    }
    // Backend pools for AI services
    backendAddressPools: [
      {
        name: 'ai-services-pool'
        properties: {
          backendAddresses: [
            {
              fqdn: containerApp.properties.configuration.ingress.fqdn
            }
          ]
        }
      }
    ]
  }
}
```

Environment-Specific Configurations:
```bash
# Development environment
azd env new development
azd env set AZURE_OPENAI_SKU "S0"
azd env set AZURE_OPENAI_CAPACITY 10
azd env set AZURE_SEARCH_SKU "basic"
azd env set CONTAINER_CPU 0.5
azd env set CONTAINER_MEMORY 1.0

# Production environment
azd env new production
azd env set AZURE_OPENAI_SKU "S0"
azd env set AZURE_OPENAI_CAPACITY 100
azd env set AZURE_SEARCH_SKU "standard"
azd env set CONTAINER_CPU 2.0
azd env set CONTAINER_MEMORY 4.0
```

Cost Management and Budgets:
```bicep
resource budget 'Microsoft.Consumption/budgets@2023-05-01' = {
  name: 'ai-workload-budget'
  properties: {
    timePeriod: {
      startDate: '2024-01-01'
      endDate: '2024-12-31'
    }
    timeGrain: 'Monthly'
    amount: 2000 // $2000 monthly budget
    category: 'Cost'
    notifications: {
      warning: {
        enabled: true
        operator: 'GreaterThan'
        threshold: 80
        contactEmails: [
          'finance@company.com'
          'engineering@company.com'
        ]
        contactRoles: [
          'Owner'
          'Contributor'
        ]
      }
      critical: {
        enabled: true
        operator: 'GreaterThan'
        threshold: 95
        contactEmails: [
          'cto@company.com'
        ]
      }
    }
  }
}
```

OpenAI Cost Management:
```typescript
// Application-level token optimization
class TokenOptimizer {
  private readonly maxTokens = 4000;
  private readonly reserveTokens = 500;

  optimizePrompt(userInput: string, context: string): string {
    const availableTokens = this.maxTokens - this.reserveTokens;
    const estimatedTokens = this.estimateTokens(userInput + context);
    if (estimatedTokens > availableTokens) {
      // Truncate context, not user input
      context = this.truncateContext(context, availableTokens - this.estimateTokens(userInput));
    }
    return `${context}\n\nUser: ${userInput}`;
  }

  private estimateTokens(text: string): number {
    // Rough estimation: 1 token ≈ 4 characters
    return Math.ceil(text.length / 4);
  }

  private truncateContext(context: string, tokenBudget: number): string {
    // Keep the most recent characters that fit the remaining token budget
    return context.slice(-Math.max(0, tokenBudget * 4));
  }
}
```

Application Insights with Advanced Features:
```bicep
resource applicationInsights 'Microsoft.Insights/components@2020-02-02' = {
  name: applicationInsightsName
  location: location
  kind: 'web'
  properties: {
    Application_Type: 'web'
    WorkspaceResourceId: logAnalyticsWorkspace.id
    SamplingPercentage: 100 // Full sampling for AI apps
    DisableIpMasking: false // Keep IP masking enabled for privacy
  }
}

// Custom metrics for AI operations
resource aiMetricAlerts 'Microsoft.Insights/metricAlerts@2018-03-01' = {
  name: 'ai-high-error-rate'
  location: 'global'
  properties: {
    description: 'Alert when AI service error rate is high'
    severity: 2
    enabled: true
    scopes: [
      applicationInsights.id
    ]
    evaluationFrequency: 'PT1M'
    windowSize: 'PT5M'
    criteria: {
      'odata.type': 'Microsoft.Azure.Monitor.SingleResourceMultipleMetricCriteria'
      allOf: [
        {
          name: 'high-error-rate'
          metricName: 'requests/failed'
          operator: 'GreaterThan'
          threshold: 10
          timeAggregation: 'Count'
        }
      ]
    }
  }
}
```

Custom Dashboards for AI Metrics:
```json
{
  "dashboard": {
    "name": "AI Application Monitoring",
    "tiles": [
      {
        "name": "OpenAI Request Volume",
        "query": "requests | where name contains 'openai' | summarize count() by bin(timestamp, 5m)"
      },
      {
        "name": "AI Response Latency",
        "query": "requests | where name contains 'openai' | summarize avg(duration) by bin(timestamp, 5m)"
      },
      {
        "name": "Token Usage",
        "query": "customMetrics | where name == 'openai_tokens_used' | summarize sum(value) by bin(timestamp, 1h)"
      },
      {
        "name": "Cost per Hour",
        "query": "customMetrics | where name == 'openai_cost' | summarize sum(value) by bin(timestamp, 1h)"
      }
    ]
  }
}
```

Application Insights Availability Tests:
```bicep
resource availabilityTest 'Microsoft.Insights/webtests@2022-06-15' = {
  name: 'ai-app-availability-test'
  location: location
  tags: {
    'hidden-link:${applicationInsights.id}': 'Resource'
  }
  properties: {
    SyntheticMonitorId: 'ai-app-availability-test'
    Name: 'AI Application Availability Test'
    Description: 'Tests AI application endpoints'
    Enabled: true
    Frequency: 300 // 5 minutes
    Timeout: 120 // 2 minutes
    Kind: 'ping'
    Locations: [
      {
        Id: 'us-east-2-azr'
      }
      {
        Id: 'us-west-2-azr'
      }
    ]
    Configuration: {
      WebTest: '''
<WebTest Name="AI Health Check"
         Id="8d2de8d2-a2b0-4c2e-9a0d-8f9c9a0b8c8d"
         Enabled="True"
         CssProjectStructure=""
         CssIteration=""
         Timeout="120"
         WorkItemIds=""
         xmlns="http://microsoft.com/schemas/VisualStudio/TeamTest/2010"
         Description=""
         CredentialUserName=""
         CredentialPassword=""
         PreAuthenticate="True"
         Proxy="default"
         StopOnError="False"
         RecordedResultFile=""
         ResultsLocale="">
  <Items>
    <Request Method="GET"
             Guid="a5f10126-e4cd-570d-961c-cea43999a200"
             Version="1.1"
             Url="${webApp.properties.defaultHostName}/health"
             ThinkTime="0"
             Timeout="120"
             ParseDependentRequests="True"
             FollowRedirects="True"
             RecordResult="True"
             Cache="False"
             ResponseTimeGoal="0"
             Encoding="utf-8"
             ExpectedHttpStatusCode="200"
             ExpectedResponseUrl=""
             ReportingName=""
             IgnoreHttpStatusCode="False" />
  </Items>
</WebTest>
'''
    }
  }
}
```

Multi-Region Configuration (azure.yaml):
```yaml
name: ai-app-multiregion
services:
  api-primary:
    project: ./api
    host: containerapp
    env:
      - AZURE_REGION=eastus
  api-secondary:
    project: ./api
    host: containerapp
    env:
      - AZURE_REGION=westus2
```

Traffic Manager for Global Load Balancing:
```bicep
resource trafficManager 'Microsoft.Network/trafficManagerProfiles@2022-04-01' = {
  name: trafficManagerProfileName
  location: 'global'
  properties: {
    profileStatus: 'Enabled'
    trafficRoutingMethod: 'Priority'
    dnsConfig: {
      relativeName: trafficManagerProfileName
      ttl: 30
    }
    monitorConfig: {
      protocol: 'HTTPS'
      port: 443
      path: '/health'
      intervalInSeconds: 30
      toleratedNumberOfFailures: 3
      timeoutInSeconds: 10
    }
    endpoints: [
      {
        name: 'primary-endpoint'
        type: 'Microsoft.Network/trafficManagerProfiles/azureEndpoints'
        properties: {
          targetResourceId: primaryAppService.id
          endpointStatus: 'Enabled'
          priority: 1
        }
      }
      {
        name: 'secondary-endpoint'
        type: 'Microsoft.Network/trafficManagerProfiles/azureEndpoints'
        properties: {
          targetResourceId: secondaryAppService.id
          endpointStatus: 'Enabled'
          priority: 2
        }
      }
    ]
  }
}
```

Backup Configuration for Critical Data:
```bicep
resource backupVault 'Microsoft.DataProtection/backupVaults@2023-05-01' = {
  name: backupVaultName
  location: location
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    storageSettings: [
      {
        datastoreType: 'VaultStore'
        type: 'LocallyRedundant'
      }
    ]
  }
}

// Backup policy for AI models and data
resource backupPolicy 'Microsoft.DataProtection/backupVaults/backupPolicies@2023-05-01' = {
  parent: backupVault
  name: 'ai-data-backup-policy'
  properties: {
    policyRules: [
      {
        backupParameters: {
          backupType: 'Full'
          objectType: 'AzureBackupParams'
        }
        trigger: {
          schedule: {
            repeatingTimeIntervals: [
              'R/2024-01-01T02:00:00+00:00/P1D' // Daily at 2 AM
            ]
          }
          objectType: 'ScheduleBasedTriggerContext'
        }
        dataStore: {
          datastoreType: 'VaultStore'
          objectType: 'DataStoreInfoBase'
        }
        name: 'BackupDaily'
        objectType: 'AzureBackupRule'
      }
    ]
  }
}
```

CI/CD Pipeline (`.github/workflows/deploy-ai-app.yml`):
```yaml
name: Deploy AI Application

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          pip install pytest
      - name: Run tests
        run: pytest tests/
      - name: AI Safety Tests
        run: |
          python scripts/test_ai_safety.py
          python scripts/validate_prompts.py

  deploy-staging:
    needs: test
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup AZD
        uses: Azure/setup-azd@v2
      - name: Login to Azure
        uses: azure/login@v1
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}
      - name: Deploy to Staging
        run: |
          azd env select staging
          azd deploy

  deploy-production:
    needs: test
    if: github.ref == 'refs/heads/main'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup AZD
        uses: Azure/setup-azd@v2
      - name: Login to Azure
        uses: azure/login@v1
        with:
          creds: ${{ secrets.AZURE_CREDENTIALS }}
      - name: Deploy to Production
        run: |
          azd env select production
          azd deploy
      - name: Run Production Health Checks
        run: |
          python scripts/health_check.py --env production
```

Infrastructure Validation (`scripts/validate_infrastructure.sh`):
```bash
#!/bin/bash
echo "Validating AI infrastructure deployment..."

# Check that each required service has at least one matching resource
# (az resource list exits 0 even when nothing matches, so test the output)
services=("openai" "search" "storage" "keyvault")
for service in "${services[@]}"; do
  echo "Checking $service..."
  found=$(az resource list --query "[?contains(name, '$service')].name" -o tsv)
  if [[ -z "$found" ]]; then
    echo "ERROR: $service not found"
    exit 1
  fi
done

# Validate OpenAI model deployments
echo "Validating OpenAI model deployments..."
models=$(az cognitiveservices account deployment list --name $AZURE_OPENAI_NAME --resource-group $AZURE_RESOURCE_GROUP --query "[].name" -o tsv)
if [[ ! $models == *"gpt-4.1-mini"* ]]; then
  echo "ERROR: Required model gpt-4.1-mini not deployed"
  exit 1
fi

# Test AI service connectivity
echo "Testing AI service connectivity..."
python scripts/test_connectivity.py

echo "Infrastructure validation completed successfully!"
```

Security Checklist:
- All services use managed identities
- Secrets stored in Key Vault
- Private endpoints configured
- Network security groups implemented
- RBAC with least privilege
- WAF enabled on public endpoints
Performance Checklist:
- Auto-scaling configured
- Caching implemented
- Load balancing setup
- CDN for static content
- Database connection pooling
- Token usage optimization
Monitoring Checklist:
- Application Insights configured
- Custom metrics defined
- Alerting rules setup
- Dashboard created
- Health checks implemented
- Log retention policies
Reliability Checklist:
- Multi-region deployment
- Backup and recovery plan
- Circuit breakers implemented
- Retry policies configured
- Graceful degradation
- Health check endpoints
Cost Optimization Checklist:
- Budget alerts configured
- Resource right-sizing
- Dev/test discounts applied
- Reserved instances purchased
- Cost monitoring dashboard
- Regular cost reviews
Compliance Checklist:
- Data residency requirements met
- Audit logging enabled
- Compliance policies applied
- Security baselines implemented
- Regular security assessments
- Incident response plan
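Two of the reliability items above, retry policies and circuit breakers, share a well-known shape. The sketch below is a simplified illustration (class names and thresholds are our own); in production you would more likely use a library such as tenacity or the retry support built into your HTTP or Azure SDK client:

```python
import time


class CircuitOpenError(Exception):
    """Raised when the circuit is open and calls are failing fast."""


class CircuitBreaker:
    """Retries transient failures with exponential backoff, and stops
    calling the dependency entirely after too many consecutive errors."""

    def __init__(self, max_failures=5, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None  # monotonic timestamp when the circuit opened

    def call(self, fn, *args, retries=3, base_delay=0.1, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Half-open: let one attempt through to probe recovery
            self.opened_at = None
            self.failures = 0
        last_exc = None
        for attempt in range(retries):
            try:
                result = fn(*args, **kwargs)
                self.failures = 0  # success closes the circuit fully
                return result
            except Exception as exc:  # real code should catch transient errors only
                last_exc = exc
                self.failures += 1
                if self.failures >= self.max_failures:
                    self.opened_at = time.monotonic()
                    raise CircuitOpenError("too many consecutive failures") from exc
                time.sleep(base_delay * (2 ** attempt))  # exponential backoff
        raise last_exc
```

With `base_delay=0.1`, the waits between the three attempts are roughly 0.1s and 0.2s; production implementations usually add jitter so that many clients do not retry in lockstep.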
| Metric | Target | Monitoring |
|---|---|---|
| Response Time | < 2 seconds | Application Insights |
| Availability | 99.9% | Uptime monitoring |
| Error Rate | < 0.1% | Application logs |
| Token Usage | < $500/month | Cost management |
| Concurrent Users | 1000+ | Load testing |
| Recovery Time | < 1 hour | Disaster recovery tests |
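These targets can also be checked automatically from collected telemetry. Below is a hedged sketch of the logic a health-check or load-test report step might run; we interpret the response-time target as a 95th-percentile check, which is an assumption rather than something the table specifies:

```python
def percentile(values, pct):
    """Nearest-rank percentile; good enough for SLO spot checks."""
    ordered = sorted(values)
    k = max(0, min(len(ordered) - 1, round(pct / 100 * len(ordered)) - 1))
    return ordered[k]


def check_slos(latencies_s, failed, total):
    """Evaluate the response-time and error-rate targets from the table."""
    return {
        "response_time_p95_under_2s": percentile(latencies_s, 95) < 2.0,
        "error_rate_under_0.1pct": (failed / total) < 0.001 if total else False,
    }


# 100 requests: mostly fast, one slow outlier, no failures
print(check_slos([0.4] * 99 + [3.0], failed=0, total=100))
```

A single slow outlier does not fail the p95 check, which is exactly why percentile targets are more robust than averages for user-facing latency.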
```bash
# Load testing script for AI applications
python scripts/load_test.py \
  --endpoint https://your-ai-app.azurewebsites.net \
  --concurrent-users 100 \
  --duration 300 \
  --ramp-up 60
```

Based on Microsoft Foundry Discord community feedback:
- Start Small, Scale Gradually: Begin with basic SKUs and scale up based on actual usage
- Monitor Everything: Set up comprehensive monitoring from day one
- Automate Security: Use infrastructure as code for consistent security
- Test Thoroughly: Include AI-specific testing in your pipeline
- Plan for Costs: Monitor token usage and set budget alerts early
- ❌ Hardcoding API keys in code
- ❌ Not setting up proper monitoring
- ❌ Ignoring cost optimization
- ❌ Not testing failure scenarios
- ❌ Deploying without health checks
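The first pitfall, hardcoded API keys, is typically avoided by reading settings from the environment (populated at deploy time via Key Vault references or managed identity, as shown earlier) and failing fast at startup when anything is missing. A minimal sketch; the variable names match the earlier examples, but the class itself is illustrative:

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class AIConfig:
    openai_endpoint: str
    search_endpoint: str

    @classmethod
    def from_env(cls):
        """Fail at startup, not mid-request, if configuration is missing."""
        required = ("AZURE_OPENAI_ENDPOINT", "AZURE_SEARCH_ENDPOINT")
        missing = [name for name in required if not os.environ.get(name)]
        if missing:
            raise RuntimeError(f"missing required settings: {', '.join(missing)}")
        return cls(
            openai_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
            search_endpoint=os.environ["AZURE_SEARCH_ENDPOINT"],
        )
```

No secret values ever appear in source control, and rotating a key becomes a Key Vault operation rather than a code change.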
AZD includes a growing set of AI-specific commands and extensions that streamline production AI workflows. These tools bridge the gap between local development and production deployment for AI workloads.
AZD uses an extension system to add AI-specific capabilities. Install and manage extensions with:
```bash
# List all available extensions (including AI)
azd extension list

# Inspect installed extension details
azd extension show azure.ai.agents

# Install the Foundry agents extension
azd extension install azure.ai.agents

# Install the fine-tuning extension
azd extension install azure.ai.finetune

# Install the custom models extension
azd extension install azure.ai.models

# Upgrade all installed extensions
azd extension upgrade --all
```

Available AI extensions:
| Extension | Purpose | Status |
|---|---|---|
| `azure.ai.agents` | Foundry Agent Service management | Preview |
| `azure.ai.finetune` | Foundry model fine-tuning | Preview |
| `azure.ai.models` | Foundry custom models | Preview |
| `azure.coding-agent` | Coding agent configuration | Available |
The azd ai agent init command scaffolds a production-ready AI agent project integrated with Microsoft Foundry Agent Service:
```bash
# Initialize a new agent project from an agent manifest
azd ai agent init -m <manifest-path-or-uri>

# Initialize and target a specific Foundry project
azd ai agent init -m agent-manifest.yaml --project-id <foundry-project-id>

# Initialize with a custom source directory
azd ai agent init -m agent-manifest.yaml --src ./agents/my-agent

# Target Container Apps as the host
azd ai agent init -m agent-manifest.yaml --host containerapp
```

Key flags:
| Flag | Description |
|---|---|
| `-m, --manifest` | Path or URI to an agent manifest to add to your project |
| `-p, --project-id` | Existing Microsoft Foundry Project ID for your azd environment |
| `-s, --src` | Directory to download the agent definition (defaults to `src/<agent-id>`) |
| `--host` | Override the default host (e.g., `containerapp`) |
| `-e, --environment` | The azd environment to use |
Production tip: Use `--project-id` to connect directly to an existing Foundry project, keeping your agent code and cloud resources linked from the start.
AZD includes built-in MCP server support (Alpha), enabling AI agents and tools to interact with your Azure resources through a standardized protocol:
```bash
# Start the MCP server for your project
azd mcp start

# Review current Copilot consent rules for tool execution
azd copilot consent list
```

The MCP server exposes your azd project context—environments, services, and Azure resources—to AI-powered development tools. This enables:
- AI-assisted deployment: Let coding agents query your project state and trigger deployments
- Resource discovery: AI tools can discover what Azure resources your project uses
- Environment management: Agents can switch between dev/staging/production environments
For production AI workloads, you can generate and customize Infrastructure as Code rather than relying on automatic provisioning:
```bash
# Generate Bicep/Terraform files from your project definition
azd infra generate
```

This writes IaC to disk so you can:
- Review and audit infrastructure before deploying
- Add custom security policies (network rules, private endpoints)
- Integrate with existing IaC review processes
- Version control infrastructure changes separately from application code
AZD hooks let you inject custom logic at every stage of the deployment lifecycle—critical for production AI workflows:
```yaml
# azure.yaml - Production hooks example
name: ai-production-app
hooks:
  preprovision:
    shell: sh
    run: scripts/validate-quotas.sh # Check AI model quota before provisioning
  postprovision:
    shell: sh
    run: scripts/configure-networking.sh # Set up private endpoints
  predeploy:
    shell: sh
    run: scripts/run-ai-safety-tests.sh # Run prompt safety checks
  postdeploy:
    shell: sh
    run: scripts/smoke-test.sh # Verify agent responses post-deploy
services:
  agent-api:
    project: ./src/agent
    host: containerapp
    hooks:
      predeploy:
        shell: sh
        run: scripts/validate-model-access.sh # Per-service hook
```

```bash
# Run a specific hook manually during development
azd hooks run predeploy
```

Recommended production hooks for AI workloads:
| Hook | Use Case |
|---|---|
| `preprovision` | Validate subscription quotas for AI model capacity |
| `postprovision` | Configure private endpoints, deploy model weights |
| `predeploy` | Run AI safety tests, validate prompt templates |
| `postdeploy` | Smoke test agent responses, verify model connectivity |
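As an illustration of what a predeploy hook like the prompt-validation step above might actually do, here is a hedged sketch: it verifies that each prompt template contains its required placeholders and flags anything that looks like an embedded secret. The checks and names are illustrative assumptions, not a prescribed format for the scripts referenced earlier:

```python
import re

# Crude heuristic for values that should never live in a template
SECRET_PATTERN = re.compile(r"(api[_-]?key|password|secret)\s*[:=]\s*\S+", re.IGNORECASE)


def validate_template(template, required_placeholders):
    """Return a list of problems found in one prompt template (empty = OK)."""
    problems = []
    for placeholder in required_placeholders:
        if f"{{{placeholder}}}" not in template:
            problems.append(f"missing placeholder: {{{placeholder}}}")
    if SECRET_PATTERN.search(template):
        problems.append("possible hardcoded secret in template")
    return problems


# A well-formed template passes; a sloppy one is flagged
good = "You are a support agent. Context: {context}\nUser: {user_input}"
bad = "api_key=abc123 Answer the user: {user_input}"
print(validate_template(good, ["context", "user_input"]))  # no problems
print(validate_template(bad, ["context", "user_input"]))   # two problems
```

A real hook would walk the template directory, run these checks on every file, and exit non-zero so `azd` aborts the deploy on failure.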
Use azd pipeline config to connect your project to GitHub Actions or Azure Pipelines with secure Azure authentication:
```bash
# Configure CI/CD pipeline (interactive)
azd pipeline config

# Configure with a specific provider
azd pipeline config --provider github
```

This command:
- Creates a service principal with least-privilege access
- Configures federated credentials (no stored secrets)
- Generates or updates your pipeline definition file
- Sets required environment variables in your CI/CD system
Production workflow with pipeline config:
```bash
# 1. Set up production environment
azd env new production
azd env set AZURE_OPENAI_CAPACITY 100

# 2. Configure the pipeline
azd pipeline config --provider github

# 3. Pipeline runs azd deploy on every push to main
```

Incrementally add Azure services to an existing project:
```bash
# Add a new service component interactively
azd add
```

This is particularly useful for expanding production AI applications—for example, adding a vector search service, a new agent endpoint, or a monitoring component to an existing deployment.
- Azure Well-Architected Framework: AI workload guidance
- Microsoft Foundry Documentation: Official docs
- Community Templates: Azure Samples
- Discord Community: #Azure channel
- Agent Skills for Azure: microsoft/github-copilot-for-azure on skills.sh - 37 open agent skills for Azure AI, Foundry, deployment, cost optimization, and diagnostics. Install in your editor:
```bash
npx skills add microsoft/github-copilot-for-azure
```
Chapter Navigation:
- 📚 Course Home: AZD For Beginners
- 📖 Current Chapter: Chapter 8 - Production & Enterprise Patterns
- ⬅️ Previous Chapter: Chapter 7: Troubleshooting
- ⬅️ Also Related: AI Workshop Lab
- 🎯 Course Complete: AZD For Beginners
Remember: Production AI workloads require careful planning, monitoring, and continuous optimization. Start with these patterns and adapt them to your specific requirements.