Microsoft Foundry Integration with AZD

Overview

This guide demonstrates how to integrate Microsoft Foundry services with Azure Developer CLI (AZD) for streamlined AI application deployments. Microsoft Foundry provides a comprehensive platform for building, deploying, and managing AI applications, while AZD simplifies the infrastructure and deployment process.

What is Microsoft Foundry?

Microsoft Foundry is Microsoft's unified platform for AI development that includes:

  • Model Catalog: Access to state-of-the-art AI models
  • Prompt Flow: Visual designer for AI workflows
  • Microsoft Foundry Portal: Integrated development environment for AI applications
  • Deployment Options: Multiple hosting and scaling options
  • Safety and Security: Built-in responsible AI features

AZD + Microsoft Foundry: Better Together

| Feature | Microsoft Foundry | AZD Integration Benefit |
|---|---|---|
| Model Deployment | Manual portal deployment | Automated, repeatable deployments |
| Infrastructure | Click-through provisioning | Infrastructure as Code (Bicep) |
| Environment Management | Single environment focus | Multi-environment (dev/staging/prod) |
| CI/CD Integration | Limited | Native GitHub Actions support |
| Cost Management | Basic monitoring | Environment-specific cost optimization |

Prerequisites

  • Azure subscription with appropriate permissions
  • Azure Developer CLI installed
  • Access to Microsoft Foundry Models services
  • Basic familiarity with Microsoft Foundry

Current AZD baseline: These examples were reviewed against azd 1.23.12. For the AI agent workflow, use the current preview extension release and check your installed version before you begin.

Core Integration Patterns

Pattern 1: Microsoft Foundry Models Integration

Use Case: Deploy chat applications with Microsoft Foundry Models

# azure.yaml
name: ai-chat-app
services:
  api:
    project: ./api
    host: containerapp
    env:
      AZURE_OPENAI_ENDPOINT: ${AZURE_OPENAI_ENDPOINT}
      AZURE_OPENAI_API_KEY: ${AZURE_OPENAI_API_KEY}

Infrastructure (main.bicep):

// Microsoft Foundry Models Account
resource openAIAccount 'Microsoft.CognitiveServices/accounts@2023-05-01' = {
  name: openAIAccountName
  location: location
  kind: 'OpenAI'
  sku: {
    name: 'S0'
  }
  properties: {
    customSubDomainName: openAIAccountName
    disableLocalAuth: false
  }
}

// Deploy GPT model
resource gptDeployment 'Microsoft.CognitiveServices/accounts/deployments@2023-05-01' = {
  parent: openAIAccount
  name: 'gpt-4.1-mini'
  // From API version 2023-05-01, capacity is set through sku; scaleSettings is deprecated
  sku: {
    name: 'Standard'
    capacity: 30 // in units of 1,000 tokens per minute
  }
  properties: {
    model: {
      format: 'OpenAI'
      name: 'gpt-4.1-mini'
      version: '2025-04-14'
    }
  }
}

Pattern 2: AI Search + RAG Integration

Use Case: Deploy retrieval-augmented generation (RAG) applications

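// Azure AI Search
resource searchService 'Microsoft.Search/searchServices@2023-11-01' = {
  name: searchServiceName
  location: location
  sku: {
    name: 'basic'
  }
  // System-assigned identity lets the search service call the model endpoint without keys
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    replicaCount: 1
    partitionCount: 1
    hostingMode: 'default'
  }
}

// Connect Search with OpenAI: there is no dedicated connection resource type,
// so grant the search identity the "Cognitive Services OpenAI User" role instead
resource searchOpenAIRole 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  scope: openAIAccount
  name: guid(openAIAccount.id, searchService.id, 'Cognitive Services OpenAI User')
  properties: {
    roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', '5e0bd9bd-7b93-4f28-af87-19fc36ad61bd')
    principalId: searchService.identity.principalId
    principalType: 'ServicePrincipal'
  }
}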

Pattern 3: Document Intelligence Integration

Use Case: Document processing and analysis workflows

// Document Intelligence service
resource documentIntelligence 'Microsoft.CognitiveServices/accounts@2023-05-01' = {
  name: documentIntelligenceName
  location: location
  kind: 'FormRecognizer'
  sku: {
    name: 'S0'
  }
  properties: {
    customSubDomainName: documentIntelligenceName
  }
}

// Storage for document processing
resource storageAccount 'Microsoft.Storage/storageAccounts@2023-01-01' = {
  name: storageAccountName
  location: location
  sku: {
    name: 'Standard_LRS'
  }
  kind: 'StorageV2'
  properties: {
    accessTier: 'Hot'
    allowBlobPublicAccess: false
  }
}

🔧 Configuration Patterns

Environment Variables Setup

Production Configuration:

# Core AI services
azd env set AZURE_OPENAI_ENDPOINT "https://your-openai.openai.azure.com/"
azd env set AZURE_SEARCH_ENDPOINT "https://your-search.search.windows.net"
azd env set AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT "https://your-formrec.cognitiveservices.azure.com/"

# Model configurations
azd env set AZURE_OPENAI_MODEL "gpt-4.1-mini"
azd env set AZURE_OPENAI_EMBEDDING_MODEL "text-embedding-3-large"

# Performance settings
azd env set AZURE_OPENAI_CAPACITY 30
azd env set AZURE_SEARCH_SKU "standard"

Development Configuration:

# Cost-optimized settings for development
azd env set AZURE_OPENAI_CAPACITY 10
azd env set AZURE_SEARCH_SKU "basic"
azd env set AZURE_DOCUMENT_INTELLIGENCE_SKU "F0"  # Free tier
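
The production and development values above can be kept in one place so that switching profiles is a single call. The following is a minimal sketch, not an azd feature: the function names ai_env_settings and apply_ai_env_settings are hypothetical, and the values simply mirror the two examples above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print KEY=VALUE settings for an environment type.
# The values mirror the production and development examples in this guide.
ai_env_settings() {
  case "$1" in
    prod)
      echo "AZURE_OPENAI_CAPACITY=30"
      echo "AZURE_SEARCH_SKU=standard"
      ;;
    dev)
      echo "AZURE_OPENAI_CAPACITY=10"
      echo "AZURE_SEARCH_SKU=basic"
      echo "AZURE_DOCUMENT_INTELLIGENCE_SKU=F0"
      ;;
    *)
      echo "usage: ai_env_settings dev|prod" >&2
      return 1
      ;;
  esac
}

# Hypothetical helper: write the settings into the currently selected azd environment.
apply_ai_env_settings() {
  local settings key value
  settings=$(ai_env_settings "$1") || return 1
  while IFS='=' read -r key value; do
    azd env set "$key" "$value"
  done <<<"$settings"
}

# Usage (after `azd env select development`):
#   apply_ai_env_settings dev
```

Keeping the values in a function rather than in scattered commands makes it easier to review what differs between environments.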

Secure Configuration with Key Vault

// Key Vault for secrets
resource keyVault 'Microsoft.KeyVault/vaults@2023-02-01' = {
  name: keyVaultName
  location: location
  properties: {
    tenantId: tenant().tenantId
    sku: {
      family: 'A'
      name: 'standard'
    }
    accessPolicies: [
      {
        tenantId: tenant().tenantId
        objectId: webAppIdentity.properties.principalId
        permissions: {
          secrets: ['get']
        }
      }
    ]
  }
}

// Store OpenAI key securely
resource openAIKeySecret 'Microsoft.KeyVault/vaults/secrets@2023-02-01' = {
  parent: keyVault
  name: 'openai-api-key'
  properties: {
    value: openAIAccount.listKeys().key1
  }
}

Deployment Workflows

AZD Extensions for Foundry

AZD provides extensions that add AI-specific capabilities for working with Microsoft Foundry services:

# Install the Foundry agents extension
azd extension install azure.ai.agents

# Install the fine-tuning extension
azd extension install azure.ai.finetune

# Install the custom models extension
azd extension install azure.ai.models

# List installed extensions
azd extension list --installed

# Inspect the currently installed agent extension version
azd extension show azure.ai.agents

The AI extensions are still moving quickly in preview. If a command behaves differently than shown here, upgrade the relevant extension before troubleshooting the project itself.

Agent-First Deployment with azd ai

If you have an agent manifest, use azd ai agent init to scaffold a project wired to Foundry Agent Service:

# Initialize from an agent manifest
azd ai agent init -m agent-manifest.yaml --project-id <foundry-project-id>

# Deploy to Azure
azd up

Recent preview releases of azure.ai.agents also added template-based initialization support for azd ai agent init. If you are following newer agent samples, check the extension help for the exact flags available in your installed version.

See AZD AI CLI Commands for the full command reference and flags.

Single Command Deployment

# Deploy everything with one command
azd up

# Or deploy incrementally
azd provision  # Infrastructure only
azd deploy     # Application only

# For long-running AI app deployments in azd 1.23.11+
azd deploy --timeout 1800

Environment-Specific Deployments

# Development environment
azd env new development
azd env set AZURE_LOCATION eastus
azd env set ENVIRONMENT_TYPE dev
azd up

# Production environment
azd env new production
azd env set AZURE_LOCATION westus2
azd env set ENVIRONMENT_TYPE prod
azd env set AZURE_OPENAI_CAPACITY 100
azd up
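
Because each azd environment stores its own values, a setting that was never added to a new environment may only surface partway through azd up. As a minimal sketch, the check below reads the KEY="value" lines printed by azd env get-values and lists any required keys that are absent. The function name missing_azd_keys is hypothetical, and which keys count as required is an assumption you should adapt to your template.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read `azd env get-values` output on stdin and print
# every key from the argument list that has no KEY=... line.
# Returns 1 if at least one key is missing, 0 otherwise.
missing_azd_keys() {
  local values key missing=0
  values=$(cat)
  for key in "$@"; do
    if ! grep -q "^${key}=" <<<"$values"; then
      echo "$key"
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   azd env get-values | missing_azd_keys AZURE_LOCATION ENVIRONMENT_TYPE AZURE_OPENAI_CAPACITY \
#     || echo "Set the keys listed above before running azd up"
```

The check only detects absent keys; it does not validate that a value is sensible for the target region or SKU.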

Monitoring and Observability

Application Insights Integration

// Application Insights for AI application monitoring
resource applicationInsights 'Microsoft.Insights/components@2020-02-02' = {
  name: applicationInsightsName
  location: location
  kind: 'web'
  properties: {
    Application_Type: 'web'
    WorkspaceResourceId: logAnalyticsWorkspace.id
  }
}

// Custom metrics for AI operations
resource customMetrics 'Microsoft.Insights/components/analyticsItems@2015-05-01' = {
  parent: applicationInsights
  name: 'AI-Metrics'
  properties: {
    name: 'AI Operations Metrics'
    content: '''
      requests
      | where name contains "openai"
      | summarize 
          RequestCount = count(),
          AvgDuration = avg(duration),
          SuccessRate = countif(success == true) * 100.0 / count()
      by bin(timestamp, 5m)
    '''
  }
}

Cost Monitoring

// Budget alert for AI services
resource budget 'Microsoft.Consumption/budgets@2023-05-01' = {
  name: 'ai-services-budget'
  properties: {
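    timePeriod: {
      // For a Monthly budget, startDate must be the first day of a month;
      // update both dates to cover your current billing period
      startDate: '2024-01-01'
      endDate: '2024-12-31'
    }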
    timeGrain: 'Monthly'
    amount: 500
    category: 'Cost'
    notifications: {
      notification1: {
        enabled: true
        operator: 'GreaterThan'
        threshold: 80
        contactEmails: [
          'admin@company.com'
        ]
      }
    }
  }
}
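
The threshold in a budget notification is a percentage of amount, so the example above (amount 500, threshold 80) alerts once spend passes 400 in the billing currency. A small sketch of that arithmetic, using a hypothetical helper name, can be handy when choosing thresholds for several environments:

```shell
#!/usr/bin/env bash
# Hypothetical helper: budget_alert_amount BUDGET PERCENT
# Prints the spend level at which a notification with that threshold fires.
budget_alert_amount() {
  awk -v amount="$1" -v pct="$2" 'BEGIN { printf "%.2f\n", amount * pct / 100 }'
}

# Usage:
#   budget_alert_amount 500 80    # prints 400.00
```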

🔐 Security Best Practices

Managed Identity Configuration

// Managed identity for the web application
resource webAppIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' = {
  name: '${appName}-identity'
  location: location
}

// Assign OpenAI User role
resource openAIRoleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  scope: openAIAccount
  name: guid(openAIAccount.id, webAppIdentity.id, 'Cognitive Services OpenAI User')
  properties: {
    roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', '5e0bd9bd-7b93-4f28-af87-19fc36ad61bd')
    principalId: webAppIdentity.properties.principalId
    principalType: 'ServicePrincipal'
  }
}

Network Security

// Private endpoints for AI services
resource openAIPrivateEndpoint 'Microsoft.Network/privateEndpoints@2023-04-01' = {
  name: '${openAIAccountName}-pe'
  location: location
  properties: {
    subnet: {
      id: virtualNetwork.properties.subnets[0].id
    }
    privateLinkServiceConnections: [
      {
        name: 'openai-connection'
        properties: {
          privateLinkServiceId: openAIAccount.id
          groupIds: ['account']
        }
      }
    ]
  }
}

Performance Optimization

Caching Strategies

# azure.yaml - Redis cache integration
services:
  api:
    project: ./api
    host: containerapp
    env:
      REDIS_CONNECTION_STRING: ${REDIS_CONNECTION_STRING}
      CACHE_TTL: "3600"

// Redis cache for AI responses
resource redisCache 'Microsoft.Cache/redis@2023-04-01' = {
  name: redisCacheName
  location: location
  properties: {
    sku: {
      name: 'Basic'
      family: 'C'
      capacity: 1
    }
    enableNonSslPort: false
    minimumTlsVersion: '1.2'
  }
}
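
A response cache only pays off when identical requests map to the same key. The sketch below derives a deterministic key from the model name and the prompt text. The function name cache_key is hypothetical, and hashing exactly these two fields is an assumption; an application that also varies temperature or system prompts would need to include those in the key.

```shell
#!/usr/bin/env bash
# Hypothetical helper: cache_key MODEL PROMPT
# Prints a SHA-256 hex digest of the model name and prompt, suitable as a cache key.
cache_key() {
  local model=$1 prompt=$2 hasher
  # sha256sum is common on Linux; shasum ships with macOS
  if command -v sha256sum >/dev/null 2>&1; then
    hasher="sha256sum"
  else
    hasher="shasum -a 256"
  fi
  # The newline separator keeps ("ab", "c") distinct from ("a", "bc")
  printf '%s\n%s' "$model" "$prompt" | $hasher | cut -d' ' -f1
}

# Usage:
#   key=$(cache_key "gpt-4.1-mini" "What is our return policy?")
#   # store the response under "$key" with the CACHE_TTL configured above
```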

Auto-scaling Configuration

// Container App with auto-scaling
resource containerApp 'Microsoft.App/containerApps@2023-05-01' = {
  name: containerAppName
  location: location
  properties: {
    configuration: {
      ingress: {
        external: true
        targetPort: 8000
      }
    }
    template: {
      scale: {
        minReplicas: 1
        maxReplicas: 10
        rules: [
          {
            name: 'http-scaling'
            http: {
              metadata: {
                concurrentRequests: '30'
              }
            }
          }
        ]
      }
    }
  }
}

Troubleshooting Common Issues

Issue 1: OpenAI Quota Exceeded

Symptoms:

  • Deployment fails with quota errors
  • 429 errors in application logs

Solutions:

# Check current quota usage
az cognitiveservices usage list --location eastus

# Try different region
azd env set AZURE_LOCATION westus2
azd up

# Reduce capacity temporarily
azd env set AZURE_OPENAI_CAPACITY 10
azd deploy
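
A 429 response is a rate-limit signal and is often transient, so smoke tests and scripts that call the deployed app usually fare better with retries than with a single attempt. The following is a minimal sketch of retrying a command with exponential backoff. The function name retry_with_backoff is hypothetical, and the URL in the usage comment is a placeholder for your own endpoint.

```shell
#!/usr/bin/env bash
# Hypothetical helper: retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached, doubling the delay each time.
retry_with_backoff() {
  local max_attempts=$1 delay=$2 attempt=1
  shift 2
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Usage (curl --fail exits non-zero on HTTP errors such as 429):
#   retry_with_backoff 5 2 curl --fail --silent "https://YOUR_APP_URL/"
```

Retries smooth over short bursts only; if 429s persist, the capacity and region steps above are the real fix.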

Issue 2: Authentication Failures

Symptoms:

  • 401/403 errors when calling AI services
  • "Access denied" messages

Solutions:

# Verify role assignments
az role assignment list --scope /subscriptions/YOUR_SUB/resourceGroups/YOUR_RG

# Check managed identity configuration (the app in this guide is hosted on Container Apps)
az containerapp identity show --name YOUR_APP --resource-group YOUR_RG

# Validate Key Vault access
az keyvault secret show --vault-name YOUR_KV --name openai-api-key

Issue 3: Model Deployment Issues

Symptoms:

  • Models not available in deployment
  • Specific model versions failing

Solutions:

# List available models by region
az cognitiveservices model list --location eastus

# Update model version in bicep template
# Check model capacity requirements

Example Templates

RAG Chat Application (Python)

Repository: azure-search-openai-demo

Services: Azure OpenAI + Azure AI Search + Azure Container Apps + Azure Blob Storage

Description: The most popular Azure AI sample — a production-ready RAG chat app that lets you ask questions over your own documents. Uses GPT-4.1-mini for chat, text-embedding-3-large for embeddings, and Azure AI Search for retrieval. Supports multimodal documents, speech input/output, Microsoft Entra authentication, and Application Insights tracing.

Quick Start:

azd init --template azure-search-openai-demo
azd up

RAG Chat Application (.NET)

Repository: azure-search-openai-demo-csharp

Services: Azure OpenAI + Azure AI Search + Azure Container Apps + Semantic Kernel

Description: The .NET/C# equivalent of the Python RAG chat sample. Built with ASP.NET Core Minimal API and Blazor WebAssembly frontend. Includes voice chat, GPT-4o-mini vision support, and a companion .NET MAUI Blazor Hybrid desktop/mobile client.

Quick Start:

azd init --template azure-search-openai-demo-csharp
azd up

RAG Chat Application (Java)

Repository: azure-search-openai-demo-java

Services: Azure OpenAI + Azure AI Search + Azure Container Apps / AKS + Langchain4J + Azure Cosmos DB

Description: Java version of the RAG chat sample using Langchain4J for AI orchestration. Supports microservice event-driven architecture, multiple search strategies (text, vector, hybrid), document upload with Azure Document Intelligence, and deployment on either Azure Container Apps or Azure Kubernetes Service.

Quick Start:

azd init --template azure-search-openai-demo-java
azd up

Enterprise Retail Copilot with Azure AI Foundry

Repository: contoso-chat

Services: Azure OpenAI + Azure AI Foundry + Prompty + Azure AI Search + Azure Container Apps + Azure Cosmos DB

Description: End-to-end retail RAG copilot using Azure AI Foundry and Prompty. A Contoso Outdoor retailer chatbot that grounds responses in product catalog and customer order data. Demonstrates the full GenAIOps workflow — prototype with Prompty, evaluate with AI-assisted evaluators, and deploy via AZD to Container Apps.

Quick Start:

azd init --template contoso-chat
azd up

Creative Writing Multi-Agent Application

Repository: contoso-creative-writer

Services: Azure OpenAI + Azure AI Agent Service + Bing Grounding + Azure AI Search + Azure Container Apps

Description: Multi-agent sample demonstrating AI agent orchestration with Prompty. Uses a research agent (Bing Grounding in Azure AI Agent Service), a product agent (Azure AI Search), a writer agent, and an editor agent to collaboratively produce well-researched articles. Includes CI/CD with evaluation in GitHub Actions.

Quick Start:

azd init --template contoso-creative-writer
azd up

Serverless RAG Chat (JavaScript/TypeScript)

Repository: serverless-chat-langchainjs

Services: Azure OpenAI + Azure Functions + Azure Static Web Apps + Azure Cosmos DB for NoSQL + LangChain.js

Description: Fully serverless RAG chatbot using LangChain.js with Azure Functions for the API and Azure Static Web Apps for hosting. Uses Azure Cosmos DB as both vector store and chat history database. Supports local development with Ollama for zero-cost testing.

Quick Start:

azd init --template serverless-chat-langchainjs
azd up

Chat with Your Data Solution Accelerator

Repository: chat-with-your-data-solution-accelerator

Services: Azure OpenAI + Azure AI Search + Azure App Service + Azure Document Intelligence + Azure Functions + Azure Cosmos DB / PostgreSQL

Description: Enterprise-grade RAG solution accelerator with admin portal for document upload/management, multiple orchestrator options (Semantic Kernel, LangChain, Prompt Flow), speech-to-text, Microsoft Teams integration, and choice of PostgreSQL or Cosmos DB backend. Designed as a customizable starting point for production RAG scenarios.

Quick Start:

azd init --template chat-with-your-data-solution-accelerator
azd up

AI Travel Agents — Multi-Agent MCP Orchestration

Repository: azure-ai-travel-agents

Services: Azure OpenAI + Azure AI Foundry + Azure Container Apps + MCP Servers (.NET, Python, Java, TypeScript)

Description: Reference application for multi-agent AI orchestration using three frameworks (LangChain.js, LlamaIndex.TS, and Microsoft Agent Framework). Features MCP (Model Context Protocol) servers in four languages deployed as serverless Azure Container Apps with OpenTelemetry monitoring.

Quick Start:

azd init --template azure-ai-travel-agents
azd up

Azure AI Starter

Repository: azd-ai-starter

Services: Azure AI Services + Azure OpenAI

Description: Minimal Bicep template that deploys Azure AI services with configured machine learning models. A lightweight starting point when you only need the Azure AI infrastructure provisioned without a full application stack.

Quick Start:

azd init --template azd-ai-starter
azd up

Browse more templates: Visit the Awesome AZD AI Template Gallery for 80+ AI-specific AZD templates across languages and scenarios.

Next Steps

  1. Try the Examples: Start with a pre-built template that matches your use case
  2. Customize for Your Needs: Modify the infrastructure and application code
  3. Add Monitoring: Implement comprehensive observability
  4. Optimize Costs: Fine-tune configurations for your budget
  5. Secure Your Deployment: Implement enterprise security patterns
  6. Scale to Production: Add multi-region and high-availability features

🎯 Hands-On Exercises

Exercise 1: Deploy Microsoft Foundry Models Chat App (30 minutes)

Goal: Deploy and test a production-ready AI chat application

# Initialize template
mkdir ai-chat-demo && cd ai-chat-demo
azd init --template azure-search-openai-demo

# Set environment variables
azd env set AZURE_LOCATION eastus2
azd env set AZURE_OPENAI_CAPACITY 30

# Deploy
azd up

# Test the application: azd up prints the endpoint when it finishes, and the
# template's outputs are also stored in the azd environment
azd env get-values | grep -i "_URI"

# Monitor AI operations
azd monitor

# Clean up
azd down --force --purge

Success Criteria:

  • Deployment completes without quota errors
  • Can access chat interface in browser
  • Can ask questions and get AI-powered responses
  • Application Insights shows telemetry data
  • Successfully cleaned up resources

Estimated Cost: $5-10 for 30 minutes of testing

Exercise 2: Configure Multi-Model Deployment (45 minutes)

Goal: Deploy multiple AI models with different configurations

# Create custom Bicep configuration
cat > infra/ai-models.bicep << 'EOF'
param openAiAccountName string
param location string

resource openAi 'Microsoft.CognitiveServices/accounts@2023-05-01' existing = {
  name: openAiAccountName
}

// gpt-4.1-mini for general chat
resource gpt41mini 'Microsoft.CognitiveServices/accounts/deployments@2023-05-01' = {
  parent: openAi
  name: 'gpt-4.1-mini'
  sku: {
    name: 'Standard'
    capacity: 30 // 30K tokens per minute
  }
  properties: {
    model: {
      format: 'OpenAI'
      name: 'gpt-4.1-mini'
      version: '2025-04-14'
    }
  }
}

// Text embedding for search
resource embedding 'Microsoft.CognitiveServices/accounts/deployments@2023-05-01' = {
  parent: openAi
  name: 'text-embedding-3-large'
  sku: {
    name: 'Standard'
    capacity: 50 // 50K tokens per minute
  }
  properties: {
    model: {
      format: 'OpenAI'
      name: 'text-embedding-3-large'
      version: '1'
    }
  }
  // Deployments on one account are created one at a time
  dependsOn: [gpt41mini]
}
EOF

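# Deploy and verify (reference ai-models.bicep as a module from infra/main.bicep
# first, otherwise azd provision will not pick it up)
azd provision
azd show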

Success Criteria:

  • Multiple models deployed successfully
  • Different capacity settings applied
  • Models accessible via API
  • Can call both models from application

Exercise 3: Implement Cost Monitoring (20 minutes)

Goal: Set up budget alerts and cost tracking

# Add budget alert to Bicep
cat >> infra/main.bicep << 'EOF'

resource budget 'Microsoft.Consumption/budgets@2023-05-01' = {
  name: 'ai-monthly-budget'
  properties: {
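    timePeriod: {
      // For a Monthly budget, startDate must be the first day of a month;
      // update both dates to cover your current billing period
      startDate: '2024-01-01'
      endDate: '2025-12-31'
    }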
    timeGrain: 'Monthly'
    amount: 200
    category: 'Cost'
    notifications: {
      notification1: {
        enabled: true
        operator: 'GreaterThan'
        threshold: 80
        contactEmails: ['your-email@example.com']
      }
      notification2: {
        enabled: true
        operator: 'GreaterThan'
        threshold: 100
        contactEmails: ['your-email@example.com']
      }
    }
  }
}
EOF

# Deploy budget alert
azd provision

# Check current costs (GNU date syntax; on macOS use: date -v-7d +%Y-%m-%d)
az consumption usage list --start-date $(date -d '7 days ago' +%Y-%m-%d) --end-date $(date +%Y-%m-%d)

Success Criteria:

  • Budget alert created in Azure
  • Email notifications configured
  • Can view cost data in Azure Portal
  • Budget thresholds set appropriately

💡 Frequently Asked Questions

How do I reduce Microsoft Foundry Models costs during development?

  1. Use Free Tiers Where Offered: Some services have a free SKU (for example, Document Intelligence F0); check current pricing before assuming a free token allowance for models
  2. Reduce Capacity: Set capacity to 10 (10K tokens per minute) instead of 30+ for dev
  3. Use azd down: Delete resources when not actively developing
  4. Cache Responses: Implement Redis cache for repeated queries
  5. Use Prompt Engineering: Reduce token usage with efficient prompts

# Development configuration
azd env set AZURE_OPENAI_CAPACITY 10
azd env set ENABLE_RESPONSE_CACHE true

What's the difference between Microsoft Foundry Models and OpenAI API?

Microsoft Foundry Models:

  • Enterprise security and compliance
  • Private network integration
  • SLA guarantees
  • Managed identity authentication
  • Higher quotas available

OpenAI API:

  • Faster access to new models
  • Simpler setup
  • Lower barrier to entry
  • Public internet only

For production apps, Microsoft Foundry Models is recommended.

How do I handle Microsoft Foundry Models quota exceeded errors?

# Check current quota
az cognitiveservices usage list --location eastus2

# Try different region
azd env set AZURE_LOCATION westus2
azd up

# Reduce capacity temporarily
azd env set AZURE_OPENAI_CAPACITY 10
azd provision

# Request quota increase
# Go to Azure Portal > Quotas > Request increase

Can I use my own data with Microsoft Foundry Models?

Yes! Use Azure AI Search for RAG (Retrieval Augmented Generation):

# azure.yaml
services:
  ai:
    env:
      AZURE_SEARCH_ENDPOINT: ${AZURE_SEARCH_ENDPOINT}
      AZURE_SEARCH_INDEX: ${AZURE_SEARCH_INDEX}
      AZURE_OPENAI_ENDPOINT: ${AZURE_OPENAI_ENDPOINT}

See the azure-search-openai-demo template.

How do I secure AI model endpoints?

Best Practices:

  1. Use Managed Identity (no API keys)
  2. Enable Private Endpoints
  3. Configure network security groups
  4. Implement rate limiting
  5. Use Azure Key Vault for secrets

// Managed Identity authentication
resource webAppIdentity 'Microsoft.ManagedIdentity/userAssignedIdentities@2023-01-31' = {
  name: 'web-identity'
  location: location
}

resource openAIRoleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  scope: openAIAccount
  name: guid(openAIAccount.id, webAppIdentity.id)
  properties: {
    roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', '5e0bd9bd-7b93-4f28-af87-19fc36ad61bd')
    principalId: webAppIdentity.properties.principalId
  }
}

Community and Support


Need Help? Join our community discussions or open an issue in the repository. The Azure AI + AZD community is here to help you succeed!