Skip to content

Latest commit

 

History

History
818 lines (680 loc) · 20.6 KB

File metadata and controls

818 lines (680 loc) · 20.6 KB

SKU Selection Guide - Choosing the Right Azure Service Tiers

Chapter Navigation:

Introduction

This comprehensive guide helps you select optimal Azure service SKUs (Stock Keeping Units) for different environments, workloads, and requirements. Learn to analyze performance needs, cost considerations, and scalability requirements to choose the most appropriate service tiers for your Azure Developer CLI deployments.

Learning Goals

By completing this guide, you will:

  • Understand Azure SKU concepts, pricing models, and feature differences
  • Master environment-specific SKU selection strategies for development, staging, and production
  • Analyze workload requirements and match them to appropriate service tiers
  • Implement cost optimization strategies through intelligent SKU selection
  • Apply performance testing and validation techniques for SKU choices
  • Configure automated SKU recommendations and monitoring

Learning Outcomes

Upon completion, you will be able to:

  • Select appropriate Azure service SKUs based on workload requirements and constraints
  • Design cost-effective multi-environment architectures with proper tier selection
  • Implement performance benchmarking and validation for SKU choices
  • Create automated tools for SKU recommendation and cost optimization
  • Plan SKU migrations and scaling strategies for changing requirements
  • Apply Azure Well-Architected Framework principles to service tier selection

Table of Contents


Understanding SKUs

What are SKUs?

SKUs (Stock Keeping Units) represent different service tiers and performance levels for Azure resources. Each SKU offers different:

  • Performance characteristics (CPU, memory, throughput)
  • Feature availability (scaling options, SLA levels)
  • Pricing models (consumption-based, reserved capacity)
  • Regional availability (not all SKUs available in all regions)

Key Factors in SKU Selection

  1. Workload Requirements

    • Expected traffic/load patterns
    • Performance requirements (CPU, memory, I/O)
    • Storage needs and access patterns
  2. Environment Type

    • Development/testing vs. production
    • Availability requirements
    • Security and compliance needs
  3. Budget Constraints

    • Initial costs vs. operational costs
    • Reserved capacity discounts
    • Auto-scaling cost implications
  4. Growth Projections

    • Scalability requirements
    • Future feature needs
    • Migration complexity

Environment-Based Selection

Development Environment

Priorities: Cost optimization, basic functionality, easy provisioning/de-provisioning

Recommended SKUs

# Development environment configuration
environment: development
skus:
  app_service: "F1"          # Free tier
  sql_database: "Basic"       # Basic tier, 5 DTU
  storage: "Standard_LRS"     # Locally redundant
  cosmos_db: "Free"          # Free tier (400 RU/s)
  key_vault: "Standard"      # Standard pricing tier
  application_insights: "Free" # First 5GB free

Characteristics

  • App Service: F1 (Free) or B1 (Basic) for simple testing
  • Databases: Basic tier with minimal resources
  • Storage: Standard with local redundancy only
  • Compute: Shared resources acceptable
  • Networking: Basic configurations

Staging/Testing Environment

Priorities: Production-like configuration, cost balance, performance testing capability

Recommended SKUs

# Staging environment configuration
environment: staging
skus:
  app_service: "S1"          # Standard tier
  sql_database: "S2"         # Standard tier, 50 DTU
  storage: "Standard_GRS"    # Geo-redundant
  cosmos_db: "Standard"      # 400 RU/s provisioned
  container_apps: "Consumption" # Pay-per-use

Characteristics

  • Performance: 70-80% of production capacity
  • Features: Most production features enabled
  • Redundancy: Some geographic redundancy
  • Scaling: Limited auto-scaling for testing
  • Monitoring: Full monitoring stack

Production Environment

Priorities: Performance, availability, security, compliance, scalability

Recommended SKUs

# Production environment configuration
environment: production
skus:
  app_service: "P1V3"        # Premium v3 tier
  sql_database: "P2"         # Premium tier, 250 DTU
  storage: "Premium_GRS"     # Premium geo-redundant
  cosmos_db: "Provisioned"   # Dedicated throughput
  container_apps: "Dedicated" # Dedicated environment
  key_vault: "Premium"       # Premium with HSM

Characteristics

  • High availability: 99.9%+ SLA requirements
  • Performance: Dedicated resources, high throughput
  • Security: Premium security features
  • Scaling: Full auto-scaling capabilities
  • Monitoring: Comprehensive observability

Service-Specific Guidelines

Azure App Service

SKU Decision Matrix

Use Case Recommended SKU Rationale
Development/Testing F1 (Free) or B1 (Basic) Cost-effective, sufficient for testing
Small production apps S1 (Standard) Custom domains, SSL, auto-scaling
Medium production apps P1V3 (Premium V3) Better performance, more features
High-traffic apps P2V3 or P3V3 Dedicated resources, high performance
Mission-critical apps I1V2 (Isolated V2) Network isolation, dedicated hardware

Configuration Examples

Development

resource appServicePlan 'Microsoft.Web/serverfarms@2022-03-01' = {
  name: 'asp-${environmentName}-dev'
  location: location
  sku: {
    name: 'F1'
    tier: 'Free'
    capacity: 1
  }
  properties: {
    reserved: false
  }
}

Production

resource appServicePlan 'Microsoft.Web/serverfarms@2022-03-01' = {
  name: 'asp-${environmentName}-prod'
  location: location
  sku: {
    name: 'P1V3'
    tier: 'PremiumV3'
    capacity: 3
  }
  properties: {
    reserved: false
  }
}

Azure SQL Database

SKU Selection Framework

  1. DTU-based (Database Transaction Units)

    • Basic: 5 DTU - Development/testing
    • Standard: S0-S12 (10-3000 DTU) - General purpose
    • Premium: P1-P15 (125-4000 DTU) - Performance-critical
  2. vCore-based (Recommended for production)

    • General Purpose: Balanced compute and storage
    • Business Critical: Low latency, high IOPS
    • Hyperscale: Highly scalable storage (up to 100TB)

Example Configurations

// Development
resource sqlDatabase 'Microsoft.Sql/servers/databases@2022-05-01-preview' = {
  name: 'db-${environmentName}-dev'
  parent: sqlServer
  location: location
  sku: {
    name: 'Basic'
    tier: 'Basic'
    capacity: 5
  }
  properties: {
    maxSizeBytes: 2147483648 // 2GB
  }
}

// Production
resource sqlDatabase 'Microsoft.Sql/servers/databases@2022-05-01-preview' = {
  name: 'db-${environmentName}-prod'
  parent: sqlServer
  location: location
  sku: {
    name: 'GP_Gen5'
    tier: 'GeneralPurpose'
    family: 'Gen5'
    capacity: 4
  }
  properties: {
    maxSizeBytes: 536870912000 // 500GB
  }
}

Azure Container Apps

Environment Types

  1. Consumption-based

    • Pay-per-use pricing
    • Suitable for development and variable workloads
    • Shared infrastructure
  2. Dedicated (Workload Profiles)

    • Dedicated compute resources
    • Predictable performance
    • Better for production workloads

Configuration Examples

Development (Consumption)

resource containerAppEnvironment 'Microsoft.App/managedEnvironments@2022-10-01' = {
  name: 'cae-${environmentName}-dev'
  location: location
  properties: {
    zoneRedundant: false
  }
}

resource containerApp 'Microsoft.App/containerApps@2022-10-01' = {
  name: 'ca-${environmentName}-dev'
  location: location
  properties: {
    managedEnvironmentId: containerAppEnvironment.id
    configuration: {
      ingress: {
        external: true
        targetPort: 3000
      }
    }
    template: {
      containers: [{
        name: 'main'
        image: 'nginx:latest'
        resources: {
          cpu: json('0.25')
          memory: '0.5Gi'
        }
      }]
      scale: {
        minReplicas: 0
        maxReplicas: 1
      }
    }
  }
}

Production (Dedicated)

resource containerAppEnvironment 'Microsoft.App/managedEnvironments@2022-10-01' = {
  name: 'cae-${environmentName}-prod'
  location: location
  properties: {
    zoneRedundant: true
    workloadProfiles: [{
      name: 'production-profile'
      workloadProfileType: 'D4'
      minimumCount: 2
      maximumCount: 10
    }]
  }
}

Azure Cosmos DB

Throughput Models

  1. Manual Provisioned Throughput

    • Predictable performance
    • Reserved capacity discounts
    • Best for steady workloads
  2. Autoscale Provisioned Throughput

    • Automatic scaling based on usage
    • Pay for what you use (with minimum)
    • Good for variable workloads
  3. Serverless

    • Pay-per-request
    • No provisioned throughput
    • Ideal for development and intermittent workloads

SKU Examples

// Development - Serverless
resource cosmosAccount 'Microsoft.DocumentDB/databaseAccounts@2023-04-15' = {
  name: 'cosmos-${environmentName}-dev'
  location: location
  properties: {
    databaseAccountOfferType: 'Standard'
    locations: [{
      locationName: location
    }]
    capabilities: [{
      name: 'EnableServerless'
    }]
  }
}

// Production - Provisioned with Autoscale
resource cosmosAccount 'Microsoft.DocumentDB/databaseAccounts@2023-04-15' = {
  name: 'cosmos-${environmentName}-prod'
  location: location
  properties: {
    databaseAccountOfferType: 'Standard'
    locations: [
      {
        locationName: location
        failoverPriority: 0
      }
      {
        locationName: secondaryLocation
        failoverPriority: 1
      }
    ]
    enableAutomaticFailover: true
    enableMultipleWriteLocations: false
  }
}

resource cosmosDatabase 'Microsoft.DocumentDB/databaseAccounts/sqlDatabases@2023-04-15' = {
  name: 'main'
  parent: cosmosAccount
  properties: {
    resource: {
      id: 'main'
    }
    options: {
      autoscaleSettings: {
        maxThroughput: 4000
      }
    }
  }
}

Azure Storage Account

Storage Account Types

  1. Standard_LRS - Development, non-critical data
  2. Standard_GRS - Production, geo-redundancy needed
  3. Premium_LRS - High-performance applications
  4. Premium_ZRS - High availability with zone redundancy

Performance Tiers

  • Standard: General purpose, cost-effective
  • Premium: High-performance, low-latency scenarios
// Development
resource storageAccount 'Microsoft.Storage/storageAccounts@2023-01-01' = {
  name: 'sa${uniqueString(resourceGroup().id)}dev'
  location: location
  sku: {
    name: 'Standard_LRS'
  }
  kind: 'StorageV2'
  properties: {
    accessTier: 'Hot'
    allowBlobPublicAccess: false
    minimumTlsVersion: 'TLS1_2'
  }
}

// Production
resource storageAccount 'Microsoft.Storage/storageAccounts@2023-01-01' = {
  name: 'sa${uniqueString(resourceGroup().id)}prod'
  location: location
  sku: {
    name: 'Standard_GRS'
  }
  kind: 'StorageV2'
  properties: {
    accessTier: 'Hot'
    allowBlobPublicAccess: false
    minimumTlsVersion: 'TLS1_2'
    networkAcls: {
      defaultAction: 'Deny'
      virtualNetworkRules: []
      ipRules: []
    }
  }
}

Cost Optimization Strategies

1. Reserved Capacity

Reserve resources for 1-3 years for significant discounts:

# Check reservation options
az reservations catalog show --reserved-resource-type SqlDatabase
az reservations catalog show --reserved-resource-type CosmosDb

2. Right-Sizing

Start with smaller SKUs and scale up based on actual usage:

# Progressive scaling approach
development:
  app_service: "F1"    # Free tier
testing:
  app_service: "B1"    # Basic tier  
staging:
  app_service: "S1"    # Standard tier
production:
  app_service: "P1V3"  # Premium tier

3. Auto-Scaling Configuration

Implement intelligent scaling to optimize costs:

resource autoScaleSettings 'Microsoft.Insights/autoscalesettings@2022-10-01' = {
  name: 'autoscale-${appServicePlan.name}'
  location: location
  properties: {
    profiles: [{
      name: 'default'
      capacity: {
        minimum: '1'
        maximum: '10'
        default: '2'
      }
      rules: [
        {
          metricTrigger: {
            metricName: 'CpuPercentage'
            metricResourceUri: appServicePlan.id
            operator: 'GreaterThan'
            threshold: 70
            timeAggregation: 'Average'
            timeGrain: 'PT1M'
            timeWindow: 'PT5M'
          }
          scaleAction: {
            direction: 'Increase'
            type: 'ChangeCount'
            value: '1'
            cooldown: 'PT5M'
          }
        }
        {
          metricTrigger: {
            metricName: 'CpuPercentage'
            metricResourceUri: appServicePlan.id
            operator: 'LessThan'
            threshold: 30
            timeAggregation: 'Average'
            timeGrain: 'PT1M'
            timeWindow: 'PT5M'
          }
          scaleAction: {
            direction: 'Decrease'
            type: 'ChangeCount'
            value: '1'
            cooldown: 'PT5M'
          }
        }
      ]
    }]
    enabled: true
    targetResourceUri: appServicePlan.id
  }
}

4. Scheduled Scaling

Scale down during off-hours:

{
  "profiles": [
    {
      "name": "business-hours",
      "capacity": {
        "minimum": "2",
        "maximum": "10", 
        "default": "3"
      },
      "recurrence": {
        "frequency": "Week",
        "schedule": {
          "timeZone": "Pacific Standard Time",
          "days": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
          "hours": [8],
          "minutes": [0]
        }
      }
    },
    {
      "name": "off-hours",
      "capacity": {
        "minimum": "1",
        "maximum": "2",
        "default": "1"
      },
      "recurrence": {
        "frequency": "Week", 
        "schedule": {
          "timeZone": "Pacific Standard Time",
          "days": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
          "hours": [18],
          "minutes": [0]
        }
      }
    }
  ]
}

Performance Considerations

Baseline Performance Requirements

Define clear performance requirements before SKU selection:

performance_requirements:
  response_time:
    p95: "< 500ms"
    p99: "< 1000ms"
  throughput:
    requests_per_second: 1000
    concurrent_users: 500
  availability:
    uptime: "99.9%"
    rpo: "15 minutes"
    rto: "30 minutes"

Load Testing

Test different SKUs to validate performance:

# Azure Load Testing service
az load test create \
  --name "sku-performance-test" \
  --resource-group $RESOURCE_GROUP \
  --load-test-config @load-test-config.yaml

Monitoring and Optimization

Set up comprehensive monitoring:

resource applicationInsights 'Microsoft.Insights/components@2020-02-02' = {
  name: 'ai-${environmentName}'
  location: location
  kind: 'web'
  properties: {
    Application_Type: 'web'
    RetentionInDays: 90
  }
}

resource logAnalyticsWorkspace 'Microsoft.OperationalInsights/workspaces@2022-10-01' = {
  name: 'law-${environmentName}'
  location: location
  properties: {
    sku: {
      name: 'PerGB2018'
    }
    retentionInDays: 30
  }
}

Quick Reference Tables

App Service SKU Quick Reference

SKU Tier vCPU RAM Storage Price Range Use Case
F1 Free Shared 1GB 1GB Free Development
B1 Basic 1 1.75GB 10GB $ Small apps
S1 Standard 1 1.75GB 50GB $$ Production
P1V3 Premium V3 2 8GB 250GB $$$ High performance
I1V2 Isolated V2 2 8GB 1TB $$$$ Enterprise

SQL Database SKU Quick Reference

SKU Tier DTU/vCore Storage Price Range Use Case
Basic Basic 5 DTU 2GB $ Development
S2 Standard 50 DTU 250GB $$ Small production
P2 Premium 250 DTU 1TB $$$ High performance
GP_Gen5_4 General Purpose 4 vCore 4TB $$$ Balanced
BC_Gen5_8 Business Critical 8 vCore 4TB $$$$ Mission critical

Container Apps SKU Quick Reference

Model Pricing CPU/Memory Use Case
Consumption Pay-per-use 0.25-2 vCPU Development, variable load
Dedicated D4 Reserved 4 vCPU, 16GB Production
Dedicated D8 Reserved 8 vCPU, 32GB High performance

Validation Tools

SKU Availability Checker

#!/bin/bash
# Check SKU availability in target region

check_sku_availability() {
    local region=$1
    local resource_type=$2
    local sku=$3
    
    echo "Checking $sku availability for $resource_type in $region..."
    
    case $resource_type in
        "app-service")
            az appservice list-locations --sku $sku --output table
            ;;
        "sql-database")
            az sql db list-editions --location $region --output table
            ;;
        "storage")
            az storage account check-name --name "test" --output table
            ;;
        *)
            echo "Resource type not supported"
            ;;
    esac
}

# Usage
check_sku_availability "eastus" "app-service" "P1V3"

Cost Estimation Script

# PowerShell script for cost estimation
function Get-AzureCostEstimate {
    param(
        [string]$SubscriptionId,
        [string]$ResourceGroup,
        [hashtable]$Resources
    )
    
    $totalCost = 0
    
    foreach ($resource in $Resources.GetEnumerator()) {
        $resourceType = $resource.Key
        $sku = $resource.Value
        
        # Use Azure Pricing API or calculator
        $cost = Get-ResourceCost -Type $resourceType -SKU $sku
        $totalCost += $cost
        
        Write-Host "$resourceType ($sku): $cost/month"
    }
    
    Write-Host "Total estimated cost: $totalCost/month"
}

# Usage
$resources = @{
    "AppService" = "P1V3"
    "SqlDatabase" = "GP_Gen5_4"
    "StorageAccount" = "Standard_GRS"
}

Get-AzureCostEstimate -ResourceGroup "rg-myapp-prod" -Resources $resources

Performance Validation

# Load test configuration for SKU validation
test_configuration:
  duration: "10m"
  users:
    spawn_rate: 10
    max_users: 100
  
  scenarios:
    - name: "sku_performance_test"
      requests:
        - url: "https://myapp.azurewebsites.net/api/health"
          method: "GET"
          expect:
            - status_code: 200
            - response_time_ms: 500
        
        - url: "https://myapp.azurewebsites.net/api/data"
          method: "POST"
          expect:
            - status_code: 201
            - response_time_ms: 1000

  thresholds:
    http_req_duration:
      - "p(95)<500"  # 95% of requests under 500ms
      - "p(99)<1000" # 99% of requests under 1s
    http_req_failed:
      - "rate<0.1"   # Less than 10% failure rate

Best Practices Summary

Do's

  1. Start small and scale up based on actual usage
  2. Use different SKUs for different environments
  3. Monitor performance and costs continuously
  4. Leverage reserved capacity for production workloads
  5. Implement auto-scaling where appropriate
  6. Test performance with realistic workloads
  7. Plan for growth but avoid over-provisioning
  8. Use free tiers for development when possible

Don'ts

  1. Don't use production SKUs for development
  2. Don't ignore regional SKU availability
  3. Don't forget about data transfer costs
  4. Don't over-provision without justification
  5. Don't ignore the impact of dependencies
  6. Don't set auto-scaling limits too high
  7. Don't forget about compliance requirements
  8. Don't make decisions based on price alone

Pro Tip: Use Azure Cost Management and Advisor to get personalized recommendations for optimizing your SKU selections based on actual usage patterns.


Navigation