feat(sagemaker): add containerStartupHealthCheckTimeoutInSeconds support for EndpointConfig#35576

Closed
amandladev wants to merge 3 commits into aws:main from amandladev:feature/add-container-startup-healthcheck-timeout

@amandladev
Contributor

Implements container startup health check timeout configuration for SageMaker endpoint production variants. The setting is available in CloudFormation but missing from the CDK constructs.

Issue #35566

  • Add containerStartupHealthCheckTimeout property to InstanceProductionVariantProps interface
  • Add comprehensive validation for timeout range (60-3600 seconds)
  • Add CloudFormation template generation for ContainerStartupHealthCheckTimeoutInSeconds property
  • Include test coverage for validation scenarios and edge cases
  • Update README documentation with usage examples and constraints

Reason for this change

AWS SageMaker EndpointConfig supports ContainerStartupHealthCheckTimeoutInSeconds in CloudFormation to configure health check timeout for inference containers, but this property is not exposed in the CDK SageMaker L2 constructs. Users with models that require longer initialization time cannot configure appropriate health check timeouts, leading to premature health check failures.

Description of changes

Implements AWS SageMaker container startup health check timeout support in CDK SageMaker L2 constructs, enabling users to configure appropriate health check timeouts for inference containers:

  • New containerStartupHealthCheckTimeout property in InstanceProductionVariantProps interface with AWS-compliant validation:
    • Range: 60-3600 seconds (1 minute to 1 hour)
    • Type: cdk.Duration for intuitive time specification
    • Optional property maintaining backward compatibility
  • Enhanced addInstanceProductionVariant() method with comprehensive input validation
  • Automatic conversion from cdk.Duration to seconds for CloudFormation compatibility
  • Synthesis-time validation with clear, actionable error messages
  • CloudFormation integration mapping to ContainerStartupHealthCheckTimeoutInSeconds property
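
The range check and Duration-to-seconds conversion listed above could look roughly like the sketch below. It is not the code from this PR: `DurationLike` is a hypothetical stand-in for `cdk.Duration` (which exposes `toSeconds()`), and `toStartupTimeoutSeconds` and the two constants are illustrative names. It also assumes a concrete duration and ignores unresolved tokens, which a real construct would likely need to handle.

```typescript
// Hypothetical stand-in for cdk.Duration: anything exposing toSeconds().
interface DurationLike {
  toSeconds(): number;
}

// Bounds taken from the PR description (60-3600 seconds).
const MIN_TIMEOUT_SECONDS = 60;
const MAX_TIMEOUT_SECONDS = 3600;

// Hypothetical helper: returns the timeout in seconds, or undefined when the
// optional property was not set. Throws when the value is out of range.
function toStartupTimeoutSeconds(timeout?: DurationLike): number | undefined {
  if (timeout === undefined) {
    return undefined; // optional property: nothing to validate or emit
  }
  const seconds = timeout.toSeconds();
  if (seconds < MIN_TIMEOUT_SECONDS || seconds > MAX_TIMEOUT_SECONDS) {
    throw new Error(
      `containerStartupHealthCheckTimeout must be between ${MIN_TIMEOUT_SECONDS} ` +
        `and ${MAX_TIMEOUT_SECONDS} seconds, got ${seconds}`,
    );
  }
  return seconds;
}
```

Returning `undefined` for an unset value keeps the property optional, so existing endpoint configurations would synthesize unchanged.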

Usage Example:

import * as cdk from 'aws-cdk-lib';
import * as sagemaker from '@aws-cdk/aws-sagemaker-alpha';

declare const model: sagemaker.IModel;

// Create endpoint configuration with health check timeout
const endpointConfig = new sagemaker.EndpointConfig(this, 'EndpointConfig', {
  instanceProductionVariants: [{
    variantName: 'my-variant',
    model: model,
    containerStartupHealthCheckTimeout: cdk.Duration.minutes(5), // 5 minutes timeout
  }],
});

Describe any new or updated permissions being added

N/A - No new IAM permissions required. Leverages existing SageMaker endpoint configuration permissions.

Description of how you validated changes

Unit tests: Added 5 container startup health check timeout tests covering the following validation scenarios:

  • Property inclusion in CloudFormation template when provided
  • Property absence in CloudFormation template when not provided
  • Range validation for minimum value (60 seconds)
  • Range validation for maximum value (3600 seconds)
  • Acceptance of valid timeout values at boundaries
  • Duration to seconds conversion verification

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@github-actions github-actions bot added the beginning-contributor [Pilot] contributed between 0-2 PRs to the CDK label Sep 25, 2025
@aws-cdk-automation aws-cdk-automation requested a review from a team September 25, 2025 16:37
@github-actions github-actions bot added the p2 label Sep 25, 2025
Collaborator

@aws-cdk-automation aws-cdk-automation left a comment

(This review is outdated)

@aws-cdk-automation aws-cdk-automation added the pr-linter/exemption-requested The contributor has requested an exemption to the PR Linter feedback. label Sep 25, 2025
@badmintoncryer
Contributor

@amandladev Thank you for your PR. Because this is a new feature request, you need to execute integ tests.

@aws-cdk-automation aws-cdk-automation dismissed their stale review September 29, 2025 17:34

✅ Updated pull request passes all PRLinter validations. Dismissing previous PRLinter review.

@aws-cdk-automation aws-cdk-automation removed the pr-linter/exemption-requested The contributor has requested an exemption to the PR Linter feedback. label Sep 29, 2025
@amandladev amandladev closed this Sep 30, 2025
@github-actions
Contributor

Comments on closed issues and PRs are hard for our team to see.
If you need help, please open a new issue that references this one.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 30, 2025