Skip to content

fix(s3-notifications): handle NoSuchBucket errors gracefully during stack deletion#35481

Closed
oyiz-michael wants to merge 16 commits intoaws:mainfrom
oyiz-michael:fix/s3-bucket-notification-delete-nosuchbucket
Closed

fix(s3-notifications): handle NoSuchBucket errors gracefully during stack deletion#35481
oyiz-michael wants to merge 16 commits intoaws:mainfrom
oyiz-michael:fix/s3-bucket-notification-delete-nosuchbucket

Conversation

@oyiz-michael
Copy link
Copy Markdown
Contributor

Issue # (if applicable)

Closes #35352.

Reason for this change

When using bucket.addEventNotification() to configure S3 bucket notifications, CloudFormation stack deletion fails with a NoSuchBucket error. This happens because the custom resource that manages notifications tries to clear them AFTER CloudFormation has already processed the bucket deletion (even with RemovalPolicy.RETAIN).

This makes it extremely difficult to reliably delete stacks containing S3 buckets with event notifications without resorting to complex workarounds, especially problematic in CI/CD environments where automated cleanup is essential.

Description of changes

Root Cause: The custom resource handler fails hard when encountering NoSuchBucket errors during the deletion process, regardless of whether the bucket was already deleted by CloudFormation.

Solution: Enhanced the Python custom resource handler with proper error handling:

Enhanced Exception Handling:
Added ClientError import and proper error code checking
Gracefully handle NoSuchBucket errors during Delete operations - treats missing bucket as successful deletion (desired end state)
Maintain proper error propagation for Create/Update operations - still fails appropriately when bucket doesn't exist

Dual Error Handling Points:
get_bucket_notification_configuration: Handle bucket deletion before reading existing notifications
put_bucket_notification_configuration: Handle bucket deletion between read and write operations

Operation-Specific Behavior:
Delete operations: Missing bucket = Success (desired end state achieved)
Create/Update operations: Missing bucket = Failure (actual error condition)

Files Modified:
index.py - Core fix
test_index.py - Comprehensive test coverage

Alternative Approaches Considered:
Dependency ordering fixes: CloudFormation doesn't guarantee execution order even with explicit dependencies
Bucket existence checks: Race conditions still possible between check and operation
RetryPolicy: Doesn't address the fundamental ordering issue

Describe any new or updated permissions being added

No new IAM permissions required. The existing permissions (s3:PutBucketNotification, s3:GetBucketNotification) remain unchanged.

Description of how you validated changes

Unit Tests: Added 5 comprehensive test cases covering all NoSuchBucket scenarios:

test_delete_unmanaged_bucket_not_found_on_get_notification - Bucket deleted before get operation
test_delete_unmanaged_bucket_not_found_on_put_notification - Bucket deleted between get/put operations
test_delete_managed_bucket_not_found - Managed bucket deletion scenarios
test_create_bucket_not_found_fails - Ensures Create operations still fail appropriately
test_update_bucket_not_found_fails - Ensures Update operations still fail appropriately

Test Results: All 27 tests pass (22 existing + 5 new), confirming no regressions

Build Validation: Successfully built aws-cdk-lib confirming no compilation issues

Real-world Scenario Testing: The fix addresses the exact error scenario described in the issue where bucket notifications cause stack deletion failures

Checklist


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

…ndler

Fixes issue aws#35352 where S3 bucket notifications caused stack deletion
failures due to NoSuchBucket errors when the bucket was deleted before
the notification cleanup.

The fix:
- Gracefully handles NoSuchBucket errors during delete operations
- Still properly fails on create/update operations when bucket missing
- Adds comprehensive test coverage for all NoSuchBucket scenarios

This resolves CloudFormation dependency ordering issues where the
bucket gets deleted before the notification custom resource cleanup.
@github-actions github-actions bot added bug This issue is a bug. effort/medium Medium work item – several days of effort p2 beginning-contributor [Pilot] contributed between 0-2 PRs to the CDK labels Sep 13, 2025
@aws-cdk-automation aws-cdk-automation requested a review from a team September 13, 2025 16:47
Copy link
Copy Markdown
Collaborator

@aws-cdk-automation aws-cdk-automation left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This review is outdated)

This integration test validates that S3 bucket notifications work correctly
with different RemovalPolicy settings and that stack deletion succeeds
even when bucket deletion occurs before notification cleanup.

The test creates:
- Bucket with RemovalPolicy.RETAIN + notification (reproduces issue aws#35352)
- Bucket with RemovalPolicy.DESTROY + notification (existing behavior)

This provides the required integration test coverage for the NoSuchBucket
error handling fix in the custom resource handler.
@aws-cdk-automation aws-cdk-automation dismissed their stale review September 13, 2025 17:13

✅ Updated pull request passes all PRLinter validations. Dismissing previous PRLinter review.

@TheRealAmazonKendra
Copy link
Copy Markdown
Contributor

TheRealAmazonKendra commented Sep 18, 2025

Thank you for your contribution! Can you give me a little more information about which types of s3-notifications you're experiencing this with? Obviously this change uses SNS but I'm curious if you tested this or ran into the same issue with the others.

I think that the correct solution here is to add/fix the dependency between the two resources instead of manual handling for failure scenarios but I saw that some of the s3 notification types do already establish a dependency so I'm wondering if there is an issue for the specific type and how it establishes a dependency or if they're all faulty. That will help identify where the root cause of the bug actually is.

@oyiz-michael
Copy link
Copy Markdown
Contributor Author

Thank you for your contribution! Can you give me a little more information about which types of s3-notifications you're experiencing this with? Obviously this change uses SNS but I'm curious if you tested this or ran into the same issue with the others.

I think that the correct solution here is to add/fix the dependency between the two resources instead of manual handling for failure scenarios but I saw that some of the s3 notification types do already establish a dependency so I'm wondering if there is an issue for the specific type and how it establishes a dependency or if they're all faulty. That will help identify where the root cause of the bug actually is.

Thanks for reviewing this PR!
This issue affects all three S3 notification types:

TopicConfigurations (SNS)
QueueConfigurations (SQS)
LambdaFunctionConfigurations (Lambda)
The test suite covers all three types with 12 comprehensive test cases covering managed/unmanaged scenarios and different CloudFormation operations.

Dependency management vs error handling: While I understand the appeal of fixing dependencies, the error handling approach is more robust here because:

Race conditions are inherent - Even with perfect dependency ordering, buckets can be deleted externally (manual deletion, other stacks, CLI operations)
Cross-stack complexity - Buckets and notifications are often in different stacks
External bucket references - Many users reference existing buckets not managed by CDK
Real-world resilience - Handles scenarios dependency management can't

The current approach follows CloudFormation best practices: treat "resource already gone" during DELETE as success (desired end state achieved), while properly failing CREATE/UPDATE when resources don't exist.

Root cause: This isn't really a dependency bug - it's the fundamental challenge that CloudFormation processes deletions in ways that can create race conditions, which error handling addresses more comprehensively than dependency tweaking.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Dec 20, 2025

⚠️ Experimental Feature: This security report is currently in experimental phase. Results may include false positives and the rules are being actively refined.
Please try merge from main to avoid findings unrelated to the PR.


TestsPassed ☑️SkippedFailed ❌️
Security Guardian Results408 ran403 passed5 failed
TestResult
Security Guardian Results
packages/@aws-cdk-testing/framework-integ/test/aws-s3-notifications/test/sns/integ.encrypted-sns-bucket-notifications.js.snapshot/SnsBucketNotificationsStack.template.json
sqs-encryption-enabled.guard❌ failure
packages/@aws-cdk-testing/framework-integ/test/aws-s3-notifications/test/sqs/integ.bucket-notifications-sorting.js.snapshot/sns-bucket-notifications.template.json
sqs-encryption-enabled.guard❌ failure
packages/@aws-cdk-testing/framework-integ/test/aws-s3-notifications/test/sqs/integ.bucket-notifications-unmanaged.js.snapshot/integ-sqs-bucket-notifications.template.json
sqs-encryption-enabled.guard❌ failure
packages/@aws-cdk-testing/framework-integ/test/aws-s3-notifications/test/sqs/integ.bucket-notifications.js.snapshot/sqs-bucket-notifications.template.json
sqs-encryption-enabled.guard❌ failure
packages/@aws-cdk-testing/framework-integ/test/aws-s3/test/integ.bucket.notifications-scoped-permissions.js.snapshot/aws-cdk-s3-notifications-scoped-permissions.template.json
sqs-encryption-enabled.guard❌ failure

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Dec 20, 2025

⚠️ Experimental Feature: This security report is currently in experimental phase. Results may include false positives and the rules are being actively refined.
Please try merge from main to avoid findings unrelated to the PR.


TestsPassed ☑️SkippedFailed ❌️
Security Guardian Results with resolved templates408 ran403 passed5 failed
TestResult
Security Guardian Results with resolved templates
packages/@aws-cdk-testing/framework-integ/test/aws-s3-notifications/test/sns/integ.encrypted-sns-bucket-notifications.js.snapshot/SnsBucketNotificationsStack.template.json
sqs-encryption-enabled.guard❌ failure
packages/@aws-cdk-testing/framework-integ/test/aws-s3-notifications/test/sqs/integ.bucket-notifications-sorting.js.snapshot/sns-bucket-notifications.template.json
sqs-encryption-enabled.guard❌ failure
packages/@aws-cdk-testing/framework-integ/test/aws-s3-notifications/test/sqs/integ.bucket-notifications-unmanaged.js.snapshot/integ-sqs-bucket-notifications.template.json
sqs-encryption-enabled.guard❌ failure
packages/@aws-cdk-testing/framework-integ/test/aws-s3-notifications/test/sqs/integ.bucket-notifications.js.snapshot/sqs-bucket-notifications.template.json
sqs-encryption-enabled.guard❌ failure
packages/@aws-cdk-testing/framework-integ/test/aws-s3/test/integ.bucket.notifications-scoped-permissions.js.snapshot/aws-cdk-s3-notifications-scoped-permissions.template.json
sqs-encryption-enabled.guard❌ failure

oyiz-michael and others added 7 commits December 20, 2025 01:09
Configure the bucket notification removal policy integration test to allow
destruction of AWS::CDK::Metadata resource during stack updates. This is
expected for new integration tests and prevents CI failures when CDK
metadata changes between test runs.
@aws-cdk-automation
Copy link
Copy Markdown
Collaborator

This PR has been in the MERGE CONFLICTS state for 3 weeks, and looks abandoned. Note that PRs with failing linting check or builds are not reviewed, please ensure your build is passing

To prevent automatic closure:

  • Resume work on the PR
  • OR request an exemption by adding a comment containing 'Exemption Request' with justification e.x "Exemption Request: "
  • OR request clarification by adding a comment containing 'Clarification Request' with a question e.x "Clarification Request: "

This PR will automatically close in 14 days if no action is taken.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Mar 9, 2026

Comments on closed issues and PRs are hard for our team to see.
If you need help, please open a new issue that references this one.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 9, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

beginning-contributor [Pilot] contributed between 0-2 PRs to the CDK bug This issue is a bug. effort/medium Medium work item – several days of effort p2

Projects

None yet

Development

Successfully merging this pull request may close these issues.

addEventNotification causes "NoSuchBucket" errors during stack deletion regardless of bucket RemovalPolicy

4 participants