fix(s3-notifications): handle NoSuchBucket errors gracefully during stack deletion#35481
fix(s3-notifications): handle NoSuchBucket errors gracefully during stack deletion#35481oyiz-michael wants to merge 16 commits intoaws:mainfrom
Conversation
…ndler Fixes issue aws#35352 where S3 bucket notifications caused stack deletion failures due to NoSuchBucket errors when the bucket was deleted before the notification cleanup. The fix: - Gracefully handles NoSuchBucket errors during delete operations - Still properly fails on create/update operations when bucket missing - Adds comprehensive test coverage for all NoSuchBucket scenarios This resolves CloudFormation dependency ordering issues where the bucket gets deleted before the notification custom resource cleanup.
This integration test validates that S3 bucket notifications work correctly with different RemovalPolicy settings and that stack deletion succeeds even when bucket deletion occurs before notification cleanup. The test creates: - Bucket with RemovalPolicy.RETAIN + notification (reproduces issue aws#35352) - Bucket with RemovalPolicy.DESTROY + notification (existing behavior) This provides the required integration test coverage for the NoSuchBucket error handling fix in the custom resource handler.
✅ Updated pull request passes all PRLinter validations. Dismissing previous PRLinter review.
|
Thank you for your contribution! Can you give me a little more information about which types of s3-notifications you're experiencing this with? Obviously this change uses SNS but I'm curious if you tested this or ran into the same issue with the others. I think that the correct solution here is to add/fix the dependency between the two resources instead of manual handling for failure scenarios but I saw that some of the s3 notification types do already establish a dependency so I'm wondering if there is an issue for the specific type and how it establishes a dependency or if they're all faulty. That will help identify where the root cause of the bug actually is. |
Thanks for reviewing this PR! TopicConfigurations (SNS) Dependency management vs error handling: While I understand the appeal of fixing dependencies, the error handling approach is more robust here because: Race conditions are inherent - Even with perfect dependency ordering, buckets can be deleted externally (manual deletion, other stacks, CLI operations) The current approach follows CloudFormation best practices: treat "resource already gone" during DELETE as success (desired end state achieved), while properly failing CREATE/UPDATE when resources don't exist. Root cause: This isn't really a dependency bug - it's the fundamental challenge that CloudFormation processes deletions in ways that can create race conditions, which error handling addresses more comprehensively than dependency tweaking. |
|
|
||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||
… Security Guardian compliance
Configure the bucket notification removal policy integration test to allow destruction of AWS::CDK::Metadata resource during stack updates. This is expected for new integration tests and prevents CI failures when CDK metadata changes between test runs.
…s for security compliance
|
This PR has been in the MERGE CONFLICTS state for 3 weeks, and looks abandoned. Note that PRs with failing linting check or builds are not reviewed, please ensure your build is passing To prevent automatic closure:
This PR will automatically close in 14 days if no action is taken. |
|
Comments on closed issues and PRs are hard for our team to see. |
Issue # (if applicable)
Closes #35352.
Reason for this change
When using bucket.addEventNotification() to configure S3 bucket notifications, CloudFormation stack deletion fails with a NoSuchBucket error. This happens because the custom resource that manages notifications tries to clear them AFTER CloudFormation has already processed the bucket deletion (even with RemovalPolicy.RETAIN).
This makes it extremely difficult to reliably delete stacks containing S3 buckets with event notifications without resorting to complex workarounds, especially problematic in CI/CD environments where automated cleanup is essential.
Description of changes
Root Cause: The custom resource handler fails hard when encountering NoSuchBucket errors during the deletion process, regardless of whether the bucket was already deleted by CloudFormation.
Solution: Enhanced the Python custom resource handler with proper error handling:
Enhanced Exception Handling:
Added ClientError import and proper error code checking
Gracefully handle NoSuchBucket errors during Delete operations - treats missing bucket as successful deletion (desired end state)
Maintain proper error propagation for Create/Update operations - still fails appropriately when bucket doesn't exist
Dual Error Handling Points:
get_bucket_notification_configuration: Handle bucket deletion before reading existing notifications
put_bucket_notification_configuration: Handle bucket deletion between read and write operations
Operation-Specific Behavior:
Delete operations: Missing bucket = Success (desired end state achieved)
Create/Update operations: Missing bucket = Failure (actual error condition)
Files Modified:
index.py - Core fix
test_index.py - Comprehensive test coverage
Alternative Approaches Considered:
Dependency ordering fixes: CloudFormation doesn't guarantee execution order even with explicit dependencies
Bucket existence checks: Race conditions still possible between check and operation
RetryPolicy: Doesn't address the fundamental ordering issue
Describe any new or updated permissions being added
No new IAM permissions required. The existing permissions (s3:PutBucketNotification, s3:GetBucketNotification) remain unchanged.
Description of how you validated changes
Unit Tests: Added 5 comprehensive test cases covering all NoSuchBucket scenarios:
test_delete_unmanaged_bucket_not_found_on_get_notification - Bucket deleted before get operation
test_delete_unmanaged_bucket_not_found_on_put_notification - Bucket deleted between get/put operations
test_delete_managed_bucket_not_found - Managed bucket deletion scenarios
test_create_bucket_not_found_fails - Ensures Create operations still fail appropriately
test_update_bucket_not_found_fails - Ensures Update operations still fail appropriately
Test Results: All 27 tests pass (22 existing + 5 new), confirming no regressions
Build Validation: Successfully built aws-cdk-lib confirming no compilation issues
Real-world Scenario Testing: The fix addresses the exact error scenario described in the issue where bucket notifications cause stack deletion failures
Checklist
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license