fix: Fix Launch Templates error with aws 2.61.0 #875

Merged
barryib merged 1 commit into terraform-aws-modules:master from dpiddock:fix/874 on May 9, 2020

Conversation

@dpiddockcmp

Contributor

PR o'clock

Description

Updating launch templates with aws provider 2.61.0 generates an error when creating the ASG or attempting to launch new instances. This is caused by the aws provider adding support for placement.partition_number in the launch template. I think the root cause is twofold: aws-sdk-go defaults this field to 0 because it's an integer, and the module sets the placement block even when no placement group is in use. Having partition_number set with no placement group name is a validation error. For example, when creating a new ASG:
Error: Error creating AutoScaling Group: ValidationError: You must use a valid fully-formed launch template. You cannot use PartitionNumber with a Placement Group that does not exist. Specify a valid Placement Group and try again.

Resolve the issue by not setting the placement block if no placement group name is passed in.
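
A minimal sketch of that approach, assuming Terraform 0.12 dynamic blocks. The variable and resource names here (`example_placement_group`, `example_ami_id`, `aws_launch_template.example`) are hypothetical and are not taken from the module's source; the module's real expression looks the value up per worker group.

```hcl
# Hypothetical inputs for illustration only.
variable "example_placement_group" {
  type    = string
  default = null # null means "no placement group in use"
}

variable "example_ami_id" {
  type = string
}

resource "aws_launch_template" "example" {
  name_prefix   = "example-"
  image_id      = var.example_ami_id
  instance_type = "m5.large"

  # Emit the placement block only when a group name was passed in.
  # With an empty for_each list, no placement block is rendered at all,
  # so the provider has no partition_number to send.
  dynamic "placement" {
    for_each = var.example_placement_group != null ? [var.example_placement_group] : []
    content {
      group_name = placement.value
    }
  }
}
```

Because the block disappears from the rendered template when no group is set, existing launch templates that previously carried an empty placement block will show a diff on the next plan.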

This will cause an update for all users using launch templates unless they are using placement groups.

Fixes #874

@riveryc

riveryc commented May 9, 2020

NICE, this fixed my issue!

@dpiddockcmp

Contributor Author

I've done a little more testing and this fixes launch templates for people who do not use placement groups.

However, there's a bug in the provider or SDK. Placement groups specified directly in the launch template only work if the placement group uses the partition strategy, because partition_number is always set. The module doesn't expose a way to set the partition number, and I don't want to add one, as it would require bumping the minimum provider version to the broken 2.61.0.

Non-partition placement groups can still be specified on the ASG via placement_group variable.

module "eks" {
  # ...
  worker_groups_launch_template = [
    {
      name                 = "worker-group-1"
      instance_type        = "m5.large"
      asg_desired_capacity = 1
      public_ip            = true
      placement_group      = aws_placement_group.this.id
    },
    {
      name                            = "worker-group-2"
      instance_type                   = "m5.large"
      asg_desired_capacity            = 1
      public_ip                       = false
      launch_template_placement_group = aws_placement_group.this.id
    },
    {
      name                 = "worker-group-3"
      instance_type        = "t2.small"
      asg_desired_capacity = 1
      public_ip            = false
    },
  ]
}
resource "aws_placement_group" "this" {
  name     = "testgroup"
  strategy = "cluster"
}

worker-group-1 and worker-group-3 are created without issue. worker-group-2 fails with:

Error: Error creating AutoScaling Group: ValidationError: You must use a valid
fully-formed launch template. You cannot use PartitionNumber with a Placement
Group that does not use the 'partition' strategy. Placement Group 'testgroup'
uses 'cluster' strategy. Specify a different Placement Group and try again.

I guess users wanting to set a non-partition placement group on the launch template will have to pin to aws 2.60.0 until it's fixed?

@barryib

barryib commented May 9, 2020

Member

> I guess users wanting to set a non-partition placement group on the launch template will have to pin to aws 2.60.0 until it's fixed?

Can we just pin this in versions.tf? Something like aws = ">= 2.52.0, <= 2.60.0" or aws = "~> 2.60.0" until it's fixed upstream?
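
For reference, a sketch of what such a temporary pin could look like in a versions.tf, assuming the Terraform 0.12-style string constraints used at the time of this thread. Newer Terraform versions expect an object with `source` and `version` instead. Only the constraint strings come from the comment above; the surrounding block is illustrative.

```hcl
terraform {
  required_providers {
    # Upper bound excludes 2.61.0, the release that introduced the
    # partition_number problem discussed in this PR.
    aws = ">= 2.52.0, <= 2.60.0"
  }
}
```

The trade-off with an upper bound like this is that it also blocks any later provider release that fixes the bug, so it would have to be lifted again once a fixed version ships.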

@dpiddockcmp

dpiddockcmp commented May 9, 2020

Contributor Author

We could, but I don't feel that it is the correct solution in this situation. The issue is a provider/SDK bug. By pinning to < 2.61 we stop users from being able to use future fixed versions, and there are other fixes and features in the provider that they would miss.

The remaining issue only impacts users using placement groups. They can either pin the provider to 2.60.0 or pass in placement_group instead of launch_template_placement_group.
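
As a sketch of the second workaround, the failing worker-group-2 from the earlier example could be switched from launch_template_placement_group to placement_group, so the group is attached at the ASG level rather than inside the launch template. The elided arguments (`# ...`) are left as in the original example, and the `source` line is an assumption about how the module is consumed.

```hcl
module "eks" {
  source = "terraform-aws-modules/eks/aws" # assumed registry source
  # ...
  worker_groups_launch_template = [
    {
      name                 = "worker-group-2"
      instance_type        = "m5.large"
      asg_desired_capacity = 1
      public_ip            = false
      # Set on the ASG instead of in the launch template, which keeps
      # partition_number out of the request for a "cluster" strategy group.
      placement_group = aws_placement_group.this.id
    },
  ]
}

resource "aws_placement_group" "this" {
  name     = "testgroup"
  strategy = "cluster"
}
```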

We can have a future release to support the new placement groups with partitions once the provider is fixed. Looks like there is already a PR for the provider issue: hashicorp/terraform-provider-aws#13236
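
For context, this is roughly what partition support looks like at the provider level, written against the aws_launch_template resource directly rather than the module. It is a sketch under the assumption of aws provider 2.61.0 or later; all resource and variable names are hypothetical, and the module did not expose partition_number at the time of this thread.

```hcl
# Hypothetical input for illustration only.
variable "partitioned_ami_id" {
  type = string
}

resource "aws_placement_group" "partitioned" {
  name     = "example-partitioned"
  strategy = "partition"
}

resource "aws_launch_template" "partitioned" {
  name_prefix   = "example-partitioned-"
  image_id      = var.partitioned_ami_id
  instance_type = "m5.large"

  placement {
    group_name = aws_placement_group.partitioned.name
    # Only meaningful when the group uses the "partition" strategy.
    partition_number = 1
  }
}
```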

@barryib

barryib commented May 9, 2020

Member

Agreed in general, but it wouldn't be a permanent pin. I was thinking of pinning the module version temporarily, like we already did with the kubernetes provider.

Furthermore, we should consider bumping the aws provider version, maybe after hashicorp/terraform-provider-aws#13239 is merged?

@barryib

barryib commented May 9, 2020

Member

@dpiddockcmp do you still want to merge this PR?

@kmgreen2

kmgreen2 commented May 9, 2020

Thanks! Worked for me as well!

@dpiddockcmp

Contributor Author

The Kubernetes provider case was slightly different, as it broke default functionality used by probably the majority. At the time it was also the "expected behaviour" according to the provider maintainer.

This PR fixes the issue for probably the majority of launch template users. There's a PR on the provider that fixes the real issue. Hopefully it gets released in the normal cycle on Thursday.

I think we should merge this and get 12.0.0 released.

@barryib barryib changed the title from "fix: Launch Templates error with aws 2.61.0" to "fix: Fix Launch Templates error with aws 2.61.0" May 9, 2020
@barryib barryib merged commit bb822a1 into terraform-aws-modules:master May 9, 2020
@murty0

murty0 commented May 13, 2020

I am experiencing the same issue with aws provider version "~> 2.59" and eks module version "~> 10.0".

We already have a cluster running in production which has no placement group and is working fine. I am in the midst of creating a testing cluster with the above-mentioned provider versions, again with no placement groups defined, but I am getting the following errors:

Error: Error creating AutoScaling Group: ValidationError: You must use a valid fully-formed launch template. You cannot use PartitionNumber with a Placement Group that does not exist. Specify a valid Placement Group and try again.
  status code: 400, request id: 8760f1ce-69ea-4457-83a3-37eee9e11804

  on .terraform/modules/eks_cluster/terraform-aws-eks-10.0.0/workers_launch_template.tf line 3, in resource "aws_autoscaling_group" "workers_launch_template":
   3: resource "aws_autoscaling_group" "workers_launch_template" {

I have already posted this here: hashicorp/terraform-provider-aws#13236

@dpiddockcmp dpiddockcmp deleted the fix/874 branch May 13, 2020 17:17
@github-actions

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions Bot locked as resolved and limited conversation to collaborators Nov 18, 2022

Development

Successfully merging this pull request may close these issues.

ASG Failed scale up after update aws_launch_template with aws provider 2.61.0 (currently latest version)
