fix: MNG cluster datasource errors #1639

Merged
antonbabenko merged 1 commit into terraform-aws-modules:master from stevehipwell:mng-cluster-data-fix
Oct 14, 2021

@stevehipwell

Contributor

PR o'clock

Description

This PR replaces the cluster lookup data source in the node_groups module with variables being passed in.

Apologies for the issues with the current implementation, I'm still unsure as to why it fails when it does. With hindsight and re-reading the existing module code this implementation is a better fit.

Fixes #1635.
Closes #1636.
Closes #1638.
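
For illustration only, the change described above might look roughly like the following at the call site. This is a minimal sketch rather than the PR's actual diff: the `aws_eks_cluster.this` resource name, the `source` path, and the exact expressions are assumptions. Only the `cluster_endpoint` and `cluster_auth_base64` variable names appear in the review discussion.

```hcl
# Sketch with assumed names: the parent module passes the cluster details into
# the node_groups submodule, so the submodule no longer needs its own
# aws_eks_cluster data source lookup.
module "node_groups" {
  source = "./modules/node_groups" # assumed path

  cluster_name = aws_eks_cluster.this.name

  # Both values are read directly from the cluster resource that the parent
  # module already manages.
  cluster_endpoint    = aws_eks_cluster.this.endpoint
  cluster_auth_base64 = aws_eks_cluster.this.certificate_authority[0].data
}
```

Passing the values as inputs means they are resolved from resource attributes in the same plan, instead of depending on a separate lookup of the cluster by name.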

Checklist

@stevehipwell

Contributor Author

/assign @daroga0002

Signed-off-by: Steve Hipwell <steve.hipwell@gmail.com>

@daroga0002 daroga0002 left a comment

Contributor

@stevehipwell I have a few thoughts here about a better design. Would you be able to review them and propose something as per the comments?

variable "cluster_auth_base64" {
  description = "Base64 encoded CA of parent cluster"
  type        = string
  default     = ""

Contributor

We cannot leave this empty, as there is a risk that we will pass empty values to the template, which would generally cause the bootstrap script to crash.

Contributor Author

This will only ever be empty if the sub module is called directly; is this supported? The code is fine even if it's empty, as bootstrap.sh will fall back to fetching this. If we set it then there is no cost at node start (this can add up on large clusters), but if it's not set, every node has to query this information each time it starts.

Contributor

nope, it is not supported (as of now)

Contributor Author

So the cluster_auth_base64 & cluster_endpoint variables will always be set unless create_eks = false, in which case they're not going to be used. This is the same logic as is used for cluster_name which is also optional (and unlike these variables would break bootstrap.sh if not set).

  description = "Endpoint of parent cluster"
  type        = string
  default     = ""
}

Contributor

We cannot leave this empty, as there is a risk that we will pass empty values to the template, which would generally cause the bootstrap script to crash.

@daroga0002 daroga0002 Oct 13, 2021

Contributor

This should probably be a conditional input that is required only when we don't use the EKS-optimized AMI (as bootstrap.sh can download it on its own).

Contributor Author

As above, the code is correctly optimised for the 80+% case where this will be set; if it's not set, bootstrap.sh will fall back to fetching it.

Contributor

I checked bootstrap.sh and it seems able to handle the empty-variable scenario.

B64_CLUSTER_CA=${cluster_auth_base64}
KUBELET_EXTRA_ARGS='--node-labels=eks.amazonaws.com/nodegroup-image=${ami_id},eks.amazonaws.com/capacityType=${capacity_type}${append_labels} ${kubelet_extra_args}'
%{endif ~}

Contributor

If we are using an EKS-optimized AMI then we can just rely on bootstrap.sh and don't need to pass:

--apiserver-endpoint "$${API_SERVER_URL}" --b64-cluster-ca "$${B64_CLUSTER_CA}"

Contributor Author

This is designed to set the variables once for all non-merging userdata. It also replicates the output of the merged userdata for the --apiserver-endpoint & --b64-cluster-ca arguments.

@daroga0002 daroga0002 left a comment

Contributor

I reviewed and tested this. It should solve our current issue, and it should not block anything or cause issues in the future.

@antonbabenko let's merge this and make a release.

@antonbabenko antonbabenko merged commit 7c33554 into terraform-aws-modules:master Oct 14, 2021
@antonbabenko

Member

Thanks @daroga0002 !

v17.22.0 has just been released.

@github-actions


I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions Bot locked as resolved and limited conversation to collaborators Nov 12, 2022

Development

Successfully merging this pull request may close these issues.

Cannot Create a New Cluster w/ 17.21.0

3 participants