terraform-aws-modules · antonbabenko · Apr 3, 2022
diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md
diff --git a/README.md b/README.md
diff --git a/docs/README.md b/docs/README.md
@@ -0,0 +1,12 @@
+# Documentation
+
+## Table of Contents
+
+- [Frequently Asked Questions](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/faq.md)
+- [Compute Resources](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/compute_resources.md)
+- [IRSA Integration](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/irsa-integration.md)
+- [User Data](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/user_data.md)
+- [Network Connectivity](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/network_connectivity.md)
+- Upgrade Guides
+  - [Upgrade to v17.x](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/UPGRADE-17.0.md)
+  - [Upgrade to v18.x](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/UPGRADE-18.0.md)
diff --git a/.github/UPGRADE-17.0.md → docs/UPGRADE-17.0.md b/.github/UPGRADE-17.0.md → docs/UPGRADE-17.0.md
diff --git a/UPGRADE-18.0.md → docs/UPGRADE-18.0.md b/UPGRADE-18.0.md → docs/UPGRADE-18.0.md
@@ -2,6 +2,8 @@
 
 Please consult the `examples` directory for reference example configurations. If you find a bug, please open an issue with supporting configuration to reproduce.
 
+Note: please see https://github.com/terraform-aws-modules/terraform-aws-eks/issues/1744, where users have shared the steps and information for their individual configurations. Because of the numerous configuration possibilities, it is difficult to capture specific steps that will work for everyone, and this issue has been a very helpful place for users to share how they were able to upgrade.
+
 ## List of backwards incompatible changes
 
 - Launch configuration support has been removed and only launch template is supported going forward. AWS is no longer adding new features back into launch configuration and their docs state [`We strongly recommend that you do not use launch configurations. They do not provide full functionality for Amazon EC2 Auto Scaling or Amazon EC2. We provide information about launch configurations for customers who have not yet migrated from launch configurations to launch templates.`](https://docs.aws.amazon.com/autoscaling/ec2/userguide/LaunchConfiguration.html)

diff --git a/docs/compute_resources.md b/docs/compute_resources.md
@@ -0,0 +1,209 @@
+# Compute Resources
+
+## Table of Contents
+
+- [EKS Managed Node Groups](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/compute_resources.md#eks-managed-node-groups)
+- [Self Managed Node Groups](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/compute_resources.md#self-managed-node-groups)
+- [Fargate Profiles](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/compute_resources.md#fargate-profiles)
+- [Default Configurations](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/compute_resources.md#default-configurations)
+
+ℹ️ Only the pertinent attributes are shown below for brevity
+
+### EKS Managed Node Groups
+
+Refer to the [EKS Managed Node Group documentation](https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html) for service related details.
+
+1. The module creates a custom launch template by default to ensure settings such as tags are propagated to instances. To use the default template provided by the AWS EKS managed node group service, disable the launch template creation and set the `launch_template_name` to an empty string:
+
+```hcl
+  eks_managed_node_groups = {
+    default = {
+      create_launch_template = false
+      launch_template_name   = ""
+    }
+  }
+```
+
+2. Native support for Bottlerocket OS is enabled by specifying the respective AMI type:
+
+```hcl
+  eks_managed_node_groups = {
+    bottlerocket_default = {
+      create_launch_template = false
+      launch_template_name   = ""
+
+      ami_type = "BOTTLEROCKET_x86_64"
+      platform = "bottlerocket"
+    }
+  }
+```
+
+3. Users can, to a limited extent, supply additional user data that is prepended to the user data provided by the AWS EKS Managed Node Group service:
+
+```hcl
+  eks_managed_node_groups = {
+    prepend_userdata = {
+      # See issue https://github.com/awslabs/amazon-eks-ami/issues/844
+      pre_bootstrap_user_data = <<-EOT
+      #!/bin/bash
+      set -ex
+      cat <<-EOF > /etc/profile.d/bootstrap.sh
+      export CONTAINER_RUNTIME="containerd"
+      export USE_MAX_PODS=false
+      export KUBELET_EXTRA_ARGS="--max-pods=110"
+      EOF
+      # Source extra environment variables in bootstrap script
+      sed -i '/^set -o errexit/a\\nsource /etc/profile.d/bootstrap.sh' /etc/eks/bootstrap.sh
+      EOT
+    }
+  }
+```
+
+4. Bottlerocket OS is supported in a similar manner. However, note that the user data for Bottlerocket OS uses the TOML format:
+
+```hcl
+  eks_managed_node_groups = {
+    bottlerocket_prepend_userdata = {
+      ami_type = "BOTTLEROCKET_x86_64"
+      platform = "bottlerocket"
+
+      bootstrap_extra_args = <<-EOT
+      # extra args added
+      [settings.kernel]
+      lockdown = "integrity"
+      EOT
+    }
+  }
+```
+
+5. When using a custom AMI, the AWS EKS Managed Node Group service will NOT inject the necessary bootstrap script into the supplied user data. Users can elect to provide their own user data to bootstrap and connect or opt in to use the module provided user data:
+
+```hcl
+  eks_managed_node_groups = {
+    custom_ami = {
+      ami_id = "ami-0caf35bc73450c396"
+
+      # By default, EKS managed node groups will not append bootstrap script;
+      # this adds it back in using the default template provided by the module
+      # Note: this assumes the AMI provided is an EKS optimized AMI derivative
+      enable_bootstrap_user_data = true
+
+      bootstrap_extra_args = "--container-runtime containerd --kubelet-extra-args '--max-pods=20'"
+
+      pre_bootstrap_user_data = <<-EOT
+        export CONTAINER_RUNTIME="containerd"
+        export USE_MAX_PODS=false
+      EOT
+
+      # Because we have full control over the user data supplied, we can also run additional
+      # scripts/configuration changes after the bootstrap script has been run
+      post_bootstrap_user_data = <<-EOT
+        echo "you are free little kubelet!"
+      EOT
+    }
+  }
+```
+
+6. There is similar support for Bottlerocket OS:
+
+```hcl
+  eks_managed_node_groups = {
+    bottlerocket_custom_ami = {
+      ami_id   = "ami-0ff61e0bcfc81dc94"
+      platform = "bottlerocket"
+
+      # use module user data template to bootstrap
+      enable_bootstrap_user_data = true
+      # this will get added to the template
+      bootstrap_extra_args = <<-EOT
+      # extra args added
+      [settings.kernel]
+      lockdown = "integrity"
+
+      [settings.kubernetes.node-labels]
+      "label1" = "foo"
+      "label2" = "bar"
+
+      [settings.kubernetes.node-taints]
+      "dedicated" = "experimental:PreferNoSchedule"
+      "special" = "true:NoSchedule"
+      EOT
+    }
+  }
+```
+
+See the [`examples/eks_managed_node_group/` example](https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/examples/eks_managed_node_group) for a working example of various configurations.
+
+### Self Managed Node Groups
+
+Refer to the [Self Managed Node Group documentation](https://docs.aws.amazon.com/eks/latest/userguide/worker.html) for service related details.
+
+1. The `self-managed-node-group` uses the latest AWS EKS Optimized AMI (Linux) for the given Kubernetes version by default:
+
+```hcl
+  cluster_version = "1.21"
+
+  # This self managed node group will use the latest AWS EKS Optimized AMI for Kubernetes 1.21
+  self_managed_node_groups = {
+    default = {}
+  }
+```
+
+2. To use Bottlerocket, specify the `platform` as `bottlerocket` and supply a Bottlerocket OS AMI:
+
+```hcl
+  cluster_version = "1.21"
+
+  self_managed_node_groups = {
+    bottlerocket = {
+      platform = "bottlerocket"
+      ami_id   = data.aws_ami.bottlerocket_ami.id
+    }
+  }
+```
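+
+The `data.aws_ami.bottlerocket_ami` reference above is not defined in this snippet. A minimal sketch of such a data source follows, assuming the Amazon-owned Bottlerocket AMI naming pattern for Kubernetes 1.21 on x86_64; verify the owner and name filter against the AMIs available in your region:
+
+```hcl
+# Looks up the most recent Bottlerocket AMI published by Amazon.
+# The name pattern is an assumption; adjust the Kubernetes version and
+# architecture to match your cluster.
+data "aws_ami" "bottlerocket_ami" {
+  most_recent = true
+  owners      = ["amazon"]
+
+  filter {
+    name   = "name"
+    values = ["bottlerocket-aws-k8s-1.21-x86_64-*"]
+  }
+}
+```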
+
+See the [`examples/self_managed_node_group/` example](https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/examples/self_managed_node_group) for a working example of various configurations.
+
+### Fargate Profiles
+
+Fargate profiles are straightforward to use and therefore no further details are provided here. See the [`examples/fargate_profile/` example](https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/examples/fargate_profile) for a working example of various configurations.
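+
+For orientation, a minimal sketch of a Fargate profile definition could look like the following. The profile key, the `name`, and the namespaces are illustrative assumptions; the linked example remains the reference for the supported attributes:
+
+```hcl
+  fargate_profiles = {
+    default = {
+      name = "default"
+
+      # Pods created in a namespace matching a selector run on Fargate
+      selectors = [
+        { namespace = "kube-system" },
+        { namespace = "default" }
+      ]
+    }
+  }
+```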
+
+### Default Configurations
+
+Each type of compute resource (EKS managed node group, self managed node group, or Fargate profile) provides the option for users to specify a default configuration. These default configurations can be overridden from within the compute resource's individual definition. The order of precedence for configurations (from highest to lowest precedence):
+
+- Compute resource individual configuration
+  - Compute resource family default configuration (`eks_managed_node_group_defaults`, `self_managed_node_group_defaults`, `fargate_profile_defaults`)
+    - Module default configuration (see `variables.tf` and `node_groups.tf`)
+
+For example, the following creates 4 AWS EKS Managed Node Groups:
+
+```hcl
+  eks_managed_node_group_defaults = {
+    ami_type       = "AL2_x86_64"
+    disk_size      = 50
+    instance_types = ["m6i.large", "m5.large", "m5n.large", "m5zn.large"]
+  }
+
+  eks_managed_node_groups = {
+    # Uses module default configurations overridden by configuration above
+    default = {}
+
+    # This further overrides the instance types used
+    compute = {
+      instance_types = ["c5.large", "c6i.large", "c6d.large"]
+    }
+
+    # This further overrides the instance types and disk size used
+    persistent = {
+      disk_size      = 1024
+      instance_types = ["r5.xlarge", "r6i.xlarge", "r5b.xlarge"]
+    }
+
+    # This overrides the OS used
+    bottlerocket = {
+      ami_type = "BOTTLEROCKET_x86_64"
+      platform = "bottlerocket"
+    }
+  }
+```
diff --git a/docs/faq.md b/docs/faq.md
@@ -0,0 +1,110 @@
+# Frequently Asked Questions
+
+- [How do I manage the `aws-auth` configmap?](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/faq.md#how-do-i-manage-the-aws-auth-configmap)
+- [I received an error: `Error: Invalid for_each argument ...`](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/faq.md#i-received-an-error-error-invalid-for_each-argument-)
+- [Why are nodes not being registered?](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/faq.md#why-are-nodes-not-being-registered)
+- [Why are there no changes when a node group's `desired_size` is modified?](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/faq.md#why-are-there-no-changes-when-a-node-groups-desired_size-is-modified)
+- [How can I deploy Windows based nodes?](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/faq.md#how-can-i-deploy-windows-based-nodes)
+- [How do I access compute resource attributes?](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/faq.md#how-do-i-access-compute-resource-attributes)
+
+### How do I manage the `aws-auth` configmap?
+
+TL;DR - https://github.com/terraform-aws-modules/terraform-aws-eks/issues/1901
+
+- Users can roll their own equivalent of `kubectl patch ...` using the [`null_resource`](https://github.com/terraform-aws-modules/terraform-aws-eks/blob/9a99689cc13147f4afc426b34ba009875a28614e/examples/complete/main.tf#L301-L336)
+- There is a module that was created to fill this gap that provides a Kubernetes based approach to provision: https://github.com/aidanmelen/terraform-aws-eks-auth
+- Ideally, one of the following issues is resolved upstream for a more native experience for users:
+  - https://github.com/aws/containers-roadmap/issues/185
+  - https://github.com/hashicorp/terraform-provider-kubernetes/issues/723
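+
+A rough sketch of the `null_resource` approach from the first bullet, adapted from the linked example, is shown here. It assumes the module is named `eks`, that `kubectl` and `bash` are available where Terraform runs, and that a `local.kubeconfig` value (hypothetical, not defined in this snippet) holds a kubeconfig for the cluster. The resource name `patch_aws_auth` is also illustrative:
+
+```hcl
+resource "null_resource" "patch_aws_auth" {
+  triggers = {
+    # local.kubeconfig is an assumption; supply your own rendered kubeconfig
+    kubeconfig = base64encode(local.kubeconfig)
+    cmd_patch  = "kubectl patch configmap/aws-auth --patch \"${module.eks.aws_auth_configmap_yaml}\" -n kube-system --kubeconfig <(echo $KUBECONFIG | base64 --decode)"
+  }
+
+  provisioner "local-exec" {
+    # bash is required for the process substitution used in cmd_patch
+    interpreter = ["/bin/bash", "-c"]
+    environment = {
+      KUBECONFIG = self.triggers.kubeconfig
+    }
+    command = self.triggers.cmd_patch
+  }
+}
+```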
+
+### I received an error: `Error: Invalid for_each argument ...`
+
+Users may encounter an error such as `Error: Invalid for_each argument - The "for_each" value depends on resource attributes that cannot be determined until apply, so Terraform cannot predict how many instances will be created. To work around this, use the -target argument to first apply ...`
+
+This error is due to an upstream issue with [Terraform core](https://github.com/hashicorp/terraform/issues/4149). There are two potential options you can take to help mitigate this issue:
+
+1. Create the dependent resources before the cluster: run `terraform apply -target <your policy or your security group>` and then `terraform apply` for the cluster (or use any similar means to ensure the referenced resources exist before the cluster is created)
+
+- Note: this is the route users will have to take for adding additional security groups to nodes since there isn't a separate "security group attachment" resource
+
+2. For additional IAM policies, users can attach the policies outside of the cluster definition as demonstrated below
+
+```hcl
+resource "aws_iam_role_policy_attachment" "additional" {
+  for_each = module.eks.eks_managed_node_groups
+  # you could also do the following or any combination:
+  # for_each = merge(
+  #   module.eks.eks_managed_node_groups,
+  #   module.eks.self_managed_node_groups,
+  #   module.eks.fargate_profiles,
+  # )
+
+  # This policy does not have to exist at the time of cluster creation. Terraform can
+  # deduce the proper order of its creation to avoid errors during creation
+  policy_arn = aws_iam_policy.node_additional.arn
+  role       = each.value.iam_role_name
+}
+```
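+
+The `aws_iam_policy.node_additional` resource referenced above is not shown. A minimal sketch of what it might look like follows; the policy name and the permissions granted are placeholders chosen for illustration, not a recommendation:
+
+```hcl
+# Hypothetical additional policy attached to each node group role above
+resource "aws_iam_policy" "node_additional" {
+  name        = "example-node-additional"
+  description = "Example additional policy for node IAM roles"
+
+  policy = jsonencode({
+    Version = "2012-10-17"
+    Statement = [
+      {
+        Action   = ["ec2:Describe*"]
+        Effect   = "Allow"
+        Resource = "*"
+      },
+    ]
+  })
+}
+```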
+
+TL;DR - Terraform resources passed into the module's map definitions _must_ be known before you can apply the EKS module. The variables this potentially affects are:
+
+- `cluster_security_group_additional_rules` (e.g. referencing an external security group resource in a rule)
+- `node_security_group_additional_rules` (e.g. referencing an external security group resource in a rule)
+- `iam_role_additional_policies` (e.g. referencing an external policy resource)
+
+- Setting `instance_refresh_enabled = true` will recreate your worker nodes without draining them first. It is recommended to install [aws-node-termination-handler](https://github.com/aws/aws-node-termination-handler) for proper node draining. See the [instance_refresh](https://github.com/terraform-aws-modules/terraform-aws-eks/tree/master/examples/irsa_autoscale_refresh) example provided.
+
+### Why are nodes not being registered?
+
+Nodes failing to register with the EKS control plane is generally due to networking misconfigurations.
+
+1. At least one of the cluster endpoints (public or private) must be enabled.
+
+If you require a public endpoint, setting up both (public and private) and restricting the public endpoint via setting `cluster_endpoint_public_access_cidrs` is recommended. More info regarding communication with an endpoint is available [here](https://docs.aws.amazon.com/eks/latest/userguide/cluster-endpoint.html).
+
+2. Nodes need to be able to contact the EKS cluster endpoint. By default, the module only creates a public endpoint. To access the endpoint, the nodes need outgoing internet access:
+
+- Nodes in private subnets: via a NAT gateway or instance along with the appropriate routing rules
+- Nodes in public subnets: ensure that nodes are launched with public IPs (enable through either the module here or your subnet setting defaults)
+
+**Important: If you enable only the public endpoint and configure `cluster_endpoint_public_access_cidrs` to restrict access, be aware that EKS nodes also use the public endpoint, so you must allow them access to it. Otherwise, your nodes will fail to work correctly.**
+
+3. The private endpoint can also be enabled by setting `cluster_endpoint_private_access = true`. Ensure that VPC DNS resolution and hostnames are also enabled for your VPC when the private endpoint is enabled.
+
+4. Nodes need to be able to connect to other AWS services to function (download container images, make API calls to assume roles, etc.). If for some reason you cannot enable public internet access for nodes you can add VPC endpoints to the relevant services: EC2 API, ECR API, ECR DKR and S3.
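+
+As an illustration of points 1 to 3, the following sketch enables both endpoints and restricts the public one. The CIDR block is a documentation-range placeholder and the other module arguments are omitted; `cluster_endpoint_public_access` is not mentioned above and is included on the assumption that you want the public endpoint to remain enabled:
+
+```hcl
+module "eks" {
+  source = "terraform-aws-modules/eks/aws"
+
+  # ... other required arguments omitted ...
+
+  cluster_endpoint_private_access = true
+  cluster_endpoint_public_access  = true
+
+  # Placeholder range only; replace with the CIDR blocks that need public access
+  cluster_endpoint_public_access_cidrs = ["203.0.113.0/24"]
+}
+```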
+
+### Why are there no changes when a node group's `desired_size` is modified?
+
+The module is configured to ignore this value. Unfortunately, Terraform does not support variables within the `lifecycle` block. The setting is ignored to allow autoscaling via controllers such as cluster autoscaler or Karpenter to work properly and without interference by Terraform. Changing the desired count must be handled outside of Terraform once the node group is created.
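+
+The mechanism is roughly the following, shown as a simplified sketch of a node group resource rather than the module's exact source (the resource label `this` and the elided arguments are assumptions). Because `ignore_changes` only accepts static attribute references, it cannot be switched on or off with a variable:
+
+```hcl
+resource "aws_eks_node_group" "this" {
+  # ... cluster_name, node_role_arn, subnet_ids, scaling_config, etc. ...
+
+  lifecycle {
+    # Terraform does not reconcile changes to desired_size after creation
+    ignore_changes = [
+      scaling_config[0].desired_size,
+    ]
+  }
+}
+```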
+
+### How can I deploy Windows based nodes?
+
+To enable Windows support for your EKS cluster, you will need to apply some configuration manually. See the [Enabling Windows Support (Windows/MacOS/Linux)](https://docs.aws.amazon.com/eks/latest/userguide/windows-support.html#enable-windows-support) documentation.
+
+In addition, Windows based nodes require an additional cluster RBAC role (`eks:kube-proxy-windows`).
+
+Note: Windows based node support is limited to a default user data template provided by the module, owing to the limited Windows support and the manual steps required to provision Windows based EKS nodes.
+
+### How do I access compute resource attributes?
+
+Examples of accessing the attributes of the compute resource(s) created by the root module are shown below. Note - the assumption is that your cluster module definition is named `eks` as in `module "eks" { ... }`:
+
+- EKS Managed Node Group attributes
+
+```hcl
+eks_managed_role_arns = [for group in module.eks.eks_managed_node_groups : group.iam_role_arn]
+```
+
+- Self Managed Node Group attributes
+
+```hcl
+self_managed_role_arns = [for group in module.eks.self_managed_node_groups : group.iam_role_arn]
+```
+
+- Fargate Profile attributes
+
+```hcl
+fargate_profile_pod_execution_role_arns = [for group in module.eks.fargate_profiles : group.fargate_profile_pod_execution_role_arn]
+```
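+
+Expressions like these can be placed anywhere a value is accepted. As one possible usage, the sketch below exposes the managed node group role ARNs as an output from the configuration that calls the module. It assumes the module is named `eks` and that `module.eks.eks_managed_node_groups` is a map of node group attributes, as in the `for_each` example earlier in this FAQ; the output name is illustrative:
+
+```hcl
+output "eks_managed_node_group_role_arns" {
+  description = "IAM role ARNs of the EKS managed node groups"
+  value       = [for group in module.eks.eks_managed_node_groups : group.iam_role_arn]
+}
+```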