20 changes: 13 additions & 7 deletions README.md
@@ -30,16 +30,22 @@ Easily manage Amazon EKS clusters and workloads with CloudPilot AI's automation

## Example Usage

See the [`examples/`](examples/) directory for real-world configurations:
See the [`examples/`](examples/) directory for real-world configurations.

**All examples default to basic installation without enabling Node Autoscaler optimization.** Optimization options (`only_install_agent`, `enable_rebalance`) are defined as variables — modify them in `terraform.tfvars` and re-apply to enable optimization when ready.

| Example | Description | Use Case |
|---------|-------------|----------|
| [`0_details`](examples/nodeautoscale/eks/0_details/) | Full-featured EKS cluster with all options | Production setup with workload templates, nodeclasses, and complete configuration |
| [`1_read-only_access`](examples/nodeautoscale/eks/1_read-only_access/) | Agent-only installation | Testing or monitoring without optimization changes |
| [`2_basic_rebalance`](examples/nodeautoscale/eks/2_basic_rebalance/) | Basic rebalance enabled | Simple cost optimization with workload rebalancing |
| [`3_nodeclass_nodepool_rebalance`](examples/nodeautoscale/eks/3_nodeclass_nodepool_rebalance/) | Custom nodeclass/nodepool | Advanced node management with custom configurations |

Each example folder contains a `main.tf` and a dedicated README with usage instructions.
| [`0_details`](examples/nodeautoscale/eks/0_details/) | Full-featured reference with all options | Production setup with workload templates, nodeclasses, nodepools, Workload Autoscaler, and data sources |
| [`1_read-only_access`](examples/nodeautoscale/eks/1_read-only_access/) | Minimal agent-only installation | Testing or monitoring without any optimization changes |
| [`2_basic_rebalance`](examples/nodeautoscale/eks/2_basic_rebalance/) | Basic rebalance configuration | Simple cost optimization with rebalancing |
| [`3_nodeclass_nodepool_rebalance`](examples/nodeautoscale/eks/3_nodeclass_nodepool_rebalance/) | Custom nodeclass and nodepool | Advanced node management with instance filtering and disruption controls |

Each example folder contains:
- `main.tf` — resource definitions
- `variables.tf` — variable declarations (including optimization toggles)
- `terraform.tfvars.example` — sample variable values (copy to `terraform.tfvars` to use)
- `README.md` — usage instructions with a two-step workflow (install → enable optimization)
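
As a rough illustration of how the optimization toggles might be wired, the sketch below declares the two variables named above and passes them to the cluster resource. The default values, descriptions, and the resource label `example` are assumptions for illustration; the actual `variables.tf` in each example folder may differ.

```terraform
# variables.tf (sketch): optimization toggles.
# Defaults are assumed so that a first apply performs a basic installation only.
variable "only_install_agent" {
  description = "Install only the CloudPilot AI agent, without rebalance components."
  type        = bool
  default     = true
}

variable "enable_rebalance" {
  description = "Enable automatic workload rebalancing."
  type        = bool
  default     = false
}

# main.tf (sketch): the toggles feed the cluster resource.
resource "cloudpilotai_eks_cluster" "example" {
  cluster_name        = "my-eks-cluster"
  region              = "us-west-2"
  restore_node_number = 0

  only_install_agent = var.only_install_agent
  enable_rebalance   = var.enable_rebalance
}
```

Under these assumptions, enabling optimization later means setting `only_install_agent = false` and `enable_rebalance = true` in `terraform.tfvars` and running `terraform apply` again.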

---

46 changes: 46 additions & 0 deletions docs/data-sources/eks_cluster.md
@@ -0,0 +1,46 @@
---
page_title: "cloudpilotai_eks_cluster Data Source - cloudpilotai"
subcategory: "Node Autoscale"
description: |-
  Retrieves information about an existing EKS cluster registered with CloudPilot AI.
---

# cloudpilotai_eks_cluster (Data Source)

Retrieves read-only information about an EKS cluster that is already registered with CloudPilot AI. Use this data source to query cluster status and agent information without making any changes.

## Example Usage

```terraform
data "cloudpilotai_eks_cluster" "production" {
  cluster_name = "production-cluster"
  region       = "us-west-2"
}

output "cluster_status" {
  value = data.cloudpilotai_eks_cluster.production.status
}

output "agent_version" {
  value = data.cloudpilotai_eks_cluster.production.agent_version
}
```

## Schema

### Required

- `cluster_name` (String) — Name of the EKS cluster.
- `region` (String) — AWS region where the EKS cluster is located.

### Optional

- `account_id` (String) — AWS account ID. If not provided, it is auto-detected from the current AWS CLI credentials.

### Read-Only

- `cluster_id` (String) — CloudPilot AI cluster identifier.
- `cloud_provider` (String) — Cloud provider (e.g. `aws`).
- `status` (String) — Current cluster status: `online`, `offline`, or `demo`.
- `agent_version` (String) — Version of the CloudPilot AI agent installed.
- `rebalance_enable` (Boolean) — Whether rebalancing is enabled.
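
The read-only attributes can be combined into derived values. The following is a minimal sketch that sets the optional `account_id` explicitly and summarizes the cluster state in one output; the account ID is a placeholder, and the local name `cluster_online` and output name `cluster_summary` are hypothetical.

```terraform
data "cloudpilotai_eks_cluster" "production" {
  cluster_name = "production-cluster"
  region       = "us-west-2"
  account_id   = "123456789012" # placeholder; omit to auto-detect from AWS CLI credentials
}

locals {
  # True only when the cluster reports the `online` status.
  cluster_online = data.cloudpilotai_eks_cluster.production.status == "online"
}

output "cluster_summary" {
  value = {
    cluster_id       = data.cloudpilotai_eks_cluster.production.cluster_id
    cloud_provider   = data.cloudpilotai_eks_cluster.production.cloud_provider
    online           = local.cluster_online
    rebalance_enable = data.cloudpilotai_eks_cluster.production.rebalance_enable
  }
}
```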
37 changes: 37 additions & 0 deletions docs/data-sources/workload_autoscaler.md
@@ -0,0 +1,37 @@
---
page_title: "cloudpilotai_workload_autoscaler Data Source - cloudpilotai"
subcategory: "Workload Autoscaler"
description: |-
  Retrieves the Workload Autoscaler configuration for a given cluster.
---

# cloudpilotai_workload_autoscaler (Data Source)

Retrieves read-only information about the Workload Autoscaler configuration on a cluster registered with CloudPilot AI. Use this data source to check whether the autoscaler is enabled and installed without making any changes.

## Example Usage

```terraform
data "cloudpilotai_workload_autoscaler" "current" {
  cluster_id = cloudpilotai_eks_cluster.my_cluster.cluster_id
}

output "wa_enabled" {
  value = data.cloudpilotai_workload_autoscaler.current.enabled
}

output "wa_installed" {
  value = data.cloudpilotai_workload_autoscaler.current.installed
}
```

## Schema

### Required

- `cluster_id` (String) — The CloudPilot AI cluster ID.

### Read-Only

- `enabled` (Boolean) — Whether the Workload Autoscaler is enabled on this cluster.
- `installed` (Boolean) — Whether the Workload Autoscaler is installed on this cluster.
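
When the cluster is not managed in the same configuration, the `cluster_id` can plausibly be taken from the `cloudpilotai_eks_cluster` data source instead of a resource. The sketch below chains the two data sources and derives a single readiness flag; the names `wa_ready` and `production` are hypothetical.

```terraform
# Look up the registered cluster to obtain its CloudPilot AI cluster ID.
data "cloudpilotai_eks_cluster" "production" {
  cluster_name = "production-cluster"
  region       = "us-west-2"
}

data "cloudpilotai_workload_autoscaler" "current" {
  cluster_id = data.cloudpilotai_eks_cluster.production.cluster_id
}

locals {
  # The autoscaler is treated as ready only when it is both installed and enabled.
  wa_ready = (
    data.cloudpilotai_workload_autoscaler.current.installed &&
    data.cloudpilotai_workload_autoscaler.current.enabled
  )
}

output "wa_ready" {
  value = local.wa_ready
}
```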
49 changes: 49 additions & 0 deletions docs/index.md
@@ -0,0 +1,49 @@
---
page_title: "CloudPilot AI Provider"
subcategory: ""
description: |-
  The CloudPilot AI provider enables Terraform to manage EKS clusters and workload autoscaling with CloudPilot AI's cost optimization platform.
---

# CloudPilot AI Provider

The CloudPilot AI provider enables you to manage Amazon EKS clusters and workloads through [CloudPilot AI](https://cloudpilot.ai/)'s automation and cost optimization platform.

## Features

- Provision and manage EKS clusters with CloudPilot AI integration
- Automated agent and rebalance component installation
- Node pool and node class management (including custom Karpenter JSON)
- Workload cost optimization (rebalance, spot-friendly, min non-spot replicas)
- Workload Autoscaler with recommendation and autoscaling policies
- Read-only data sources for querying existing cluster and autoscaler state

## Prerequisites

- [Terraform](https://developer.hashicorp.com/terraform/install) >= 1.0
- [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html) configured with EKS permissions
- [kubectl](https://kubernetes.io/docs/tasks/tools/) for cluster operations
- A CloudPilot AI API key — see [Getting API Keys](https://docs.cloudpilot.ai/guide/getting_started/get_apikeys)

## Example Usage

```terraform
provider "cloudpilotai" {
  api_key = var.cloudpilotai_api_key
}
```

## Authentication

The provider requires a CloudPilot AI API key. You can supply it in two ways:

- **`api_key`** — Pass the key directly (use a Terraform variable to avoid hardcoding).
- **`api_key_profile`** — Path to a file containing the API key.
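
A minimal sketch of both options follows. It declares the variable referenced by `api_key` as sensitive so Terraform redacts it in plan output; the file path shown for `api_key_profile` is a placeholder, and the expected file format is not described here.

```terraform
variable "cloudpilotai_api_key" {
  description = "CloudPilot AI API key."
  type        = string
  sensitive   = true
}

provider "cloudpilotai" {
  # Option 1: pass the key through a sensitive variable.
  api_key = var.cloudpilotai_api_key

  # Option 2 (alternative): point the provider at a file containing the key.
  # The path is a placeholder.
  # api_key_profile = "/path/to/cloudpilotai-api-key"
}
```

With the first option, the value can be supplied without writing it to disk by exporting the standard Terraform environment variable `TF_VAR_cloudpilotai_api_key` before running `terraform plan` or `terraform apply`.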

## Schema

### Optional

- `api_key` (String, Sensitive) — API key for the CloudPilot AI API.
- `api_key_profile` (String) — Path to a file containing the API key.
- `api_endpoint` (String) — Custom API endpoint. Defaults to `https://api.cloudpilot.ai`.
92 changes: 92 additions & 0 deletions docs/resources/eks_cluster.md
@@ -0,0 +1,92 @@
---
page_title: "cloudpilotai_eks_cluster Resource - cloudpilotai"
subcategory: "Node Autoscale"
description: |-
  Manages an EKS cluster with CloudPilot AI agent, rebalance components, and node configuration.
---

# cloudpilotai_eks_cluster (Resource)

Manages an EKS cluster registered with CloudPilot AI. This resource handles the full lifecycle: installing the CloudPilot AI agent, configuring rebalance settings, and managing node pools and node classes.

## Example Usage

### Read-Only Agent Installation

```terraform
resource "cloudpilotai_eks_cluster" "readonly" {
  cluster_name        = "my-eks-cluster"
  region              = "us-west-2"
  restore_node_number = 0
  only_install_agent  = true
}
```

### Basic Rebalance

```terraform
resource "cloudpilotai_eks_cluster" "rebalance" {
  cluster_name        = "my-eks-cluster"
  region              = "us-west-2"
  restore_node_number = 3
  enable_rebalance    = true
}
```

### With Node Classes and Node Pools

```terraform
resource "cloudpilotai_eks_cluster" "full" {
  cluster_name        = "my-eks-cluster"
  region              = "us-west-2"
  restore_node_number = 3
  enable_rebalance    = true

  nodeclasses {
    name                 = "default"
    system_disk_size_gib = 30
  }

  nodepools {
    name          = "default"
    nodeclass     = "default"
    enable        = true
    capacity_type = ["spot", "on-demand"]
    instance_arch = ["amd64"]
  }
}
```

## Schema

### Required

- `cluster_name` (String) — Name of the EKS cluster to be managed.
- `region` (String) — AWS region where the EKS cluster is located.
- `restore_node_number` (Number) — Number of nodes to restore when deleting the cluster resource. Set to 0 if no nodes need restoring.

### Optional

- `kubeconfig` (String) — Path to the kubeconfig file. If not provided, the provider generates one using AWS CLI.
- `account_id` (String) — AWS account ID. Auto-detected from AWS CLI if not set.
- `disable_workload_uploading` (Boolean) — Disable uploading workload information. Default: `false`.
- `only_install_agent` (Boolean) — Only install the agent without rebalance. Default: `false`.
- `enable_upgrade_agent` (Boolean) — Upgrade the agent on next apply. Default: `false`.
- `enable_upgrade_rebalance_component` (Boolean) — Upgrade the rebalance component. Default: `false`.
- `enable_rebalance` (Boolean) — Enable automatic workload rebalancing. Default: `false`.
- `enable_upload_config` (Boolean) — Upload nodepool/nodeclass config to CloudPilot AI. Default: `true`.
- `enable_diversity_instance_type` (Boolean) — Enable diverse instance types. Default: `false`.
- `workload_templates` (List of Object) — Workload template configurations.
- `workloads` (List of Object) — Workload rebalance configurations.
- `nodeclass_templates` (List of Object) — NodeClass template configurations.
- `nodeclasses` (List of Object) — NodeClass configurations.
- `nodepool_templates` (List of Object) — NodePool template configurations.
- `nodepools` (List of Object) — NodePool configurations.

### Read-Only

- `cluster_id` (String) — Unique identifier of the cluster (computed).
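
The optional arguments can be set explicitly when auto-detection is not desired. The sketch below pins the kubeconfig path and account ID, disables workload uploading, and exports the computed `cluster_id`; the path, the account ID, and the resource label `explicit` are placeholders chosen for illustration.

```terraform
resource "cloudpilotai_eks_cluster" "explicit" {
  cluster_name        = "my-eks-cluster"
  region              = "us-west-2"
  restore_node_number = 0

  # Use an existing kubeconfig instead of letting the provider generate one.
  kubeconfig = pathexpand("~/.kube/config")

  # Placeholder account ID; omit to auto-detect from AWS CLI.
  account_id = "123456789012"

  only_install_agent         = true
  disable_workload_uploading = true
}

# Expose the computed identifier for use by other resources or configurations.
output "cloudpilotai_cluster_id" {
  value = cloudpilotai_eks_cluster.explicit.cluster_id
}
```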

## Import

This resource does not support import.
134 changes: 134 additions & 0 deletions docs/resources/workload_autoscaler.md
@@ -0,0 +1,134 @@
---
page_title: "cloudpilotai_workload_autoscaler Resource - cloudpilotai"
subcategory: "Workload Autoscaler"
description: |-
  Manages the CloudPilot AI Workload Autoscaler with recommendation and autoscaling policies.
---

# cloudpilotai_workload_autoscaler (Resource)

Manages the CloudPilot AI Workload Autoscaler on a Kubernetes cluster. This resource installs the autoscaler components, configures recommendation policies, and sets up autoscaling policies for workload right-sizing.

## Example Usage

```terraform
resource "cloudpilotai_workload_autoscaler" "example" {
  cluster_id = cloudpilotai_eks_cluster.my_cluster.cluster_id
  kubeconfig = "/path/to/kubeconfig"

  recommendation_policies {
    name                  = "default-rp"
    strategy_type         = "percentile"
    percentile_cpu        = 95
    percentile_memory     = 95
    history_window_cpu    = "168h"
    history_window_memory = "168h"
    evaluation_period     = "1h"
  }

  autoscaling_policies {
    name                       = "default-ap"
    enable                     = true
    recommendation_policy_name = "default-rp"

    target_refs {
      api_version = "apps/v1"
      kind        = "Deployment"
    }

    update_schedules {
      name = "default"
      mode = "inplace"
    }
  }

  enable_proactive = [
    {
      namespaces = ["my-namespace"]
    }
  ]

  disable_proactive = [
    {
      namespaces = ["kube-system"]
    }
  ]
}
```

## Schema

### Required

- `cluster_id` (String) — The CloudPilot AI cluster ID to deploy Workload Autoscaler on.
- `kubeconfig` (String) — Path to the kubeconfig file for the target Kubernetes cluster.

### Optional

- `storage_class` (String) — StorageClass name for VictoriaMetrics persistent volume. Default: cluster default.
- `enable_node_agent` (Boolean) — Enable the Node Agent DaemonSet for per-node metrics. Default: `true`.
- `recommendation_policies` (List of Object) — List of recommendation policies. See [Recommendation Policy](#recommendation-policy) below.
- `autoscaling_policies` (List of Object) — List of autoscaling policies. See [Autoscaling Policy](#autoscaling-policy) below.
- `enable_proactive` (List of Object) — Workload filters to enable proactive optimization. See [Proactive Filter](#proactive-filter) below.
- `disable_proactive` (List of Object) — Workload filters to disable proactive optimization. See [Proactive Filter](#proactive-filter) below.

### Recommendation Policy

Each recommendation policy supports:

| Attribute | Type | Required | Description |
|-----------|------|----------|-------------|
| `name` | String | Yes | Policy name |
| `strategy_type` | String | No | Strategy type (`percentile`). Default: `percentile` |
| `percentile_cpu` | Number | No | CPU percentile (50-100). Default: `95` |
| `percentile_memory` | Number | No | Memory percentile (50-100). Default: `95` |
| `history_window_cpu` | String | Yes | CPU history window duration (e.g. `168h`) |
| `history_window_memory` | String | Yes | Memory history window duration |
| `evaluation_period` | String | Yes | Evaluation period duration (e.g. `1h`) |
| `buffer_cpu` | String | No | CPU buffer (e.g. `10%` or `100m`) |
| `buffer_memory` | String | No | Memory buffer (e.g. `10%` or `128Mi`) |
| `request_min_cpu` | String | No | Minimum CPU request recommendation |
| `request_min_memory` | String | No | Minimum memory request recommendation |
| `request_max_cpu` | String | No | Maximum CPU request recommendation |
| `request_max_memory` | String | No | Maximum memory request recommendation |
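
As an illustration of the optional attributes in the table above, the sketch below defines a more conservative policy with buffers and request bounds. The `request_min_*` and `request_max_*` values assume Kubernetes-style quantities, which the table does not confirm; the variable `cluster_id`, the policy name, and all numeric values are hypothetical.

```terraform
variable "cluster_id" {
  description = "CloudPilot AI cluster ID (hypothetical input for this sketch)."
  type        = string
}

resource "cloudpilotai_workload_autoscaler" "buffered" {
  cluster_id = var.cluster_id
  kubeconfig = "/path/to/kubeconfig"

  recommendation_policies {
    name                  = "conservative-rp"
    percentile_cpu        = 99
    percentile_memory     = 99
    history_window_cpu    = "168h"
    history_window_memory = "168h"
    evaluation_period     = "1h"

    # Headroom added on top of the percentile-based recommendation.
    buffer_cpu    = "10%"
    buffer_memory = "128Mi"

    # Bounds on the recommended requests (quantity format is an assumption).
    request_min_cpu    = "50m"
    request_min_memory = "64Mi"
    request_max_cpu    = "2"
    request_max_memory = "4Gi"
  }
}
```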

### Autoscaling Policy

Each autoscaling policy supports:

| Attribute | Type | Required | Description |
|-----------|------|----------|-------------|
| `name` | String | Yes | Policy name |
| `enable` | Boolean | No | Whether enabled. Default: `true` |
| `recommendation_policy_name` | String | Yes | Associated recommendation policy |
| `priority` | Number | No | Priority (higher wins). Default: `0` |
| `update_resources` | List(String) | No | Resources to optimize (e.g. `["cpu", "memory"]`) |
| `drift_threshold_cpu` | String | No | CPU drift threshold |
| `drift_threshold_memory` | String | No | Memory drift threshold |
| `on_policy_removal` | String | No | Behavior on removal: `off`, `recreate`, `inplace`. Default: `off` |
| `target_refs` | List(Object) | No | Target workload references |
| `update_schedules` | List(Object) | No | Update schedule items |
| `limit_policies` | List(Object) | No | Per-resource limit policies |
| `startup_boost_enabled` | Boolean | No | Enable startup resource boost. Default: `false` |
| `in_place_fallback_default_policy` | String | No | Fallback policy: `recreate` or `hold` |
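
The following sketch shows how several of the optional autoscaling attributes might fit together in one policy. It reuses the nested `target_refs` and `update_schedules` attributes from the example above; the variable `cluster_id`, the policy names, and the priority value are hypothetical, and the drift thresholds are left unset because their value format is not given here.

```terraform
variable "cluster_id" {
  description = "CloudPilot AI cluster ID (hypothetical input for this sketch)."
  type        = string
}

resource "cloudpilotai_workload_autoscaler" "tuned" {
  cluster_id = var.cluster_id
  kubeconfig = "/path/to/kubeconfig"

  recommendation_policies {
    name                  = "default-rp"
    history_window_cpu    = "168h"
    history_window_memory = "168h"
    evaluation_period     = "1h"
  }

  autoscaling_policies {
    name                             = "api-ap"
    recommendation_policy_name       = "default-rp"
    priority                         = 10 # higher wins over the default of 0
    update_resources                 = ["cpu", "memory"]
    on_policy_removal                = "off"
    in_place_fallback_default_policy = "hold"

    target_refs {
      api_version = "apps/v1"
      kind        = "Deployment"
    }

    update_schedules {
      name = "default"
      mode = "inplace"
    }
  }
}
```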

### Proactive Filter

Each `enable_proactive` and `disable_proactive` entry supports the same set of filter attributes:

| Attribute | Type | Required | Description |
|-----------|------|----------|-------------|
| `workload_name` | String | No | Filter by workload name (substring match) |
| `namespaces` | List(String) | No | Namespaces to filter workloads |
| `workload_kinds` | List(String) | No | Workload kinds (e.g. `Deployment`, `StatefulSet`) |
| `autoscaling_policy_names` | List(String) | No | Filter by autoscaling policy names |
| `workload_state` | String | No | Filter by workload state |
| `optimization_states` | List(String) | No | Filter by optimization states |
| `disable_proactive_update` | Boolean | No | Filter by whether proactive update is disabled |
| `recommendation_policy_names` | List(String) | No | Filter by recommendation policy names |
| `runtime_languages` | List(String) | No | Filter by container runtime languages |
| `optimized` | Boolean | No | Filter by whether the workload is optimized |
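
A filter entry can set more than one attribute. The sketch below enables proactive optimization for selected namespaces and workload kinds and disables it for two separate filters. How attributes within a single entry combine is not stated in the table, so treat the matching behavior as an assumption; the namespaces, the workload name, and the variable `cluster_id` are hypothetical.

```terraform
variable "cluster_id" {
  description = "CloudPilot AI cluster ID (hypothetical input for this sketch)."
  type        = string
}

resource "cloudpilotai_workload_autoscaler" "filtered" {
  cluster_id = var.cluster_id
  kubeconfig = "/path/to/kubeconfig"

  enable_proactive = [
    {
      # Target Deployments and StatefulSets in two application namespaces.
      namespaces     = ["payments", "checkout"]
      workload_kinds = ["Deployment", "StatefulSet"]
    }
  ]

  disable_proactive = [
    {
      # Substring match on the workload name.
      workload_name = "migration"
    },
    {
      namespaces = ["kube-system"]
    }
  ]
}
```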

## Import

This resource does not support import.