---
# Copyright Vespa.ai. All rights reserved.
title: Configure Local Storage Type
applies_to: enterprise
---

<p>
  We recommend configuring node-local storage for the <a href="https://docs.vespa.ai/en/content/proton.html">content cluster</a> (i.e. the search core) to maximize
  performance by avoiding network I/O on the data path. In a standard Vespa deployment, this is controlled through
  the <code>storage-type</code> attribute under the <a href="https://docs.vespa.ai/en/reference/applications/services/services.html#resources">resources</a> tag in the <a href="https://docs.vespa.ai/en/basics/applications.html">application package</a>.
  However, that attribute has no effect when running Vespa on Kubernetes. Instead, local storage should be configured through the <code>spec.application.storageClass</code> field in the
  <code>VespaSet</code>. Vespa on Kubernetes abstracts away the concept of storage and will
  consume whatever is provided by the referenced storage class.
</p>
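
<p>
  For reference, in a standard (non-Kubernetes) deployment the same intent is expressed in
  <code>services.xml</code>. A minimal sketch, in which the cluster id, node count, and resource sizes
  are illustrative:
</p>

<pre>
&lt;content id="music" version="1.0"&gt;
  &lt;nodes count="2"&gt;
    &lt;!-- storage-type="local" requests node-local disks;
         this attribute is ignored when running on Kubernetes --&gt;
    &lt;resources vcpu="4" memory="16Gb" disk="300Gb" storage-type="local"/&gt;
  &lt;/nodes&gt;
&lt;/content&gt;
</pre>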

<p>
  For ConfigServer pods, storage performance is less critical; therefore, selecting a more cost-efficient network-attached storage class, such as <code>gp3</code> EBS volumes on Amazon EKS, is generally an appropriate tradeoff.
</p>
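
<p>
  If the cluster does not already define a <code>gp3</code> class, one can be created through the
  AWS EBS CSI driver. A minimal sketch, assuming the EBS CSI driver is installed in the cluster
  (the class name matches the <code>VespaSet</code> example later in this guide):
</p>

<pre>
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
</pre>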

<p>
  To provision node-local storage, we recommend using Kubernetes <a href="https://kubernetes.io/blog/2019/04/04/kubernetes-1.14-local-persistent-volumes-ga/">Local Persistent Volumes</a>. These volumes expose
  <code>NodeAffinity</code> constraints to the Kubernetes scheduler, ensuring that Pods consuming them are scheduled
  onto nodes where the underlying storage is available. This avoids the need to manually manage <code>NodeAffinity</code> rules on a per-Pod basis.
</p>
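
<p>
  For illustration, a local <code>PersistentVolume</code> of the kind the provisioner creates looks
  roughly like the following sketch (the name, capacity, mount path, and hostname are illustrative):
</p>

<pre>
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-example
spec:
  capacity:
    storage: 216Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: local-nvme
  volumeMode: Filesystem
  local:
    path: /mnt/disks/example-uuid
  # Ties the volume to the node holding the physical disk; the scheduler
  # uses this to place consuming Pods on that node.
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - ip-10-0-1-23.ec2.internal
</pre>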

<p>
  In addition, the Kubernetes Special Interest Groups (SIGs) provide an external
  <a href="https://kubernetes.io/blog/2019/04/04/kubernetes-1.14-local-persistent-volumes-ga/">Local Persistent Volume</a>
  static provisioner. This provisioner automatically discovers local disks mounted on each node and creates corresponding
  <code>PersistentVolumes</code>, while managing their lifecycle, including cleanup and reuse as Pods are deleted. We recommend using this
  component in production deployments.
</p>

<p>
  This guide walks through setting up local NVMe instance storage on EKS nodes using the <a href="https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner">Kubernetes Local Volume Static Provisioner</a>.
  This exposes the physical NVMe disks available on instances as a <code>local-nvme</code> StorageClass that Application Pods can claim.
  While this guide specifically targets an Amazon EKS setup, the concept is similar across different environments; refer to the <a href="https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner/tree/master/helm/examples">project</a> for several other examples.
</p>

<h2>
  Set Up Local Storage on Amazon EKS
</h2>

<p>
  This guide assumes that your EKS cluster has a Node Group configured with an instance type that supports local NVMe instance storage,
  such as <code>m7gd.xlarge</code>. These instance types typically carry a <code>d</code> in the family name, designating them as equipped with local instance storage.
  Refer to the <a href="https://docs.aws.amazon.com/eks/latest/userguide/managed-node-groups.html">AWS EKS Node Groups</a> documentation for further information on configuring Node Groups.
</p>

<p>
  This guide specifically targets Bottlerocket-based EKS Nodes. These Nodes do not execute the standard EKS bootstrap
  script responsible for preparing NVMe instance storage.
  Disk formatting and mounting is therefore handled by an init container, after which the static provisioner scans for
  available volumes and registers them as <code>PersistentVolumes</code>.
</p>

<p>
  Add the Helm repository for the Local Volume Static Provisioner:
</p>

<pre>
$ helm repo add sig-storage-local-static-provisioner https://kubernetes-sigs.github.io/sig-storage-local-static-provisioner
$ helm repo update
</pre>

<p>
  Create an EKS NVMe instance storage configuration. The example below runs an <a href="https://kubernetes.io/docs/concepts/workloads/pods/init-containers/">initContainer</a>
  that scans for NVMe instance store disks, formats them as <code>ext4</code>, and mounts them under <code>/mnt/disks</code>, where the static provisioner will detect them.
</p>

<pre>
$ cat <<'EOF' > local-nvme-values.yaml
# EKS Bottlerocket NVMe instance storage configuration.
classes:
  - name: local-nvme
    hostDir: /mnt/disks
    mountDir: /mnt/disks
    volumeMode: Filesystem
    fsType: ext4
    accessMode: ReadWriteOnce
    storageClass:
      reclaimPolicy: Delete
      isDefaultClass: false

nodeSelector:
  eks.amazonaws.com/nodegroup: test-node-group

priorityClassName: system-node-critical
mountDevVolume: true

initContainers:
  - name: nvme-disk-setup
    image: registry.k8s.io/sig-storage/local-volume-provisioner:v2.8.0
    securityContext:
      privileged: true
    command:
      - sh
      - -c
      - |
        set -eu

        DISKS_PATH=/mnt/disks

        # Enumerate NVMe namespaces, excluding the root volume (/dev/nvme0n1).
        disks=$(ls /dev/nvme*n1 2>/dev/null | grep -v '/dev/nvme0n1' || true)

        if [ -z "${disks}" ]; then
          echo "No NVMe instance-store disks found, nothing to do"
          exit 0
        fi

        for disk in ${disks}; do
          echo "Processing ${disk}..."

          # Only touch devices that report the EC2 instance-store model string.
          model=$(cat /sys/block/$(basename ${disk})/device/model 2>/dev/null || true)
          if ! echo "${model}" | grep -q "Amazon EC2 NVMe Instance Storage"; then
            echo "${disk} is not an instance store disk (model: ${model}), skipping"
            continue
          fi

          if grep -q "^${disk} " /proc/mounts; then
            echo "${disk} is already mounted, skipping"
            continue
          fi

          if ! blkid "${disk}" >/dev/null 2>&1; then
            echo "No filesystem on ${disk}, formatting as ext4..."
            mkfs.ext4 -F "${disk}"
          fi

          # Mount points are keyed by filesystem UUID so a disk is remounted
          # at the same path across restarts.
          uuid=$(blkid -s UUID -o value "${disk}")
          if [ -z "${uuid}" ]; then
            echo "Could not determine UUID for ${disk}, skipping"
            continue
          fi

          mount_point="${DISKS_PATH}/${uuid}"
          mkdir -p "${mount_point}"
          echo "Mounting ${disk} (UUID=${uuid}) at ${mount_point}"
          mount "${disk}" "${mount_point}"
        done

        echo "Setup complete. Disks mounted under ${DISKS_PATH}:"
        grep "${DISKS_PATH}" /proc/mounts || echo "  (none found)"
    volumeMounts:
      - name: provisioner-dev
        mountPath: /dev
      - name: local-nvme
        mountPath: /mnt/disks
        mountPropagation: Bidirectional

resources:
  requests:
    cpu: 10m
    memory: 32Mi
  limits:
    cpu: 100m
    memory: 128Mi
EOF

$ helm install local-volume-provisioner \
  sig-storage-local-static-provisioner/local-static-provisioner \
  --namespace kube-system \
  --values local-nvme-values.yaml
</pre>

<p>
  <code>mountPropagation: Bidirectional</code> ensures that mounts created inside the container are propagated back to the host, and <code>priorityClassName: system-node-critical</code>
  ensures the provisioner Pod will not be evicted under Node pressure.
</p>
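
<p>
  The model check used by the init container can be exercised in isolation. A minimal sketch, in
  which the device model strings are illustrative and the match pattern is the one from the script:
</p>

<pre>
# Succeeds only for devices whose reported model matches the
# EC2 instance-store signature checked by the init container.
is_instance_store_model() {
  echo "$1" | grep -q "Amazon EC2 NVMe Instance Storage"
}

is_instance_store_model "Amazon EC2 NVMe Instance Storage" && echo "instance-store: kept"
is_instance_store_model "Amazon Elastic Block Store" || echo "ebs: skipped"
# prints "instance-store: kept" then "ebs: skipped"
</pre>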

<p>
  After installing the static provisioner, a <code>StorageClass</code> named <code>local-nvme</code> will be created. This
  is the class to reference in the <code>spec.application.storageClass</code> attribute of the <code>VespaSet</code>.
</p>

<pre>
$ kubectl get storageclasses
NAME         PROVISIONER                    RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-nvme   kubernetes.io/no-provisioner   Delete          WaitForFirstConsumer   false                  12h
</pre>

<p>
  Ensure that the <code>VolumeBindingMode</code> is <code>WaitForFirstConsumer</code> to delay
  <code>PersistentVolume</code> binding until a Pod is scheduled, allowing the scheduler to place the Pod on a Node where the
  storage physically resides.
</p>
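
<p>
  For reference, the <code>StorageClass</code> generated by the Helm chart is equivalent to the
  following manifest; this is a sketch for inspection only, as the chart manages the object for you:
</p>

<pre>
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-nvme
# no-provisioner: volumes are pre-created by the static provisioner
# rather than dynamically provisioned on demand.
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
</pre>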

<p>
  After the <code>initContainer</code> has completed, the static provisioner will provision <code>PersistentVolumes</code>.
</p>

<pre>
$ kubectl get persistentvolumes
NAME                CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   VOLUMEATTRIBUTESCLASS   REASON   AGE
local-pv-201c66f3   216Gi      RWO            Delete           Available           local-nvme     &lt;unset&gt;                          12h
local-pv-2942e993   216Gi      RWO            Delete           Available           local-nvme     &lt;unset&gt;                          12h
local-pv-2fea7934   216Gi      RWO            Delete           Available           local-nvme     &lt;unset&gt;                          12h
local-pv-335a2831   216Gi      RWO            Delete           Available           local-nvme     &lt;unset&gt;                          12h
local-pv-3499cebf   216Gi      RWO            Delete           Available           local-nvme     &lt;unset&gt;                          12h
local-pv-36dc72b5   216Gi      RWO            Delete           Available           local-nvme     &lt;unset&gt;                          12h
local-pv-37928b3d   216Gi      RWO            Delete           Available           local-nvme     &lt;unset&gt;                          12h
local-pv-5e09d438   216Gi      RWO            Delete           Available           local-nvme     &lt;unset&gt;                          12h
local-pv-6e9849a9   216Gi      RWO            Delete           Available           local-nvme     &lt;unset&gt;                          12h
</pre>

<p>
  Configure the <code>VespaSet</code> to use the newly created <code>StorageClass</code>. For example:
</p>

<pre>
# vespaset sample for EKS with local storage configured
$ cat > vespaset.yaml <<EOF
apiVersion: k8s.ai.vespa/v1
kind: VespaSet
metadata:
  name: vespaset-sample
  namespace: ${NAMESPACE}
spec:
  version: "${VESPA_VERSION}"

  configServer:
    image: "${VESPA_OPERATOR_IMAGE}"
    storageClass: "gp3"
    generateRbac: false

  application:
    image: "${VESPA_IMAGE}"
    storageClass: "local-nvme"

  ingress:
    endpointType: "LOAD_BALANCER"
EOF

$ kubectl apply -f vespaset.yaml
</pre>

<h2>
  Other Provisioners
</h2>

<p>
  Several other local storage provisioners, such as the <a href="https://github.com/openebs/dynamic-localpv-provisioner">OpenEBS Dynamic LocalPV Provisioner</a> and <a href="https://github.com/topolvm/topolvm">TopoLVM</a>,
  may be used as alternatives. These provisioners offer dynamic volume provisioning, creating PersistentVolumes on demand rather than pre-provisioning them,
  which may be preferable in environments where disk availability changes frequently.
</p>

<p>
  However, some provisioners may require manual configuration of <code>NodeAffinity</code> rules to ensure Pods are scheduled on Nodes where the storage physically resides.
  In these cases, refer to the <a href="../custom-overrides-podtemplate.html">PodTemplates</a> section on configuring custom <code>NodeAffinity</code> rules for ConfigServer and Application Pods.
</p>