We are currently running a 5-node cluster in AKS with 10 vCPUs and 35 GB RAM in total. We noticed the following behaviour: we have a couple of StatefulSets, each claiming an Azure Disk with some storage. At runtime, a pod goes into a CrashLoop because its volume suddenly becomes inaccessible (I/O error). This crashes the application running in the pod; the health probe detects the failure, the pod restarts, and it crashes again. We managed to keep one container running, and we were still unable to access the volume (although it was still mounted in the OS).
The usual fix was to delete the pod manually. After being rescheduled, it suddenly worked again.
In the past this only happened a few times, until yesterday. Yesterday we hit the same issue, and as soon as we deleted the failing pod and it was rescheduled into a Running state, another pod started crashing. We always ended up with exactly 4 failing pods with I/O errors, which made us wonder whether it has something to do with the total number of mounted Azure Disks.
We have the following assumption:
If a new pod is scheduled on a node that already has 4 Azure Disks mounted, one of the running pods (which claims one of those volumes) "loses" access to its volume and therefore crashes. Additionally, we found the following documentation, which restricts the number of Azure Disks that can be mounted on a VM (Link)
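To check this assumption, one can count the Azure Disks currently attached to each node: attached volumes are reported in the node's `.status.volumesAttached` field with names prefixed `kubernetes.io/azure-disk`. A minimal sketch that parses the output of `kubectl get nodes -o json` (the helper name is ours, not part of any API):

```python
import json

def azure_disks_per_node(nodes_json: str) -> dict:
    """Count Azure Disk volumes attached to each node.

    Expects the JSON produced by `kubectl get nodes -o json`.
    Attached volumes appear in .status.volumesAttached with names
    like "kubernetes.io/azure-disk/<disk-uri>".
    """
    nodes = json.loads(nodes_json)
    counts = {}
    for node in nodes.get("items", []):
        name = node["metadata"]["name"]
        attached = node.get("status", {}).get("volumesAttached") or []
        counts[name] = sum(
            1 for v in attached
            if v.get("name", "").startswith("kubernetes.io/azure-disk")
        )
    return counts
```

Piping `kubectl get nodes -o json` into this and comparing the counts against the VM size's data-disk limit would confirm or refute the theory.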
What we would expect:
If our assumption is correct, I would expect the following behaviour:
- Pods with a PVC bound to an Azure Disk PV should not be scheduled to a node that already has the maximum number of volumes mounted
- If that is not possible: the newly scheduled pod should fail to schedule on that node and throw an error (instead of making an already-running pod crash)
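The scheduler-side check we would expect can be sketched as follows. Note this is illustrative pseudologic, not actual Kubernetes scheduler code; the names are ours. The per-VM limit of 8 data disks comes from the Azure documentation for Standard_D2_v2:

```python
# Hypothetical sketch of the expected scheduler predicate: refuse to
# place a pod on a node whose Azure Disk attach count would exceed
# the VM's data-disk limit.

MAX_DATA_DISKS = 8  # documented limit for a Standard_D2_v2 VM

def node_fits(pod_disk_count: int, node_attached_disks: int,
              max_disks: int = MAX_DATA_DISKS) -> bool:
    """True if the node can take the pod's Azure Disk volumes
    without exceeding the VM's attach limit."""
    return node_attached_disks + pod_disk_count <= max_disks

def schedulable_nodes(pod_disk_count: int, nodes: dict) -> list:
    """nodes: mapping of node name -> currently attached disk count."""
    return [name for name, used in nodes.items()
            if node_fits(pod_disk_count, used)]
```

As far as we can tell, the scheduler in our Kubernetes version (v1.8.7) does not appear to enforce such a per-node Azure Disk limit, which would explain why the pod gets placed and an existing mount breaks instead.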
Have you observed something similar in the past?
Here is some information about our system (private information redacted):
Name: aks-agentpool-(reducted)
Roles: agent
Labels: agentpool=agentpool
beta.kubernetes.io/arch=amd64
beta.kubernetes.io/instance-type=Standard_D2_v2
beta.kubernetes.io/os=linux
failure-domain.beta.kubernetes.io/region=westeurope
failure-domain.beta.kubernetes.io/zone=0
kubernetes.azure.com/cluster=(reducted)
kubernetes.io/hostname=aks-agentpool-(reducted)
kubernetes.io/role=agent
storageprofile=managed
storagetier=Standard_LRS
Annotations: node.alpha.kubernetes.io/ttl=0
volumes.kubernetes.io/controller-managed-attach-detach=true
Taints: <none>
CreationTimestamp: Tue, 20 Feb 2018 17:07:16 +0100
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Tue, 20 Feb 2018 17:07:42 +0100 Tue, 20 Feb 2018 17:07:42 +0100 RouteCreated RouteController created a route
OutOfDisk False Wed, 21 Feb 2018 09:49:47 +0100 Tue, 20 Feb 2018 17:07:16 +0100 KubeletHasSufficientDisk kubelet has sufficient disk space available
MemoryPressure False Wed, 21 Feb 2018 09:49:47 +0100 Tue, 20 Feb 2018 17:07:16 +0100 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Wed, 21 Feb 2018 09:49:47 +0100 Tue, 20 Feb 2018 17:07:16 +0100 KubeletHasNoDiskPressure kubelet has no disk pressure
Ready True Wed, 21 Feb 2018 09:49:47 +0100 Tue, 20 Feb 2018 17:07:36 +0100 KubeletReady kubelet is posting ready status
Addresses:
InternalIP: (reducted)
Hostname: (reducted)
Capacity:
alpha.kubernetes.io/nvidia-gpu: 0
cpu: 2
memory: 7114304Ki
pods: 110
Allocatable:
alpha.kubernetes.io/nvidia-gpu: 0
cpu: 2
memory: 7011904Ki
pods: 110
System Info:
Machine ID: (reducted)
System UUID: (reducted)
Boot ID: (reducted)
Kernel Version: 4.13.0-1007-azure
OS Image: Debian GNU/Linux 8 (jessie)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://1.12.6
Kubelet Version: v1.8.7
Kube-Proxy Version: v1.8.7
PodCIDR: 10.244.4.0/24
ExternalID: (reducted)