Is your feature request related to a problem? Please describe.
Yes. The AKS-managed Cluster Autoscaler (CA) is not compatible with the HAMi scheduler and device plugins, which manage GPUs through extended resources such as nvidia.com/gpumem. The CA decides whether to scale up by simulating the default kube-scheduler. That simulation does not load HAMi’s custom scheduling logic or account for its extended resources, so capacity estimates are inaccurate and pending pods receive repeated NotTriggerScaleUp events. As a result, GPU workloads stay unscheduled and node pools are not scaled up when needed. In addition, a pod may request both whole GPUs and a specific amount of GPU memory, which complicates fragmentation logic and further confuses the autoscaler. AKS does not currently support third-party scheduler plugins like HAMi in its managed autoscaler, which limits support for advanced GPU scheduling scenarios.
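To illustrate the mismatch, here is a minimal Python sketch of a resource-fit check. It is a simplified model, not the actual Cluster Autoscaler or kube-scheduler code. The function name `unsatisfied_resources`, the pod requests, and the node capacities are all hypothetical values chosen for illustration. It assumes the autoscaler's template node does not advertise nvidia.com/gpumem, while a running node with the HAMi device plugin does.

```python
# Simplified model of a resource-fit check; NOT the real Cluster Autoscaler code.
def unsatisfied_resources(pod_requests, node_allocatable):
    """Return the resources the node cannot satisfy (an empty list means the pod fits)."""
    return [
        name
        for name, quantity in pod_requests.items()
        # A resource the node does not advertise counts as zero capacity.
        if node_allocatable.get(name, 0) < quantity
    ]


# Hypothetical pod: one GPU plus 8000 MiB of GPU memory requested through HAMi.
pod = {"cpu": 4, "nvidia.com/gpu": 1, "nvidia.com/gpumem": 8000}

# Assumed template node as the autoscaler might model it, with no HAMi resources.
template_node = {"cpu": 16, "nvidia.com/gpu": 1}

# Assumed running node after the HAMi device plugin has registered (illustrative values).
running_node = {"cpu": 16, "nvidia.com/gpu": 10, "nvidia.com/gpumem": 16000}

missing_on_template = unsatisfied_resources(pod, template_node)
missing_on_running = unsatisfied_resources(pod, running_node)

print("template node is missing:", missing_on_template)
print("running node is missing:", missing_on_running)
```

Under these assumptions the simulated node never satisfies the nvidia.com/gpumem request, so adding a node looks pointless to the autoscaler even though a real node would accept the pod.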
Describe the solution you'd like
Enable a functional BYO (Bring Your Own) scheduler and BYO cluster-autoscaler solution on top of AKS that supports HAMi and similar advanced GPU scheduling use cases.
Allow deployment and integration of custom schedulers (such as HAMi) and custom autoscalers that are aware of extended resources and custom scheduling logic.
Ensure the custom autoscaler can accurately simulate scheduling decisions made by the custom scheduler, including support for extended resources and fragmentation logic.
Provide clear documentation and supported patterns for deploying and managing custom autoscalers and schedulers in AKS clusters.
Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.
Additional context
Add any other context or screenshots about the feature request here.