Context: We want to move our node provisioning process to aks-node-controller, as the "future" bootstrapping mechanism in Azure. I noticed some minor gaps in the controller's flow and am raising them here for consideration.
Issue 1: Missing Cilium option
This one sounds trivial, unless I am missing something. In GetNetworkPolicyType, there is no mapping for Cilium. If I understand correctly, when Cilium is turned on, both NetworkPolicy and NetworkDataplane would be cilium. So the fix should be a matter of adding an enum value, mapping it, and passing the result through to the script?
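To make the proposal concrete, here is a minimal sketch of the kind of change I have in mind. The type, the constant names, and policyScriptValue are all hypothetical stand-ins, not the controller's real enum or the real GetNetworkPolicyType signature; the existing azure and calico cases are my assumption about what is already mapped.

```go
package main

import "fmt"

// NetworkPolicy is a hypothetical stand-in for the controller's enum.
// The real type and value names in aks-node-controller may differ.
type NetworkPolicy int

const (
	NetworkPolicyUnspecified NetworkPolicy = iota
	NetworkPolicyAzure                      // assumed to exist already
	NetworkPolicyCalico                     // assumed to exist already
	NetworkPolicyCilium                     // the proposed new value
)

// policyScriptValue is a hypothetical helper that maps the enum to the
// string handed to the provisioning script. Unmapped values yield "".
func policyScriptValue(p NetworkPolicy) string {
	switch p {
	case NetworkPolicyAzure:
		return "azure"
	case NetworkPolicyCalico:
		return "calico"
	case NetworkPolicyCilium:
		return "cilium" // the new mapping this issue asks for
	default:
		return ""
	}
}

func main() {
	fmt.Println(policyScriptValue(NetworkPolicyCilium))
}
```

If the real function is shaped like this, the change would be one new enum value plus one new case, with the script side receiving cilium like any other policy string.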
Issue 2: Configuring NetworkPlugin correctly
Passing the correct value here took me some trial and error across the different CNI options. I assumed the output of GetNetworkPluginType would be sufficient, but it failed in real clusters for the Azure CNI combinations.
Empirically, this is what I found:
- Azure CNI overlay -> none
- Azure CNI overlay + Cilium -> none
- Azure CNI node subnet -> azure
I am not 100% sure these values are correct, but with them the nodes joined the cluster and networking worked.
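For reference, the values I found can be written down as a small lookup table. This is only a sketch of my empirical findings, not verified behavior: CNIMode, its constants, and networkPluginFor are names I made up for illustration and do not exist in aks-node-controller.

```go
package main

import "fmt"

// CNIMode is a hypothetical label for the cluster's CNI setup.
type CNIMode string

const (
	AzureCNIOverlay       CNIMode = "azure-cni-overlay"
	AzureCNIOverlayCilium CNIMode = "azure-cni-overlay-cilium"
	AzureCNINodeSubnet    CNIMode = "azure-cni-node-subnet"
)

// observedNetworkPlugin records the NetworkPlugin value that worked for me
// in each setup. These are empirical, not confirmed by the maintainers.
var observedNetworkPlugin = map[CNIMode]string{
	AzureCNIOverlay:       "none",
	AzureCNIOverlayCilium: "none",
	AzureCNINodeSubnet:    "azure",
}

// networkPluginFor returns the observed value, or an error for any setup
// that was not tested, rather than guessing a default.
func networkPluginFor(mode CNIMode) (string, error) {
	v, ok := observedNetworkPlugin[mode]
	if !ok {
		return "", fmt.Errorf("no observed NetworkPlugin value for CNI mode %q", mode)
	}
	return v, nil
}

func main() {
	for _, m := range []CNIMode{AzureCNIOverlay, AzureCNIOverlayCilium, AzureCNINodeSubnet} {
		v, _ := networkPluginFor(m)
		fmt.Printf("%s -> %s\n", m, v)
	}
}
```

Returning an error for untested setups is deliberate: I only checked these three combinations, so anything else should fail loudly until someone confirms the right value.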
I assume this is one of those slow-moving items that "once you configure right, you touch every 1-2 years tops". Therefore, docs on how to pick the correct value from the start would be great.
As long as the project accepts contributions (and with some hand-holding), I'd be happy to open PRs for those changes, if you believe they make sense.