Bug
Resolution: Unresolved
Critical
None
4.19.z, 4.20.z
Description of problem:
When adding nodes to a nodepool in a dual-stack HCP scenario, the nodes' CSRs are not approved automatically, and even after manual approval the nodes keep a taint that prevents pods from running on them.
The "kubevirt-cloud-controller-manager" pod logs show messages like this:
E0115 15:09:42.974850 1 node_controller.go:236] error syncing 'temp-xjl6k': failed to get node modifiers from cloud provider: provided node ip for node "temp-xjl6k" is not valid: failed to parse node IP "10.x.x.x,2001:x:x:x:x:x:x:x": dual-stack not supported in this configuration, requeuing
The node object clearly states the annotation:
alpha.kubernetes.io/provided-node-ip: 10.x.x.x,2001:x:x:x:x:x:x:x
Tracing the error message through the code leads to the function at:
https://github.com/openshift/cloud-provider-kubevirt/blob/3f4542ecd17fb0e47da4c6d9bceb076b98fb314b/vendor/k8s.io/component-helpers/node/util/ips.go#L81
Dual-stack support in this function depends on a feature gate called "CloudDualStackNodeIPs" [1], which is no longer present in OpenShift 4.19 and later. As a result, the verification always fails in dual-stack environments and prevents the nodes from becoming Ready in the cluster.
In recent versions of this Kubernetes package, the dual-stack variable is always set to true [2].
[1] https://github.com/openshift/cloud-provider-kubevirt/blob/3f4542ecd17fb0e47da4c6d9bceb076b98fb314b/vendor/k8s.io/cloud-provider/controllers/node/node_controller.go#L744
[2] https://github.com/openshift/cloud-provider-kubevirt/blob/3f4542ecd17fb0e47da4c6d9bceb076b98fb314b/vendor/k8s.io/component-helpers/node/util/ips.go#L81
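To illustrate why the check rejects the annotation, here is a minimal Python sketch of the behaviour described above. It is a simplified model, not the vendored Go code: the function name `parse_node_ip_annotation`, the `allow_dual_stack` parameter (standing in for the value of the "CloudDualStackNodeIPs" feature gate) and the example addresses are all assumptions made for illustration, and the exact checks in the real function may differ.

```python
import ipaddress


def parse_node_ip_annotation(value, allow_dual_stack):
    """Simplified model of the node IP check (hypothetical helper).

    allow_dual_stack stands in for the CloudDualStackNodeIPs feature gate.
    """
    ips = []
    for part in value.split(","):
        try:
            ips.append(ipaddress.ip_address(part))
        except ValueError:
            raise ValueError(f"could not parse {part!r}") from None

    # Assumed rule: at most one IPv4 plus one IPv6 address is accepted.
    if len(ips) > 2 or (len(ips) == 2 and ips[0].version == ips[1].version):
        raise ValueError(
            "must contain either a single IP or a dual-stack pair of IPs"
        )

    # This is the branch hit in the log above: a valid dual-stack pair is
    # rejected only because dual-stack handling is switched off.
    if len(ips) == 2 and not allow_dual_stack:
        raise ValueError("dual-stack not supported in this configuration")

    return ips


if __name__ == "__main__":
    annotation = "10.0.0.5,2001:db8::5"  # made-up dual-stack annotation value
    for gate in (False, True):
        try:
            print(gate, parse_node_ip_annotation(annotation, gate))
        except ValueError as err:
            print(gate, "error:", err)
```

With the gate modelled as off, any annotation holding an IPv4 and IPv6 pair fails, which matches the requeue loop seen in the controller log; with it on, the same value is accepted.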
Version-Release number of selected component (if applicable):
Hosting cluster: 4.19.14 Hosted cluster: 4.20.10
How reproducible:
Always
Steps to Reproduce:
1. Install a hub cluster with dual stack
2. Install an HCP cluster with KubeVirt
3. Add nodes to the nodepool
4. Monitor the "kubevirt-cloud-controller-manager" pod logs
Actual results:
Nodes cannot be added to the hosted cluster
Expected results:
"kubevirt-cloud-controller-manager" doesn't fail the verification and nodes are added as expected.
Additional info:
As a possible workaround we can pause the HCP reconciliation by adding:
pausedUntil: "true"
in the spec of the HostedCluster and then modify the "kubevirt-cloud-controller-manager" deployment by adding the feature-gate argument, e.g.:
...
containers:
- args:
  - --cloud-provider=kubevirt
  - --cloud-config=/etc/cloud/cloud-config
  - --kubeconfig=/etc/kubernetes/kubeconfig/kubeconfig
  - --authentication-skip-lookup
  - --cluster-name=tc-temp
  - --feature-gates=CloudDualStackNodeIPs=true
  command:
  - /bin/kubevirt-cloud-controller-manager
...
relates to:
OCPBUGS-74338 [OCP 4.20] Limitation in KubevirtMachine object to only hold 1 ip address in dual-stack environments - New