Bug
Resolution: Unresolved
Critical
0.42
CNV v4.18.0.rhel9-575, CNV v4.17.3.rhel9-71
CNV I/U Operators Sprint 264, CNV Virt-Node Sprint 263
Critical
Description of problem:
OpenShift Virtualization is installed on a hosted cluster in a bare-metal hosted control plane (HCP) deployment. During the upgrade, a "virt-operator-strategy-dumper" job is created. The job has the following affinity:
# oc get job kubevirt-038e69c566dacce9f32740e77415213008080ae5-jobtnj4x -o yaml |yq '.spec.template.spec.affinity'
nodeAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
  - preference:
      matchExpressions:
      - key: node-role.kubernetes.io/worker
        operator: DoesNotExist
    weight: 100
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
    - matchExpressions:
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
    - matchExpressions:
      - key: node-role.kubernetes.io/master
        operator: Exists
The hosted cluster has no master or control-plane nodes, because the control plane runs on the management (hub) cluster. No node satisfies the required node affinity, so the pod fails to schedule:
# oc get events|grep kubevirt-038e69c566dacce9f32740e77415213008080ae5-jobtnj4xsh5xd
Unknown Warning FailedScheduling Pod/kubevirt-038e69c566dacce9f32740e77415213008080ae5-jobtnj4xsh5xd 0/2 nodes are available: 2 node(s) didn't match Pod's node affinity/selector. preemption: 0/2 nodes are available: 2 Preemption is not helpful for scheduling.
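The scheduling failure can be illustrated with a minimal Python sketch of how the required node affinity is evaluated. This is not the scheduler's code: it assumes that nodeSelectorTerms are ORed, that matchExpressions within a term are ANDed, and it handles only the Exists and DoesNotExist operators. The node names and label sets (worker-0, worker-1, and a standalone master) are hypothetical and were not taken from the affected cluster.

```python
from typing import Dict, List

Labels = Dict[str, str]

# Required terms copied from the job's affinity shown above.
REQUIRED_TERMS: List[dict] = [
    {"matchExpressions": [
        {"key": "node-role.kubernetes.io/control-plane", "operator": "Exists"}]},
    {"matchExpressions": [
        {"key": "node-role.kubernetes.io/master", "operator": "Exists"}]},
]


def expression_matches(expr: dict, labels: Labels) -> bool:
    """Evaluate one matchExpressions entry against a node's labels."""
    key, operator = expr["key"], expr["operator"]
    if operator == "Exists":
        return key in labels
    if operator == "DoesNotExist":
        return key not in labels
    raise ValueError(f"operator not handled in this sketch: {operator}")


def required_affinity_matches(terms: List[dict], labels: Labels) -> bool:
    """Terms are ORed; expressions inside a term are ANDed.

    A term with no expressions is treated as matching nothing.
    """
    for term in terms:
        expressions = term.get("matchExpressions", [])
        if expressions and all(expression_matches(e, labels) for e in expressions):
            return True
    return False


def schedulable_nodes(nodes: Dict[str, Labels]) -> List[str]:
    """Return the names of nodes that satisfy the required affinity."""
    return [name for name, labels in nodes.items()
            if required_affinity_matches(REQUIRED_TERMS, labels)]


# Hypothetical hosted cluster: two nodes that carry only the worker role.
hosted_nodes = {
    "worker-0": {"node-role.kubernetes.io/worker": ""},
    "worker-1": {"node-role.kubernetes.io/worker": ""},
}

# Hypothetical standalone-cluster master node, for comparison.
standalone_master = {
    "node-role.kubernetes.io/master": "",
    "node-role.kubernetes.io/control-plane": "",
}

print(schedulable_nodes(hosted_nodes))  # [] : no candidate node
print(required_affinity_matches(REQUIRED_TERMS, standalone_master))  # True
```

Under these assumptions, the preferred (weight 100) term never comes into play: the required terms already exclude every node in the hosted cluster, which is consistent with the "0/2 nodes are available" event.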
The upgrade then remains stuck, with the following messages in the virt-operator log:
2024-12-01T20:47:45.383727766Z {"component":"virt-operator","level":"info","msg":"Install strategy config map not loaded. reason: no install strategy configmap found for version sha256:99880052bdb7d896c4fa39a7d2956ce4be576a59ac4c398456cffccf04e17e57 with registry ","pos":"kubevirt.go:866","timestamp":"2024-12-01T20:47:45.383707Z"}
2024-12-01T20:47:45.383749320Z {"component":"virt-operator","kind":"","level":"error","msg":"Waiting on install strategy to be posted from job kubevirt-038e69c566dacce9f32740e77415213008080ae5-jobtnj4x","name":"kubevirt-038e69c566dacce9f32740e77415213008080ae5-jobtnj4x","namespace":"openshift-cnv","pos":"kubevirt.go:918","timestamp":"2024-12-01T20:47:45.383726Z","uid":"603b91ee-52a6-4f39-b97e-6dc8b7959943"}
Version-Release number of selected component (if applicable):
OpenShift Virtualization 4.16 to 4.17 upgrade on a bare-metal HCP hosted cluster
How reproducible:
Observed in a customer's environment
Steps to Reproduce:
1. Install OpenShift Virtualization on a hosted cluster that uses hosted control planes on bare metal.
2. Upgrade the hosted cluster. The OpenShift Virtualization upgrade gets stuck with the errors shown above.
Actual results:
The OpenShift Virtualization upgrade is stuck on hosted control planes on bare metal.
Expected results:
The upgrade should complete successfully.
Additional info:
Impacts ROSA HCP customers on 4.17 who are attempting to run or upgrade OpenShift Virtualization.
- is duplicated by
  - CNV-52321 Virt 4.17.x fails to create hyperconverged resource on ROSA HCP (Closed)
  - CNV-52789 OpenShift Virtualization fails to install on Hosted Controlplane clusters (Closed)
- is related to
  - OCPBUGS-43749 Control-plane upgrade blocked despite successful Nodepool upgrade in unavailable OpenShift 4.17 clusters (New)
  - OCPBUGS-43648 OpenShift Virtualization (PackageManifest kubevirt-hyperconverged) Not Upgrading When Upgrading from 4.14 to 4.15 or 4.16 (Closed)