Type: Bug
Status: CLOSED
Resolution: Done-Errata
Severity: Critical
Priority: Urgent
Sprints: CNV Virtualization Sprint 234, CNV Virtualization Sprint 235
Description of problem:
Started with an OCP 4.12.7 / CNV 4.12.2 cluster and upgraded OCP to 4.13.0-0.ci.test-2023-03-02-214629-ci-ln-xw171tb-latest; after the upgrade, VMs can no longer reach the Running state on this cluster.
Version-Release number of selected component (if applicable):
OCP 4.13.0-0.ci.test-2023-03-02-214629-ci-ln-xw171tb-latest
CNV 4.12.2
How reproducible:
1/1 time
Steps to Reproduce:
1. Upgrade OCP to the latest RHCOS image and try to spin up a VM on it (a command sketch follows the steps below).
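A minimal sketch of the commands behind step 1, assuming the payload version and VM from this report; the payload pullspec placeholder must be filled in, and --allow-explicit-upgrade/--force are needed for unsigned CI payloads:

# Move the cluster to the 4.13 CI payload (pullspec not included in this report)
oc adm upgrade --to-image=<4.13.0-0.ci.test payload pullspec> --allow-explicit-upgrade --force

# Start the VM once the upgrade settles (equivalent to spec.running: true)
virtctl start must-gather-vm-2-1678748206-157084 -n node-gather-unprivileged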
Actual results:
=================
[cloud-user@ocp-ipi-executor-xl ~]$ oc get nodes
NAME                               STATUS   ROLES                  AGE     VERSION
c01-dbn-412-4q8rm-master-0         Ready    control-plane,master   4d10h   v1.26.0+f854081
c01-dbn-412-4q8rm-master-1         Ready    control-plane,master   4d10h   v1.26.0+f854081
c01-dbn-412-4q8rm-master-2         Ready    control-plane,master   4d10h   v1.26.0+f854081
c01-dbn-412-4q8rm-worker-0-kpfnp   Ready    worker                 4d9h    v1.26.0+f854081
c01-dbn-412-4q8rm-worker-0-n2jzb   Ready    worker                 4d9h    v1.26.0+f854081
c01-dbn-412-4q8rm-worker-0-q8w5x   Ready    worker                 4d9h    v1.26.0+f854081
[cloud-user@ocp-ipi-executor-xl ~]$ oc get clusterversion
NAME      VERSION                                                    AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.13.0-0.ci.test-2023-03-02-214629-ci-ln-xw171tb-latest   True        False         4d3h    Cluster version is 4.13.0-0.ci.test-2023-03-02-214629-ci-ln-xw171tb-latest
[cloud-user@ocp-ipi-executor-xl ~]$
Namespace:
============
[cloud-user@ocp-ipi-executor-xl ~]$ oc get namespace node-gather-unprivileged -o yaml
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    openshift.io/description: ""
    openshift.io/display-name: ""
    openshift.io/requester: unprivileged-user
    openshift.io/sa.scc.mcs: s0:c29,c9
    openshift.io/sa.scc.supplemental-groups: 1000830000/10000
    openshift.io/sa.scc.uid-range: 1000830000/10000
    operator.tekton.dev/prune.hash: e12cf88878007ab90299fa28c92d42daf72a1dda6ff604ea40c1f1da0f1f5e1d
  creationTimestamp: "2023-03-13T22:56:06Z"
  labels:
    kubernetes.io/metadata.name: node-gather-unprivileged
    openshift-pipelines.tekton.dev/namespace-reconcile-version: 1.9.2
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/enforce-version: v1.24
    pod-security.kubernetes.io/warn: privileged
    security.openshift.io/scc.podSecurityLabelSync: "false"
  name: node-gather-unprivileged
  resourceVersion: "7562912"
  uid: 1073a32d-b4df-4712-ad19-ad11b03e7006
spec:
  finalizers:
  - kubernetes
status:
  phase: Active
[cloud-user@ocp-ipi-executor-xl ~]$
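Note the pod-security labels above (enforce: privileged, with SCC label sync disabled); virt-launcher pods generally are not admitted under the restricted PSA profile, so these labels are worth verifying after the upgrade. A quick check, using standard oc invocations (not part of the original session):

# Show the namespace labels
oc get namespace node-gather-unprivileged -o jsonpath='{.metadata.labels}{"\n"}'

# Re-apply the enforce label shown above if it has been dropped
oc label namespace node-gather-unprivileged pod-security.kubernetes.io/enforce=privileged --overwrite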
=============
VM is in CrashLoopBackOff state
[cloud-user@ocp-ipi-executor-xl ~]$ oc get vm -A
NAMESPACE                  NAME                                 AGE    STATUS             READY
node-gather-unprivileged   must-gather-vm-2-1678748206-157084   2m1s   CrashLoopBackOff   False
[cloud-user@ocp-ipi-executor-xl ~]$ oc get vm -A -o yaml
apiVersion: v1
items:
- apiVersion: kubevirt.io/v1
  kind: VirtualMachine
  metadata:
    annotations:
      kubemacpool.io/transaction-timestamp: "2023-03-13T22:56:33.589947468Z"
      kubevirt.io/latest-observed-api-version: v1
      kubevirt.io/storage-observed-api-version: v1alpha3
    creationTimestamp: "2023-03-13T22:56:33Z"
    generation: 2
    labels:
      created-by-dynamic-class-creator: "Yes"
      kubevirt.io/vm: must-gather-vm-2
    name: must-gather-vm-2-1678748206-157084
    namespace: node-gather-unprivileged
    resourceVersion: "7567952"
    uid: 753e0e37-8ac5-48b7-ae22-d5308d267aa8
  spec:
    running: true
    template:
      metadata:
        creationTimestamp: null
        labels:
          kubevirt.io/domain: must-gather-vm-2-1678748206-157084
          kubevirt.io/vm: must-gather-vm-2-1678748206-157084
      spec:
        domain:
          cpu:
            cores: 1
          devices:
            disks:
            - disk:
                bus: virtio
              name: containerdisk
            - disk:
                bus: virtio
              name: cloudinitdisk
            interfaces:
            - macAddress: 02:c7:49:00:00:14
              masquerade: {}
              name: default
            - bridge: {}
              macAddress: 02:c7:49:00:00:15
              name: mg-br1
            rng: {}
          machine:
            type: pc-q35-rhel8.6.0
          resources:
            requests:
              memory: 1Gi
        networks:
        - name: default
          pod: {}
        - multus:
            networkName: mg-br1
          name: mg-br1
        terminationGracePeriodSeconds: 30
        volumes:
        - containerDisk:
            image: quay.io/openshift-cnv/qe-cnv-tests-fedora:37@sha256:ca3fa3d34a9c4277916f153b85efb9685abb8248fdbc2eaa511507588ed650db
          name: containerdisk
        - cloudInitNoCloud:
            userData: |-
              #cloud-config
              chpasswd:
                expire: false
              password: password
              user: fedora
              ssh_authorized_keys:
                [ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCj47ubVnxR16JU7ZfDli3N5QVBAwJBRh2xMryyjk5dtfugo5JIPGB2cyXTqEDdzuRmI+Vkb/A5duJyBRlA+9RndGGmhhMnj8and3wu5/cEb7DkF6ZJ25QV4LQx3K/i57LStUHXRTvruHOZ2nCuVXWqi7wSvz5YcvEv7O8pNF5uGmqHlShBdxQxcjurXACZ1YY0YDJDr3AJai1KF9zehVJODuSbrnOYpThVWGjFuFAnNxbtuZ8EOSougN2aYTf2qr/KFGDHtewIkzZmP6cjzKO5bN3pVbXxmb2Gces/BYHntY4MXBTUqwsmsCRC5SAz14bEP/vsLtrNhjq9vCS+BjMT root@exec1.rdocloud]
              runcmd: ['grep ssh-rsa /etc/crypto-policies/back-ends/opensshserver.config || sudo update-crypto-policies --set LEGACY || true', "sudo sed -i 's/^#\\?PasswordAuthentication no/PasswordAuthentication yes/g' /etc/ssh/sshd_config", 'sudo systemctl enable sshd', 'sudo systemctl restart sshd']
          name: cloudinitdisk
  status:
    conditions:
    - lastProbeTime: "2023-03-13T23:00:13Z"
      lastTransitionTime: "2023-03-13T23:00:13Z"
      message: VMI does not exist
      reason: VMINotExists
      status: "False"
      type: Ready
    printableStatus: CrashLoopBackOff
    startFailure:
      consecutiveFailCount: 4
      lastFailedVMIUID: 06e69b60-dc49-43fa-9c08-f6b11a70d525
      retryAfterTimestamp: "2023-03-13T23:03:27Z"
    volumeSnapshotStatuses:
    - enabled: false
      name: containerdisk
      reason: Snapshot is not supported for this volumeSource type [containerdisk]
    - enabled: false
      name: cloudinitdisk
      reason: Snapshot is not supported for this volumeSource type [cloudinitdisk]
kind: List
metadata:
  resourceVersion: ""
[cloud-user@ocp-ipi-executor-xl ~]$
=============
[cloud-user@ocp-ipi-executor-xl ~]$ oc get vmi -A
No resources found
[cloud-user@ocp-ipi-executor-xl ~]$ oc get pods -n node-gather-unprivileged
No resources found in node-gather-unprivileged namespace.
[cloud-user@ocp-ipi-executor-xl ~]$
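Since the VMI is never created and the namespace has no launcher pods left to inspect, the next data points are the VM's conditions and the namespace events; standard commands (not from the original session):

# Show the VM's conditions and the startFailure/CrashLoopBackOff details
oc describe vm must-gather-vm-2-1678748206-157084 -n node-gather-unprivileged

# Look for admission or scheduling errors around the failed starts
oc get events -n node-gather-unprivileged --sort-by=.metadata.creationTimestamp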
Will attach the virt-controller log.
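For reference, a typical way to capture that log, assuming the stock CNV deployment in the openshift-cnv namespace and its standard kubevirt.io=virt-controller pod label:

oc logs -n openshift-cnv -l kubevirt.io=virt-controller --tail=-1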
Expected results:
The VM reaches the Running state.
Additional info: