OpenShift Virtualization / CNV-26845

[2177924] After upgrading OCP to 4.13.0-0.ci.test-2023-03-02-214629-ci-ln-xw171tb-latest, can't bring VMs up to the Running state


    • Sprint: CNV Virtualization Sprint 234, CNV Virtualization Sprint 235
    • Priority: Urgent

      Description of problem: Started with an OCP 4.12.7/CNV 4.12.2 cluster and upgraded OCP to 4.13.0-0.ci.test-2023-03-02-214629-ci-ln-xw171tb-latest, but I can no longer get VMs into the Running state on this cluster.

      Version-Release number of selected component (if applicable):
      OCP 4.13.0-0.ci.test-2023-03-02-214629-ci-ln-xw171tb-latest
      CNV 4.12.2

      How reproducible:
      1/1 time

      Steps to Reproduce:
      1. Upgrade OCP to the latest RHCOS image and try to spin up a VM on the upgraded cluster (see the sketch below).
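      For reference, an upgrade to a one-off CI build like this is typically forced by explicit release image; a minimal sketch, where the registry path and digest are placeholders rather than the actual build:

      $ oc adm upgrade --to-image=registry.ci.openshift.org/ocp/release@sha256:<digest> \
          --allow-explicit-upgrade --force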

      Actual results:
      =================
      [cloud-user@ocp-ipi-executor-xl ~]$ oc get nodes
      NAME                               STATUS   ROLES                  AGE     VERSION
      c01-dbn-412-4q8rm-master-0         Ready    control-plane,master   4d10h   v1.26.0+f854081
      c01-dbn-412-4q8rm-master-1         Ready    control-plane,master   4d10h   v1.26.0+f854081
      c01-dbn-412-4q8rm-master-2         Ready    control-plane,master   4d10h   v1.26.0+f854081
      c01-dbn-412-4q8rm-worker-0-kpfnp   Ready    worker                 4d9h    v1.26.0+f854081
      c01-dbn-412-4q8rm-worker-0-n2jzb   Ready    worker                 4d9h    v1.26.0+f854081
      c01-dbn-412-4q8rm-worker-0-q8w5x   Ready    worker                 4d9h    v1.26.0+f854081
      [cloud-user@ocp-ipi-executor-xl ~]$ oc get clusterversion
      NAME      VERSION                                                    AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.13.0-0.ci.test-2023-03-02-214629-ci-ln-xw171tb-latest   True        False         4d3h    Cluster version is 4.13.0-0.ci.test-2023-03-02-214629-ci-ln-xw171tb-latest
      [cloud-user@ocp-ipi-executor-xl ~]$
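      Since the upgrade pulls in a new RHCOS build, the booted OS image on each node can be cross-checked with plain oc output (the OS-IMAGE column; no extra tooling assumed):

      $ oc get nodes -o wide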
      Namespace:
      ============
      [cloud-user@ocp-ipi-executor-xl ~]$ oc get namespace node-gather-unprivileged -o yaml
      apiVersion: v1
      kind: Namespace
      metadata:
        annotations:
          openshift.io/description: ""
          openshift.io/display-name: ""
          openshift.io/requester: unprivileged-user
          openshift.io/sa.scc.mcs: s0:c29,c9
          openshift.io/sa.scc.supplemental-groups: 1000830000/10000
          openshift.io/sa.scc.uid-range: 1000830000/10000
          operator.tekton.dev/prune.hash: e12cf88878007ab90299fa28c92d42daf72a1dda6ff604ea40c1f1da0f1f5e1d
        creationTimestamp: "2023-03-13T22:56:06Z"
        labels:
          kubernetes.io/metadata.name: node-gather-unprivileged
          openshift-pipelines.tekton.dev/namespace-reconcile-version: 1.9.2
          pod-security.kubernetes.io/enforce: privileged
          pod-security.kubernetes.io/enforce-version: v1.24
          pod-security.kubernetes.io/warn: privileged
          security.openshift.io/scc.podSecurityLabelSync: "false"
        name: node-gather-unprivileged
        resourceVersion: "7562912"
        uid: 1073a32d-b4df-4712-ad19-ad11b03e7006
      spec:
        finalizers:
        - kubernetes
      status:
        phase: Active
      [cloud-user@ocp-ipi-executor-xl ~]$
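      For anyone triaging: the namespace already carries pod-security.kubernetes.io/enforce: privileged, which can be confirmed directly; a quick sketch using standard oc jsonpath (dots in the label key escaped), with the value taken from the YAML above:

      $ oc get ns node-gather-unprivileged \
          -o jsonpath='{.metadata.labels.pod-security\.kubernetes\.io/enforce}{"\n"}'
      privileged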
      =============
      VM is in CrashLoopBackOff state:
      [cloud-user@ocp-ipi-executor-xl ~]$ oc get vm -A
      NAMESPACE                  NAME                                 AGE    STATUS             READY
      node-gather-unprivileged   must-gather-vm-2-1678748206-157084   2m1s   CrashLoopBackOff   False
      [cloud-user@ocp-ipi-executor-xl ~]$ oc get vm -A -o yaml
      apiVersion: v1
      items:
      - apiVersion: kubevirt.io/v1
        kind: VirtualMachine
        metadata:
          annotations:
            kubemacpool.io/transaction-timestamp: "2023-03-13T22:56:33.589947468Z"
            kubevirt.io/latest-observed-api-version: v1
            kubevirt.io/storage-observed-api-version: v1alpha3
          creationTimestamp: "2023-03-13T22:56:33Z"
          generation: 2
          labels:
            created-by-dynamic-class-creator: "Yes"
            kubevirt.io/vm: must-gather-vm-2
          name: must-gather-vm-2-1678748206-157084
          namespace: node-gather-unprivileged
          resourceVersion: "7567952"
          uid: 753e0e37-8ac5-48b7-ae22-d5308d267aa8
        spec:
          running: true
          template:
            metadata:
              creationTimestamp: null
              labels:
                kubevirt.io/domain: must-gather-vm-2-1678748206-157084
                kubevirt.io/vm: must-gather-vm-2-1678748206-157084
            spec:
              domain:
                cpu:
                  cores: 1
                devices:
                  disks:
                  - disk:
                      bus: virtio
                    name: containerdisk
                  - disk:
                      bus: virtio
                    name: cloudinitdisk
                  interfaces:
                  - macAddress: 02:c7:49:00:00:14
                    masquerade: {}
                    name: default
                  - bridge: {}
                    macAddress: 02:c7:49:00:00:15
                    name: mg-br1
                  rng: {}
                machine:
                  type: pc-q35-rhel8.6.0
                resources:
                  requests:
                    memory: 1Gi
              networks:
              - name: default
                pod: {}
              - multus:
                  networkName: mg-br1
                name: mg-br1
              terminationGracePeriodSeconds: 30
              volumes:
              - containerDisk:
                  image: quay.io/openshift-cnv/qe-cnv-tests-fedora:37@sha256:ca3fa3d34a9c4277916f153b85efb9685abb8248fdbc2eaa511507588ed650db
                name: containerdisk
              - cloudInitNoCloud:
                  userData: |-
                    #cloud-config
                    chpasswd:
                      expire: false
                    password: password
                    user: fedora
                    ssh_authorized_keys:
                      [ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCj47ubVnxR16JU7ZfDli3N5QVBAwJBRh2xMryyjk5dtfugo5JIPGB2cyXTqEDdzuRmI+Vkb/A5duJyBRlA+9RndGGmhhMnj8and3wu5/cEb7DkF6ZJ25QV4LQx3K/i57LStUHXRTvruHOZ2nCuVXWqi7wSvz5YcvEv7O8pNF5uGmqHlShBdxQxcjurXACZ1YY0YDJDr3AJai1KF9zehVJODuSbrnOYpThVWGjFuFAnNxbtuZ8EOSougN2aYTf2qr/KFGDHtewIkzZmP6cjzKO5bN3pVbXxmb2Gces/BYHntY4MXBTUqwsmsCRC5SAz14bEP/vsLtrNhjq9vCS+BjMT root@exec1.rdocloud]
                    runcmd: ['grep ssh-rsa /etc/crypto-policies/back-ends/opensshserver.config || sudo update-crypto-policies --set LEGACY || true', "sudo sed -i 's/^#\\?PasswordAuthentication no/PasswordAuthentication yes/g' /etc/ssh/sshd_config", 'sudo systemctl enable sshd', 'sudo systemctl restart sshd']
                name: cloudinitdisk
        status:
          conditions:
          - lastProbeTime: "2023-03-13T23:00:13Z"
            lastTransitionTime: "2023-03-13T23:00:13Z"
            message: VMI does not exist
            reason: VMINotExists
            status: "False"
            type: Ready
          printableStatus: CrashLoopBackOff
          startFailure:
            consecutiveFailCount: 4
            lastFailedVMIUID: 06e69b60-dc49-43fa-9c08-f6b11a70d525
            retryAfterTimestamp: "2023-03-13T23:03:27Z"
          volumeSnapshotStatuses:
          - enabled: false
            name: containerdisk
            reason: Snapshot is not supported for this volumeSource type [containerdisk]
          - enabled: false
            name: cloudinitdisk
            reason: Snapshot is not supported for this volumeSource type [cloudinitdisk]
      kind: List
      metadata:
        resourceVersion: ""
      [cloud-user@ocp-ipi-executor-xl ~]$
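      The startFailure stanza above is KubeVirt's start backoff: after consecutive failed starts, virt-controller will not create a new VMI until retryAfterTimestamp has passed. The next retry time can be checked directly; a quick sketch using standard oc jsonpath and only the fields shown above:

      $ oc get vm must-gather-vm-2-1678748206-157084 -n node-gather-unprivileged \
          -o jsonpath='{.status.startFailure.retryAfterTimestamp}{"\n"}'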
      =============
      [cloud-user@ocp-ipi-executor-xl ~]$ oc get vmi -A
      No resources found
      [cloud-user@ocp-ipi-executor-xl ~]$ oc get pods -n node-gather-unprivileged
      No resources found in node-gather-unprivileged namespace.
      [cloud-user@ocp-ipi-executor-xl ~]$
      Will attach the virt-controller log.
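      For reference, the same log can also be pulled live with the standard pod label, assuming the default openshift-cnv install namespace:

      $ oc logs -n openshift-cnv -l kubevirt.io=virt-controller --tail=200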
      Expected results:
      Able to get the VM into the Running state.

      Additional info:

      Assignee: Luboslav Pivarc (lpivarc)
      Reporter: Debarati Basu-Nag (rhn-support-dbasunag)