OpenShift Virtualization / CNV-21418

[2128997] [4.11.1]virt-launcher cannot be started on OCP 4.12 due to PodSecurity restricted:v1.24

Details

    • CNV Virtualization Sprint 226
    • Urgent

    Description

      +++ This bug was initially created as a clone of Bug #2119128 +++

      Description of problem:
      virt-launcher pod on a regular namespace (not named openshift-*) cannot be started on OCP 4.12.

      In the status of the VMI we see:
      status:
        conditions:
        - lastProbeTime: "2022-08-15T06:40:54Z"
          lastTransitionTime: "2022-08-15T06:40:54Z"
          message: virt-launcher pod has not yet been scheduled
          reason: PodNotExists
          status: "False"
          type: Ready
        - lastProbeTime: null
          lastTransitionTime: "2022-08-15T06:40:54Z"
          message: 'failed to create virtual machine pod: pods "virt-launcher-testvm-6xks9"
            is forbidden: violates PodSecurity "restricted:v1.24": seLinuxOptions (pod and
            container "volumecontainerdisk" set forbidden securityContext.seLinuxOptions:
            type "virt_launcher.process"), allowPrivilegeEscalation != false (containers
            "container-disk-binary", "volumecontainerdisk-init", "compute", "volumecontainerdisk"
            must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities
            (containers "container-disk-binary", "volumecontainerdisk-init", "compute",
            "volumecontainerdisk" must set securityContext.capabilities.drop=["ALL"]; container
            "compute" must not include "SYS_NICE" in securityContext.capabilities.add),
            runAsNonRoot != true (pod or containers "container-disk-binary", "compute" must
            set securityContext.runAsNonRoot=true), runAsUser=0 (pod and containers "container-disk-binary",
            "compute" must not set runAsUser=0), seccompProfile (pod or containers "container-disk-binary",
            "volumecontainerdisk-init", "compute", "volumecontainerdisk" must set securityContext.seccompProfile.type
            to "RuntimeDefault" or "Localhost")'
          reason: FailedCreate
          status: "False"
          type: Synchronized
        created: true
        printableStatus: Starting
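
The FailedCreate message packs six separate violations into one string, which makes it hard to read. As a rough reading aid, the sketch below splits such a message at its top-level commas, leaving the parenthesized details intact. The function name `split_violations` is hypothetical, and the sample string is an abbreviated form of the message above (shorter container lists), not a new log.

```python
def split_violations(message: str) -> list[str]:
    """Split a PodSecurity admission message into its top-level violations."""
    marker = "violates PodSecurity "
    start = message.find(marker)
    if start == -1:
        return []
    body = message[start + len(marker):]
    # Drop the quoted policy name, e.g. "restricted:v1.24", and the colon after it.
    if '": ' in body:
        body = body.split('": ', 1)[1]

    parts, current, depth = [], [], 0
    for ch in body:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        # Only a comma outside any parentheses separates two violations.
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


# Abbreviated sample based on the message in this report.
sample = (
    'pods "virt-launcher-testvm-6xks9" is forbidden: violates PodSecurity '
    '"restricted:v1.24": allowPrivilegeEscalation != false (containers '
    '"compute", "volumecontainerdisk" must set '
    "securityContext.allowPrivilegeEscalation=false), runAsUser=0 (pod and "
    'containers "container-disk-binary", "compute" must not set runAsUser=0)'
)

for violation in split_violations(sample):
    print(violation)
```

Run against the full message from the VMI status, the same function should yield one entry per violation: seLinuxOptions, allowPrivilegeEscalation, unrestricted capabilities, runAsNonRoot, runAsUser=0 and seccompProfile.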

      Version-Release number of selected component (if applicable):
      CNV 4.12 on OCP 4.12

      How reproducible:
      We hit this consistently in CI on OCP 4.12.

      Steps to Reproduce:
      1. Try to start a VM on OCP 4.12.

      Actual results:
      https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.12-e2e-azure-upgrade-cnv/1559052348230209536/artifacts/e2e-azure-upgrade-cnv/test/artifacts/cnv-must-gather-vms/registry-redhat-io-container-native-virtualization-cnv-must-gather-rhel8-sha256-37a2b2f102544ec8e953b473f85505e1d999aa5fde09e1385ebfa365fc4aa732/namespaces/vmsns/kubevirt.io/virtualmachines/custom/testvm.yaml

      Expected results:
      No PodSecurity-related error for virt-launcher.
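
To make the expected result concrete, here is a minimal sketch of the container-level checks named in the error message, applied to a plain `securityContext` dictionary. It is not the admission controller's implementation. It assumes every field is set on the container (the real check also considers pod-level settings) and it skips the seLinuxOptions rule. The allowance for adding `NET_BIND_SERVICE` comes from the upstream restricted profile, not from this report. `restricted_violations` and both sample contexts are hypothetical.

```python
# Capabilities the upstream restricted profile still allows a container to add.
ALLOWED_ADDED_CAPS = {"NET_BIND_SERVICE"}


def restricted_violations(sc: dict) -> list[str]:
    """Return the restricted-profile rules a container securityContext breaks."""
    found = []
    if sc.get("allowPrivilegeEscalation") is not False:
        found.append("allowPrivilegeEscalation != false")
    caps = sc.get("capabilities") or {}
    if "ALL" not in caps.get("drop", []):
        found.append('capabilities.drop must include "ALL"')
    extra = set(caps.get("add", [])) - ALLOWED_ADDED_CAPS
    if extra:
        found.append("forbidden added capabilities: " + ", ".join(sorted(extra)))
    if sc.get("runAsNonRoot") is not True:
        found.append("runAsNonRoot != true")
    if sc.get("runAsUser") == 0:
        found.append("runAsUser=0")
    seccomp = (sc.get("seccompProfile") or {}).get("type")
    if seccomp not in ("RuntimeDefault", "Localhost"):
        found.append("seccompProfile")
    return found


# Hypothetical context resembling what the error reports for "compute".
root_like = {"runAsUser": 0, "capabilities": {"add": ["SYS_NICE"]}}

# Hypothetical context that satisfies every check above; the UID is arbitrary.
compliant = {
    "allowPrivilegeEscalation": False,
    "capabilities": {"drop": ["ALL"]},
    "runAsNonRoot": True,
    "runAsUser": 1000,
    "seccompProfile": {"type": "RuntimeDefault"},
}

print(restricted_violations(root_like))
print(restricted_violations(compliant))
```

The first call lists every rule the root-like context breaks, and the second returns an empty list.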

      Additional info:
      All the logs are available here:
      https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/periodic-ci-openshift-release-master-nightly-4.12-e2e-azure-upgrade-cnv/1559052348230209536/artifacts/e2e-azure-upgrade-cnv/test/artifacts/cnv-must-gather-vms/

      — Additional comment from on 2022-08-19 10:12:10 UTC —

      Hi Simone,
      Are you using Openshift-cnv for this test? I see that your workload is running as root, which is definitely wrong (a different problem, but I want to make sure...).

      — Additional comment from Simone Tiraboschi on 2022-08-22 07:11:31 UTC —

      Yes, it's a CI lane testing the latest OCP compose with an up-to-date CNV.
      We use an intermediate repo here: https://github.com/openshift-cnv/cnv-ci

      — Additional comment from on 2022-08-22 13:32:17 UTC —

      "compute" must not include "SYS_NICE" & "runAsUser=0 (pod and containers "container-disk-binary","compute" must not set runAsUser=0)" suggest that we are not running the rootless implementation. Can you please verify this? It would be a much bigger problem.

      — Additional comment from on 2022-08-23 17:34:09 UTC —

      Proposing this as a blocker.

      — Additional comment from on 2022-08-23 17:52:54 UTC —

      Rationale for proposing this as a blocker is that it doesn't just affect testing. VMs really are disrupted.

      — Additional comment from Simone Tiraboschi on 2022-08-25 14:30:00 UTC —

      (In reply to lpivarc from comment #3)
      > "compute" must not include "SYS_NICE" & "runAsUser=0 (pod and containers
      > "container-disk-binary","compute" must not set runAsUser=0)" suggest that we
      > are not running rootless implementation. Can you please verify this? It
      > would be a much bigger problem.

      That specific lane is an upgrade lane.
      It tests the upgrade from the latest CNV version (v4.10.4) to the candidate one for this specific OCP version (so CNV v4.12.0 on OCP v4.12).

      AFAIK we set the NonRoot FG starting with CNV 4.11.0 but not on CNV 4.10.z.
      So potentially we will not see this on CNV 4.11.z -> CNV 4.12.z upgrades if all the VMs run rootless.

      But please be aware that this is still going to affect EUS to EUS upgrades (4.10 -> 4.12 is the first one for us), where the cluster admin is supposed to pause the worker MachineConfigPool to skip node reboots on 4.11.
      See:
      https://issues.redhat.com/browse/CNV-13992

      — Additional comment from Antonio Cardace on 2022-09-22 08:58:53 UTC —

      While Lubo merged the required changes in KubeVirt, we're still missing a change in HCO to enable the PSA feature gate by default. @stirabos@redhat.com, can you comment here when you post the HCO PR?

            People

              lpivarc Luboslav Pivarc
              acardace@redhat.com Antonio Cardace
              Akriti gupta Akriti gupta