OCPBUGS-32230 (OpenShift Bugs)

OpenShift update is not starting due to an existing custom SCC

    • Bug
    • Resolution: Duplicate
    • Undefined
    • None
    • 4.12.0
    • None
    • Important
    • No
    • False

      Description of problem:

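      A custom SCC prevents the update from starting because the 'version-<new-release>-#####-#####' pod is admitted under this SCC instead of the expected default SCC. The custom SCC sets 'runAsUser' to 'MustRunAsNonRoot', which conflicts with the pod's image, which runs as root.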
      
      

      Version-Release number of selected component (if applicable):

      Detected while performing an update from 4.12.0.

      How reproducible:

      Always

      Steps to Reproduce:

      1 - Create the following SCC on a 4.12.0 cluster (the version tested in the lab):

      allowHostDirVolumePlugin: true
      allowHostIPC: true
      allowHostNetwork: true
      allowHostPID: true
      allowHostPorts: false
      allowPrivilegeEscalation: true
      allowPrivilegedContainer: true
      allowedCapabilities: null
      apiVersion: security.openshift.io/v1
      defaultAddCapabilities: null
      fsGroup:
        type: RunAsAny
      groups: []
      kind: SecurityContextConstraints
      metadata:
        annotations:
          kubernetes.io/description: trident-controller is a clone of the privileged built-in,
            and is meant just for use with trident.
        creationTimestamp: "2023-01-13T12:23:44Z"
        generation: 2
        labels:
          app: controller.csi.trident.netapp.io
        name: trident-controller
        resourceVersion: "30699854"
        uid: 7d6ffcba-0221-4886-ad77-b8c739884d43
      priority: null
      readOnlyRootFilesystem: false
      requiredDropCapabilities: null
      runAsUser:
        type: MustRunAsNonRoot
      seLinuxContext:
        type: RunAsAny
      supplementalGroups:
        type: RunAsAny
      users:
      - system:serviceaccount:trident:trident-controller
      volumes:
      - downwardAPI
      - emptyDir
      - hostPath
      - projected
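
The manifest above is an export from a live cluster, so it carries server-populated fields ('creationTimestamp', 'generation', 'resourceVersion', 'uid'). To my knowledge the API server rejects a create request that sets 'resourceVersion', so these lines probably need to be dropped before the SCC can be created on a test cluster. A minimal sketch, assuming the manifest is saved as trident-controller-scc.yaml (a hypothetical file name); the function name is illustrative and not part of oc:

```shell
# Drop server-populated metadata lines from an exported manifest read on stdin.
# strip_server_fields is a hypothetical helper name, not an oc feature.
# The filter is line-based: it assumes these keys appear only under metadata,
# as they do in the manifest above.
strip_server_fields() {
  grep -Ev '^[[:space:]]*(creationTimestamp|generation|resourceVersion|uid):'
}
```

With a logged-in session that is allowed to create SCCs, the filtered manifest could then be piped to oc: `strip_server_fields < trident-controller-scc.yaml | oc create -f -`.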
       

      2 - Check the existing SCCs:

      $ oc get scc
      NAME                              PRIV    CAPS                   SELINUX     RUNASUSER          FSGROUP     SUPGROUP    PRIORITY     READONLYROOTFS   VOLUMES
      anyuid                            false   <no value>             MustRunAs   RunAsAny           RunAsAny    RunAsAny    10           false            ["configMap","downwardAPI","emptyDir","ephemeral","persistentVolumeClaim","projected","secret"]
      hostaccess                        false   <no value>             MustRunAs   MustRunAsRange     MustRunAs   RunAsAny    <no value>   false            ["configMap","downwardAPI","emptyDir","ephemeral","hostPath","persistentVolumeClaim","projected","secret"]
      hostmount-anyuid                  false   <no value>             MustRunAs   RunAsAny           RunAsAny    RunAsAny    <no value>   false            ["configMap","downwardAPI","emptyDir","ephemeral","hostPath","nfs","persistentVolumeClaim","projected","secret"]
      hostnetwork                       false   <no value>             MustRunAs   MustRunAsRange     MustRunAs   MustRunAs   <no value>   false            ["configMap","downwardAPI","emptyDir","ephemeral","persistentVolumeClaim","projected","secret"]
      hostnetwork-v2                    false   ["NET_BIND_SERVICE"]   MustRunAs   MustRunAsRange     MustRunAs   MustRunAs   <no value>   false            ["configMap","downwardAPI","emptyDir","ephemeral","persistentVolumeClaim","projected","secret"]
      machine-api-termination-handler   false   <no value>             MustRunAs   RunAsAny           MustRunAs   MustRunAs   <no value>   false            ["downwardAPI","hostPath"]
      node-exporter                     true    <no value>             RunAsAny    RunAsAny           RunAsAny    RunAsAny    <no value>   false            ["*"]
      nonroot                           false   <no value>             MustRunAs   MustRunAsNonRoot   RunAsAny    RunAsAny    <no value>   false            ["configMap","downwardAPI","emptyDir","ephemeral","persistentVolumeClaim","projected","secret"]
      nonroot-v2                        false   ["NET_BIND_SERVICE"]   MustRunAs   MustRunAsNonRoot   RunAsAny    RunAsAny    <no value>   false            ["configMap","downwardAPI","emptyDir","ephemeral","persistentVolumeClaim","projected","secret"]
      privileged                        true    ["*"]                  RunAsAny    RunAsAny           RunAsAny    RunAsAny    <no value>   false            ["*"]
      restricted                        false   <no value>             MustRunAs   MustRunAsRange     MustRunAs   RunAsAny    <no value>   false            ["configMap","downwardAPI","emptyDir","ephemeral","persistentVolumeClaim","projected","secret"]
      restricted-v2                     false   ["NET_BIND_SERVICE"]   MustRunAs   MustRunAsRange     MustRunAs   RunAsAny    <no value>   false            ["configMap","downwardAPI","emptyDir","ephemeral","persistentVolumeClaim","projected","secret"]
      trident-controller                true    <no value>             RunAsAny    MustRunAsNonRoot   RunAsAny    RunAsAny    <no value>   false            ["downwardAPI","emptyDir","hostPath","projected"] 

      3 - Run an update:

      $ oc adm upgrade --to=4.12.27 

      4 - Check that the 'version*' pod in the 'openshift-cluster-version' namespace is in 'Init:CreateContainerConfigError' status and that it has been created with the custom SCC. Check the events in this namespace:

      $ oc get pods -n openshift-cluster-version
      NAME                                        READY   STATUS                            RESTARTS   AGE
      cluster-version-operator-7b5c84fddc-pmgjz   1/1     Running                           0          27m
      version-4.12.27-t7l9k-ddfcj                 0/1     Init:CreateContainerConfigError   0          16s
      
      $ oc get pod version-4.12.27-t7l9k-ddfcj -n openshift-cluster-version -o yaml | grep scc
          openshift.io/scc: trident-controller
      
      $ oc get events -n openshift-cluster-version
      [...]
      100s        Normal    SuccessfulCreate    job/version-4.12.27-t7l9k                        Created pod: version-4.12.27-t7l9k-ddfcj
      100s        Normal    RetrievePayload     clusterversion/version                           Retrieving and verifying payload version="4.12.27" image="quay.io/openshift-release-dev/ocp-release@sha256:e15e52f22247b833d1db59b1507fa67d920e39b75297bc3a74f3f15e560d6d02"
      99s         Normal    Pulling             pod/version-4.12.27-t7l9k-ddfcj                  Pulling image "quay.io/openshift-release-dev/ocp-release@sha256:e15e52f22247b833d1db59b1507fa67d920e39b75297bc3a74f3f15e560d6d02"
      99s         Normal    AddedInterface      pod/version-4.12.27-t7l9k-ddfcj                  Add eth0 [10.130.0.33/23] from ovn-kubernetes
      7s          Normal    Pulled              pod/version-4.12.27-t7l9k-ddfcj                  Container image "quay.io/openshift-release-dev/ocp-release@sha256:e15e52f22247b833d1db59b1507fa67d920e39b75297bc3a74f3f15e560d6d02" already present on machine
      7s          Warning   Failed              pod/version-4.12.27-t7l9k-ddfcj                  Error: container has runAsNonRoot and image will run as root (pod: "version-4.12.27-t7l9k-ddfcj_openshift-cluster-version(78cc5545-fb00-43cd-a0ed-5ad9ec8646e8)", container: cleanup)
      85s         Normal    Pulled              pod/version-4.12.27-t7l9k-ddfcj                  Successfully pulled image "quay.io/openshift-release-dev/ocp-release@sha256:e15e52f22247b833d1db59b1507fa67d920e39b75297bc3a74f3f15e560d6d02" in 14.014159242s
      
      [...]
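
The checks in step 4 can also be written as two small queries: one that prints the admitting SCC next to each pod name, and one that limits the event list to warnings. This is a sketch using the standard '-o custom-columns' and '--field-selector' options of 'oc get'. The function names are illustrative, and the annotation key 'openshift.io/scc' is the one shown in the pod YAML above:

```shell
# Print each pod in a namespace together with the SCC it was admitted under.
# pods_with_scc is a hypothetical helper name; $1 is the namespace.
pods_with_scc() {
  oc get pods -n "$1" \
    -o 'custom-columns=NAME:.metadata.name,SCC:.metadata.annotations.openshift\.io/scc'
}

# List only Warning events in a namespace (for example the runAsNonRoot failure).
# warning_events is a hypothetical helper name; $1 is the namespace.
warning_events() {
  oc get events -n "$1" --field-selector type=Warning
}
```

For this report both would be called with 'openshift-cluster-version' as the argument.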
       

      5 - Tested workarounds:

          1. Delete the custom SCC and recreate it after the update is performed.
          2. Edit the custom SCC and set 'runAsUser.type' to 'RunAsAny' instead of 'MustRunAsNonRoot'.
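
The two workarounds can be sketched as oc commands. The helper names and the backup file name are assumptions made for illustration, and the report does not state the exact commands that were used. For workaround 1, the exported backup still contains server-populated fields such as 'resourceVersion', which would likely have to be removed before the SCC is recreated with 'oc create -f'.

```shell
# Workaround 1 (sketch): export the SCC to <name>-backup.yaml, then delete it.
# backup_and_delete_scc is a hypothetical helper name; $1 is the SCC name.
# The delete only runs if the export succeeded.
backup_and_delete_scc() {
  oc get scc "$1" -o yaml > "$1-backup.yaml" && oc delete scc "$1"
}

# Workaround 2 (sketch): change runAsUser.type in place with a JSON merge patch.
# set_scc_run_as_user is a hypothetical helper name; $1 is the SCC name,
# $2 the desired type (for example RunAsAny).
set_scc_run_as_user() {
  oc patch scc "$1" --type=merge -p "{\"runAsUser\":{\"type\":\"$2\"}}"
}
```

For the SCC in this report, workaround 2 would be `set_scc_run_as_user trident-controller RunAsAny`. Whether the value should be set back to 'MustRunAsNonRoot' after the update is not covered by the report.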

      Actual results:

      The update is not progressing.

      Expected results:

      The update process should be able to select the expected default SCC for the 'version*' pod, even when a custom SCC exists.

      Additional info:

      A similar issue has been reported in solution 6969777 and BZ 2110590, but in this case the issue is caused by the 'runAsUser' parameter.

              Unassigned
              rhn-support-malonso Maria Del Mar Alonso
              Jia Liu