OCPBUGS-7325: OSUS failed to install on an Arm node in a multi-arch cluster

      Description of problem:

      Installing an UpdateService instance failed on a 4.12 multi-arch cluster. The operator log shows the updateservice controller failing to update the UpdateService status, and the operand pod stays stuck in its graph-data init container.
      
      # oc logs updateservice-operator-6bbdd6ccf6-bjhqk
      ......
      
      1.676029608397714e+09   ERROR   controller_updateservice        Failed to update Status {"Request.Namespace": "openshift-update-service", "Request.Name": "sample", "error": "Operation cannot be fulfilled on updateservices.updateservice.operator.openshift.io \"sample\": the object has been modified; please apply your changes to the latest version and try again"}
      github.com/openshift/cincinnati-operator/controllers.(*UpdateServiceReconciler).Reconcile
              /remote-source/app/controllers/updateservice_controller.go:186
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
              /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:121
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
              /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:320
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
              /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:273
      sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
              /remote-source/app/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:234
      1.6760296083978007e+09  INFO    controller_updateservice        Reconciling UpdateService       {"Request.Namespace": "openshift-update-service", "Request.Name": "sample"}
      ......
      
      # oc get pod
      NAME                                      READY   STATUS                  RESTARTS     AGE
      sample-856879d565-k7scs                   0/2     Init:CrashLoopBackOff   1 (1s ago)   6s
      updateservice-operator-6bbdd6ccf6-bjhqk   1/1     Running                 0            21m
      
      
      # oc describe pod/sample-856879d565-k7scs
      ...
      Events:
        Type     Reason          Age                            From               Message
        ----     ------          ----                           ----               -------
        Normal   Scheduled       <invalid>                      default-scheduler  Successfully assigned openshift-update-service/sample-856879d565-k7scs to ip-10-0-136-119.us-east-2.compute.internal
        Normal   AddedInterface  <invalid>                      multus             Add eth0 [10.131.2.10/23] from openshift-sdn
        Normal   Pulled          <invalid>                      kubelet            Successfully pulled image "quay.io/openshifttest/graph-data:5.0.1" in 530.026937ms
        Normal   Pulled          <invalid>                      kubelet            Successfully pulled image "quay.io/openshifttest/graph-data:5.0.1" in 938.053952ms
        Normal   Pulling         <invalid> (x3 over <invalid>)  kubelet            Pulling image "quay.io/openshifttest/graph-data:5.0.1"
        Normal   Created         <invalid> (x3 over <invalid>)  kubelet            Created container graph-data
        Normal   Started         <invalid> (x3 over <invalid>)  kubelet            Started container graph-data
        Warning  BackOff         <invalid> (x3 over <invalid>)  kubelet            Back-off restarting failed container
        Normal   Pulled          <invalid>                      kubelet            Successfully pulled image "quay.io/openshifttest/graph-data:5.0.1" in 584.459518ms
      
      
      

      Version-Release number of selected component (if applicable):

      cincinnati-container-v5.0.1-3
      cincinnati-operator-bundle-container-v5.0.1-1
      cincinnati-operator-container-v5.0.1-3

      How reproducible:

      2/2

      Steps to Reproduce:

      1. Install a 4.12 multi-arch cluster
      2. Install OSUS 5.0.1 on the cluster
      

      Actual results:

      The graph-data init container fails to start (Init:CrashLoopBackOff), so the operand pod never becomes ready.
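
The report does not include the init container's own log, so the reason graph-data keeps exiting is not established here. Since the title points at an Arm node, one hypothesis worth ruling out is that the graph-data image has no variant for that node's architecture. The sketch below is a minimal Python helper under that assumption: it lists the architectures found in a manifest list or OCI image index that was saved beforehand (for example with `skopeo inspect --raw docker://quay.io/openshifttest/graph-data:5.0.1`). The function names and the sample JSON are hypothetical; the sample is not the real manifest of the image in this report.

```python
import json
from typing import List


def manifest_architectures(raw_manifest: str) -> List[str]:
    """Return the architectures listed in a manifest list / OCI image index.

    A single-image manifest has no "manifests" key, so the result is an
    empty list in that case.
    """
    doc = json.loads(raw_manifest)
    archs: List[str] = []
    for entry in doc.get("manifests", []):
        arch = entry.get("platform", {}).get("architecture")
        if arch and arch not in archs:
            archs.append(arch)
    return archs


def supports_arch(raw_manifest: str, node_arch: str) -> bool:
    """True if the manifest list explicitly offers an image for node_arch."""
    return node_arch in manifest_architectures(raw_manifest)


# Hypothetical sample data for illustration only (assumption, not taken
# from the image referenced in this report).
SAMPLE_INDEX = json.dumps({
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.list.v2+json",
    "manifests": [
        {"digest": "sha256:aaaa", "platform": {"architecture": "amd64", "os": "linux"}},
        {"digest": "sha256:bbbb", "platform": {"architecture": "arm64", "os": "linux"}},
    ],
})

if __name__ == "__main__":
    print(manifest_architectures(SAMPLE_INDEX))  # prints ['amd64', 'arm64']
    print(supports_arch(SAMPLE_INDEX, "arm64"))  # prints True
```

The result can be compared with the node's architecture as reported by `oc get node <node> -o jsonpath='{.status.nodeInfo.architecture}'`. An empty list means the tag points to a single-architecture image, whose architecture is recorded in the image config rather than in the manifest.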

      Expected results:

      OSUS is installed successfully

      Additional info:

      # oc get clusterversion version -ojson|jq -r '.status.conditions[]|select(.type == "ReleaseAccepted")|.message'
      Payload loaded version="4.12.1" image="quay.io/openshift-release-dev/ocp-release@sha256:9bd2356f06d00d756277a11d32ed89100c30c65dc0a5105196471db3dfe269ff" architecture="multi"
      
      # oc get pod sample-856879d565-k7scs -oyaml
      apiVersion: v1
      kind: Pod
      metadata:
        annotations:
          k8s.v1.cni.cncf.io/network-status: |-
            [{
                "name": "openshift-sdn",
                "interface": "eth0",
                "ips": [
                    "10.131.2.10"
                ],
                "default": true,
                "dns": {}
            }]
          k8s.v1.cni.cncf.io/networks-status: |-
            [{
                "name": "openshift-sdn",
                "interface": "eth0",
                "ips": [
                    "10.131.2.10"
                ],
                "default": true,
                "dns": {}
            }]
          openshift.io/scc: restricted-v2
          seccomp.security.alpha.kubernetes.io/pod: runtime/default
          updateservice.operator.openshift.io/env-config-hash: 9668c9f2de012dc4544b9433a4ea8162d849d350b45362601caae7a66609f933
          updateservice.operator.openshift.io/graph-builder-config-hash: d3b58e50f2493b57b647952c3abcbf3dbbed775c9645ad600ea44a8a60a4db24
        creationTimestamp: "2023-02-10T11:46:48Z"
        generateName: sample-856879d565-
        labels:
          app: sample
          deployment: sample
          pod-template-hash: 856879d565
        name: sample-856879d565-k7scs
        namespace: openshift-update-service
        ownerReferences:
        - apiVersion: apps/v1
          blockOwnerDeletion: true
          controller: true
          kind: ReplicaSet
          name: sample-856879d565
          uid: f5866fb0-be7b-47ba-b995-16dde6d8658a
        resourceVersion: "100513"
        uid: 27839046-b295-4049-91ad-2837071f1196
      spec:
        containers:
        - args:
          - -c
          - /etc/configs/gb.toml
          command:
          - /usr/bin/graph-builder
          env:
          - name: RUST_BACKTRACE
            valueFrom:
              configMapKeyRef:
                key: gb.rust_backtrace
                name: sample-env
          image: registry.redhat.io/openshift-update-service/openshift-update-service-rhel8@sha256:e1f2095b56a9c942906510a988af30b3bf9537e5de5cc247c0f8e77ce8b9fc3f
          imagePullPolicy: IfNotPresent
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /liveness
              port: 9080
              scheme: HTTP
            initialDelaySeconds: 3
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 3
          name: graph-builder
          ports:
          - containerPort: 8080
            name: graph-builder
            protocol: TCP
          - containerPort: 9080
            name: status-gb
            protocol: TCP
          readinessProbe:
            failureThreshold: 3
            httpGet:
              path: /readiness
              port: 9080
              scheme: HTTP
            initialDelaySeconds: 3
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 3
          resources:
            limits:
              cpu: 750m
              memory: 512Mi
            requests:
              cpu: 150m
              memory: 64Mi
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - ALL
            runAsNonRoot: true
            runAsUser: 1000670000
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /etc/configs
            name: configs
            readOnly: true
          - mountPath: /var/lib/cincinnati/graph-data
            name: cincinnati-graph-data
          - mountPath: /var/lib/cincinnati/registry-credentials
            name: pull-secret
            readOnly: true
          - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
            name: kube-api-access-j4lgq
            readOnly: true
        - args:
          - -$(PE_LOG_VERBOSITY)
          - --service.address
          - $(ADDRESS)
          - --service.mandatory_client_parameters
          - $(PE_MANDATORY_CLIENT_PARAMETERS)
          - --service.path_prefix
          - /api/upgrades_info
          - --service.port
          - "8081"
          - --status.address
          - $(PE_STATUS_ADDRESS)
          - --status.port
          - "9081"
          - --upstream.cincinnati.url
          - $(UPSTREAM)
          command:
          - /usr/bin/policy-engine
          env:
          - name: ADDRESS
            valueFrom:
              configMapKeyRef:
                key: pe.address
                name: sample-env
          - name: PE_STATUS_ADDRESS
            valueFrom:
              configMapKeyRef:
                key: pe.status.address
                name: sample-env
          - name: UPSTREAM
            valueFrom:
              configMapKeyRef:
                key: pe.upstream
                name: sample-env
          - name: PE_LOG_VERBOSITY
            valueFrom:
              configMapKeyRef:
                key: pe.log.verbosity
                name: sample-env
          - name: PE_MANDATORY_CLIENT_PARAMETERS
            valueFrom:
              configMapKeyRef:
                key: pe.mandatory_client_parameters
                name: sample-env
          - name: RUST_BACKTRACE
            valueFrom:
              configMapKeyRef:
                key: pe.rust_backtrace
                name: sample-env
          image: registry.redhat.io/openshift-update-service/openshift-update-service-rhel8@sha256:e1f2095b56a9c942906510a988af30b3bf9537e5de5cc247c0f8e77ce8b9fc3f
          imagePullPolicy: IfNotPresent
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /metrics
              port: 9081
              scheme: HTTP
            initialDelaySeconds: 3
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 3
          name: policy-engine
          ports:
          - containerPort: 8081
            name: policy-engine
            protocol: TCP
          - containerPort: 9081
            name: status-pe
            protocol: TCP
          resources:
            limits:
              cpu: 750m
              memory: 512Mi
            requests:
              cpu: 150m
              memory: 64Mi
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - ALL
            runAsNonRoot: true
            runAsUser: 1000670000
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
            name: kube-api-access-j4lgq
            readOnly: true
        dnsPolicy: ClusterFirst
        enableServiceLinks: true
        imagePullSecrets:
        - name: default-dockercfg-p2jsz
        initContainers:
        - image: quay.io/openshifttest/graph-data:5.0.1
          imagePullPolicy: Always
          name: graph-data
          resources: {}
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
              - ALL
            runAsNonRoot: true
            runAsUser: 1000670000
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          volumeMounts:
          - mountPath: /var/lib/cincinnati/graph-data
            name: cincinnati-graph-data
          - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
            name: kube-api-access-j4lgq
            readOnly: true
        nodeName: ip-10-0-136-119.us-east-2.compute.internal
        preemptionPolicy: PreemptLowerPriority
        priority: 0
        restartPolicy: Always
        schedulerName: default-scheduler
        securityContext:
          fsGroup: 1000670000
          seLinuxOptions:
            level: s0:c26,c10
          seccompProfile:
            type: RuntimeDefault
        serviceAccount: default
        serviceAccountName: default
        terminationGracePeriodSeconds: 30
        tolerations:
        - effect: NoExecute
          key: node.kubernetes.io/not-ready
          operator: Exists
          tolerationSeconds: 300
        - effect: NoExecute
          key: node.kubernetes.io/unreachable
          operator: Exists
          tolerationSeconds: 300
        - effect: NoSchedule
          key: node.kubernetes.io/memory-pressure
          operator: Exists
        volumes:
        - configMap:
            defaultMode: 420
            name: sample-config
          name: configs
        - emptyDir: {}
          name: cincinnati-graph-data
        - name: pull-secret
          secret:
            defaultMode: 420
            secretName: sample-pull-secret
        - name: kube-api-access-j4lgq
          projected:
            defaultMode: 420
            sources:
            - serviceAccountToken:
                expirationSeconds: 3607
                path: token
            - configMap:
                items:
                - key: ca.crt
                  path: ca.crt
                name: kube-root-ca.crt
            - downwardAPI:
                items:
                - fieldRef:
                    apiVersion: v1
                    fieldPath: metadata.namespace
                  path: namespace
            - configMap:
                items:
                - key: service-ca.crt
                  path: service-ca.crt
                name: openshift-service-ca.crt
      status:
        conditions:
        - lastProbeTime: null
          lastTransitionTime: "2023-02-10T11:46:48Z"
          message: 'containers with incomplete status: [graph-data]'
          reason: ContainersNotInitialized
          status: "False"
          type: Initialized
        - lastProbeTime: null
          lastTransitionTime: "2023-02-10T11:46:48Z"
          message: 'containers with unready status: [graph-builder policy-engine]'
          reason: ContainersNotReady
          status: "False"
          type: Ready
        - lastProbeTime: null
          lastTransitionTime: "2023-02-10T11:46:48Z"
          message: 'containers with unready status: [graph-builder policy-engine]'
          reason: ContainersNotReady
          status: "False"
          type: ContainersReady
        - lastProbeTime: null
          lastTransitionTime: "2023-02-10T11:46:48Z"
          status: "True"
          type: PodScheduled
        containerStatuses:
        - image: registry.redhat.io/openshift-update-service/openshift-update-service-rhel8@sha256:e1f2095b56a9c942906510a988af30b3bf9537e5de5cc247c0f8e77ce8b9fc3f
          imageID: ""
          lastState: {}
          name: graph-builder
          ready: false
          restartCount: 0
          started: false
          state:
            waiting:
              reason: PodInitializing
        - image: registry.redhat.io/openshift-update-service/openshift-update-service-rhel8@sha256:e1f2095b56a9c942906510a988af30b3bf9537e5de5cc247c0f8e77ce8b9fc3f
          imageID: ""
          lastState: {}
          name: policy-engine
          ready: false
          restartCount: 0
          started: false
          state:
            waiting:
              reason: PodInitializing
        hostIP: 10.0.136.119
        initContainerStatuses:
        - containerID: cri-o://9e1c338010c6cb3c8bfebc47a5636223e17cac79b9ab0103d239d54f06abda40
          image: quay.io/openshifttest/graph-data:5.0.1
          imageID: quay.io/openshifttest/graph-data@sha256:1e5c1202d6829f76b16baad244b2e6ffa21e9759c62ed772e1b0ff2caa9640bb
          lastState:
            terminated:
              containerID: cri-o://9e1c338010c6cb3c8bfebc47a5636223e17cac79b9ab0103d239d54f06abda40
              exitCode: 1
              finishedAt: "2023-02-10T12:12:58Z"
              reason: Error
              startedAt: "2023-02-10T12:12:58Z"
          name: graph-data
          ready: false
          restartCount: 10
          state:
            waiting:
              message: back-off 5m0s restarting failed container=graph-data pod=sample-856879d565-k7scs_openshift-update-service(27839046-b295-4049-91ad-2837071f1196)
              reason: CrashLoopBackOff
        phase: Pending
        podIP: 10.131.2.10
        podIPs:
        - ip: 10.131.2.10
        qosClass: Burstable
        startTime: "2023-02-10T11:46:48Z"
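
The relevant part of the pod status above is the `initContainerStatuses` entry for graph-data: exit code 1, ten restarts, and a CrashLoopBackOff wait state. As a triage aid, the following minimal Python sketch pulls exactly those fields out of a pod object, assuming the pod was saved as JSON (for example with `oc get pod sample-856879d565-k7scs -o json`). The function name is hypothetical, and the embedded `POD` dictionary is trimmed to the fields used, with values copied from the status in this report.

```python
from typing import Dict, List


def failing_init_containers(pod: Dict) -> List[Dict]:
    """Summarize init containers that are crash-looping or last exited non-zero."""
    failing = []
    for status in pod.get("status", {}).get("initContainerStatuses", []):
        waiting = status.get("state", {}).get("waiting") or {}
        terminated = status.get("lastState", {}).get("terminated") or {}
        crash_looping = waiting.get("reason") == "CrashLoopBackOff"
        bad_exit = terminated.get("exitCode", 0) != 0
        if crash_looping or bad_exit:
            failing.append({
                "name": status["name"],
                "restartCount": status.get("restartCount", 0),
                "exitCode": terminated.get("exitCode"),
                "reason": waiting.get("reason"),
            })
    return failing


# Trimmed pod object; only the fields read above are kept.
POD = {
    "status": {
        "initContainerStatuses": [
            {
                "name": "graph-data",
                "restartCount": 10,
                "lastState": {"terminated": {"exitCode": 1, "reason": "Error"}},
                "state": {"waiting": {"reason": "CrashLoopBackOff"}},
            }
        ]
    }
}

if __name__ == "__main__":
    for item in failing_init_containers(POD):
        print(item)
```

For a live pod, the dictionary would come from `json.load()` on the saved file instead of the inline sample. The init container's own output, which this report does not show, can be retrieved with `oc logs sample-856879d565-k7scs -c graph-data --previous`.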

              lmohanty@redhat.com Lalatendu Mohanty
              yanyang@redhat.com Yang Yang