OpenShift Bugs / OCPBUGS-18578

tuned pods continue to have single arch image post --to-multi-arch migration


    • Bug
    • Resolution: Done-Errata
    • 4.14.0
    • Node Tuning Operator

      This is a clone of issue OCPBUGS-18480. The following is the description of the original issue:

      Description of problem:

      The test creates a cluster on amd64 with a single-arch payload, migrates it to the multi-architecture (manifest-listed) payload, and then adds ppc64le nodes to the cluster. After the migration, the tuned pods in openshift-cluster-node-tuning-operator still use the single-arch image. The tuned pods on the added ppc64le nodes crash because that image is amd64-only.

      Version-Release number of selected component (if applicable):

      # oc get clusterversion
      NAME      VERSION       AVAILABLE   PROGRESSING   SINCE   STATUS
      version   4.14.0-ec.4   True        False         4d      Error while reconciling 4.14.0-ec.4: the cluster operator image-registry is degraded

      Steps to Reproduce:

      1. Install OpenShift 4.14.0-ec.4 single arch (amd64) (must be a stable release).
      2. Migrate to the multi-arch payload using "oc adm upgrade channel candidate-4.14 && oc adm upgrade --to-multi-arch".
      3. Once the cluster is upgraded, add ppc64le nodes to the cluster.
      
      # oc get nodes
      NAME                                            STATUS   ROLES                  AGE    VERSION
      rdr-800-syd-vpc-k5b4m-master-0                  Ready    control-plane,master   4d2h   v1.27.3+4aaeaec
      rdr-800-syd-vpc-k5b4m-master-1                  Ready    control-plane,master   4d2h   v1.27.3+4aaeaec
      rdr-800-syd-vpc-k5b4m-master-2                  Ready    control-plane,master   4d2h   v1.27.3+4aaeaec
      rdr-800-syd-vpc-k5b4m-worker-1-6mmh9            Ready    worker                 4d1h   v1.27.3+4aaeaec
      rdr-800-syd-vpc-k5b4m-worker-1-kr9nk            Ready    worker                 4d1h   v1.27.3+4aaeaec
      syd05-worker-0.test-ocp-33ab.example.internal   Ready    worker                 40h    v1.27.3+4aaeaec
      syd05-worker-1.test-ocp-33ab.example.internal   Ready    worker                 40h    v1.27.3+4aaeaec

      Actual results:

      After the ppc64le nodes are added, the tuned pods on them crash because the tuned image contains only the amd64 payload. The other pods on these nodes are created successfully.

      # oc get po -A -owide | grep tuned | grep Crash
      openshift-cluster-node-tuning-operator             tuned-ldt8m                                                     0/1     CrashLoopBackOff   482 (26s ago)    40h    192.168.200.12   syd05-worker-1.test-ocp-33ab.example.internal   <none>           <none>
      openshift-cluster-node-tuning-operator             tuned-mkwmx                                                     0/1     CrashLoopBackOff   481 (4m ago)     40h    192.168.200.13   syd05-worker-0.test-ocp-33ab.example.internal   <none>           <none>
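The double grep above matches any line that happens to contain both strings. A slightly more targeted filter could key on the pod name and STATUS columns instead. The following is a minimal sketch, assuming the column layout shown in the output above (node name as the third field from the end); the function name `crashing_tuned_pods` is hypothetical.

```shell
# Hypothetical helper: print "<pod> <node>" for tuned pods in CrashLoopBackOff.
# Assumes the `oc get po -A -owide` layout shown above:
#   NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED-NODE READINESS-GATES
crashing_tuned_pods() {
  # $2 is the pod name and $4 the status. RESTARTS can contain spaces
  # (e.g. "482 (26s ago)"), so the node column is addressed from the
  # end of the line rather than by a fixed index.
  awk '$2 ~ /^tuned-/ && $4 == "CrashLoopBackOff" { print $2, $(NF-2) }'
}

# Usage (requires cluster access):
#   oc get po -A -owide | crashing_tuned_pods
```

Reading from stdin keeps the filter usable on saved output as well as on a live cluster.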

      Expected results:

      The tuned pods should use a multi-arch image and run successfully.

      Additional info:

      # oc describe pod tuned-ldt8m -n openshift-cluster-node-tuning-operator
      Name:                 tuned-ldt8m
      Namespace:            openshift-cluster-node-tuning-operator
      Priority:             2000001000
      Priority Class Name:  system-node-critical
      Service Account:      tuned
      Node:                 syd05-worker-1.test-ocp-33ab.example.internal/192.168.200.12
      Start Time:           Fri, 01 Sep 2023 09:15:58 -0700
      Labels:               controller-revision-hash=df55cd6f9
                            openshift-app=tuned
                            pod-template-generation=1
      Annotations:          openshift.io/scc: privileged
      Status:               Running
      IP:                   192.168.200.12
      IPs:
        IP:           192.168.200.12
      Controlled By:  DaemonSet/tuned
      Containers:
        tuned:
          Container ID:  cri-o://6853091d5272af7d9d4a0a9f207282a46c93a4f9dc536a99c344f27dc8c932d3
          Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb5ac9e611fd6c0465d0f091c78fc648d9c570b303faf9e0032d3018c722a7e9
          Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb5ac9e611fd6c0465d0f091c78fc648d9c570b303faf9e0032d3018c722a7e9
          Port:          <none>
          Host Port:     <none>
          Command:
            /var/lib/tuned/bin/run
            start
          State:       Waiting
            Reason:    CrashLoopBackOff
          Last State:  Terminated
            Reason:    Error
            Message:   exec /var/lib/tuned/bin/run: exec format error
            Exit Code:    1
            Started:      Sun, 03 Sep 2023 02:02:03 -0700
            Finished:     Sun, 03 Sep 2023 02:02:03 -0700
          Ready:          False
          Restart Count:  483
          Requests:
            cpu:     10m
            memory:  50Mi
          Environment:
            WATCH_NAMESPACE:  openshift-cluster-node-tuning-operator (v1:metadata.namespace)
            OCP_NODE_NAME:     (v1:spec.nodeName)
            RESYNC_PERIOD:    600
            RELEASE_VERSION:  4.14.0-ec.4
          Mounts:
            /etc/modprobe.d from etc-modprobe-d (rw)
            /etc/sysconfig from etc-sysconfig (rw)
            /etc/sysctl.conf from etc-sysctl-conf (ro)
            /etc/sysctl.d from etc-sysctl-d (ro)
            /etc/systemd from etc-systemd (rw)
            /host from host (rw)
            /lib/modules from lib-modules (ro)
            /run from run (rw)
            /sys from sys (rw)
            /var/lib/tuned/profiles-data from var-lib-tuned-profiles-data (rw)
            /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-plcsk (ro)
      Conditions:
        Type              Status
        Initialized       True
        Ready             False
        ContainersReady   False
        PodScheduled      True
      Volumes:
        etc-modprobe-d:
          Type:          HostPath (bare host directory volume)
          Path:          /etc/modprobe.d
          HostPathType:  Directory
        etc-sysconfig:
          Type:          HostPath (bare host directory volume)
          Path:          /etc/sysconfig
          HostPathType:  Directory
        etc-sysctl-d:
          Type:          HostPath (bare host directory volume)
          Path:          /etc/sysctl.d
          HostPathType:  Directory
        etc-sysctl-conf:
          Type:          HostPath (bare host directory volume)
          Path:          /etc/sysctl.conf
          HostPathType:  File
        etc-systemd:
          Type:          HostPath (bare host directory volume)
          Path:          /etc/systemd
          HostPathType:  Directory
        run:
          Type:          HostPath (bare host directory volume)
          Path:          /run
          HostPathType:  Directory
        sys:
          Type:          HostPath (bare host directory volume)
          Path:          /sys
          HostPathType:  Directory
        lib-modules:
          Type:          HostPath (bare host directory volume)
          Path:          /lib/modules
          HostPathType:  Directory
        host:
          Type:          HostPath (bare host directory volume)
          Path:          /
          HostPathType:  Directory
        var-lib-tuned-profiles-data:
          Type:      ConfigMap (a volume populated by a ConfigMap)
          Name:      tuned-profiles
          Optional:  true
        kube-api-access-plcsk:
          Type:                    Projected (a volume that contains injected data from multiple sources)
          TokenExpirationSeconds:  3607
          ConfigMapName:           kube-root-ca.crt
          ConfigMapOptional:       <nil>
          DownwardAPI:             true
          ConfigMapName:           openshift-service-ca.crt
          ConfigMapOptional:       <nil>
      QoS Class:                   Burstable
      Node-Selectors:              kubernetes.io/os=linux
      Tolerations:                 op=Exists
      Events:
        Type     Reason   Age                      From     Message
        ----     ------   ----                     ----     -------
        Normal   Pulled   44m (x475 over 40h)      kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb5ac9e611fd6c0465d0f091c78fc648d9c570b303faf9e0032d3018c722a7e9" already present on machine
        Warning  BackOff  4m20s (x11271 over 40h)  kubelet  Back-off restarting failed container tuned in pod tuned-ldt8m_openshift-cluster-node-tuning-operator(27620d08-406d-4d1a-86e6-4d229a49da80)
      
      # oc logs tuned-ldt8m -n openshift-cluster-node-tuning-operator
      exec /var/lib/tuned/bin/run: exec format error
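An "exec format error" generally means the kernel could not execute the binary, which is consistent with an amd64 binary being started on a ppc64le node. To compare a node's architecture with an image's, the `uname -m` value can be normalised to the architecture names used in container image metadata. This is a minimal sketch; the function name `to_image_arch` is hypothetical and only a few architectures are mapped.

```shell
# Hypothetical helper: map a `uname -m` machine name to the architecture
# name used in container image metadata (e.g. x86_64 -> amd64).
to_image_arch() {
  case "$1" in
    x86_64)  echo "amd64" ;;
    aarch64) echo "arm64" ;;
    ppc64le) echo "ppc64le" ;;
    s390x)   echo "s390x" ;;
    *)       echo "unknown" ;;
  esac
}

# Usage, on the node itself:
#   to_image_arch "$(uname -m)"
```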
      
      # podman manifest inspect quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb5ac9e611fd6c0465d0f091c78fc648d9c570b303faf9e0032d3018c722a7e9
      WARN[0001] The manifest type application/vnd.docker.distribution.manifest.v2+json is not a manifest list but a single image.
      {
          "schemaVersion": 2,
          "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
          "config": {
              "mediaType": "application/vnd.docker.container.image.v1+json",
              "size": 24880,
              "digest": "sha256:da83d2f46179b9993c068f8bdc9dfd998b5535b298beeb0061a7d7f3c9ae47c3"
          },
          "layers": [
              {
                  "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                  "size": 78172747,
                  "digest": "sha256:ca1636478fe5b8e2a56600e24d6759147feb15020824334f4a798c1cb6ed58e2"
              },
              {
                  "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                  "size": 27523684,
                  "digest": "sha256:7a1cc7533110153469f5d33f52378e003a43c555d584cd3590930c4c8b77d3d2"
              },
              {
                  "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                  "size": 9184475,
                  "digest": "sha256:c8ba2360cffa7b96adfac8bac675c32de36714b5fc227d3f81e044ed3d68da89"
              },
              {
                  "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
                  "size": 123811857,
                  "digest": "sha256:5740df6957787c6afd1c8ca483306f10383b1a1de704c9c3f47f65a733251cf7"
              }
          ]
      }
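The warning above is the key signal: the digest resolves to a single image manifest, not a manifest list. The same check can be scripted against the JSON by looking at the media type. This is a minimal sketch that assumes the input is the JSON printed by `podman manifest inspect`; the function name `classify_manifest` is hypothetical, and the match is a plain substring test, not a full JSON parse.

```shell
# Hypothetical helper: classify manifest JSON read from stdin as a
# manifest list (multi-arch) or a single image, based on its media type.
classify_manifest() (
  # Runs in a subshell so the variable does not leak into the caller.
  json="$(cat)"
  case "$json" in
    # List/index types are checked first: a manifest list also embeds
    # the single-image media type for each of its entries.
    *application/vnd.docker.distribution.manifest.list.v2+json*|*application/vnd.oci.image.index.v1+json*)
      echo "manifest-list" ;;
    *application/vnd.docker.distribution.manifest.v2+json*|*application/vnd.oci.image.manifest.v1+json*)
      echo "single-image" ;;
    *)
      echo "unknown" ;;
  esac
)

# Usage (requires registry access; IMAGE is a placeholder):
#   podman manifest inspect "$IMAGE" 2>/dev/null | classify_manifest
```

For the tuned image digest in this report, the media type shown above would classify as "single-image".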

      Must gather logs: https://drive.google.com/file/d/1qk27uwhfDwW2hipn-6EklzWFxHkUoHQm/view?usp=sharing

            jmencak Jiri Mencak
            openshift-crt-jira-prow OpenShift Prow Bot
            Liquan Cui Liquan Cui