-
Bug
-
Resolution: Done-Errata
-
Undefined
-
None
-
4.14.0
-
No
-
False
-
This is a clone of issue OCPBUGS-18480. The following is the description of the original issue:
—
Description of problem:
The test creates a cluster on amd64 with a single-arch payload, migrates it to the multi-architecture (manifest-listed) payload, and then adds ppc64le nodes to the cluster. After the migration, the tuned pods in openshift-cluster-node-tuning-operator still use a single-arch image. On the added ppc64le nodes, the tuned pods crash because that image is amd64-only.
Version-Release number of selected component (if applicable):
# oc get clusterversion
NAME      VERSION       AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.14.0-ec.4   True        False         4d      Error while reconciling 4.14.0-ec.4: the cluster operator image-registry is degraded
Steps to Reproduce:
1. Install OpenShift 4.14.0-ec.4 single arch (amd64) [must be a stable release].
2. Migrate to the multi-arch payload using "oc adm upgrade channel candidate-4.14 && oc adm upgrade --to-multi-arch".
3. Once the cluster is upgraded, add ppc64le nodes to the cluster.
# oc get nodes
NAME                                            STATUS   ROLES                  AGE    VERSION
rdr-800-syd-vpc-k5b4m-master-0                  Ready    control-plane,master   4d2h   v1.27.3+4aaeaec
rdr-800-syd-vpc-k5b4m-master-1                  Ready    control-plane,master   4d2h   v1.27.3+4aaeaec
rdr-800-syd-vpc-k5b4m-master-2                  Ready    control-plane,master   4d2h   v1.27.3+4aaeaec
rdr-800-syd-vpc-k5b4m-worker-1-6mmh9            Ready    worker                 4d1h   v1.27.3+4aaeaec
rdr-800-syd-vpc-k5b4m-worker-1-kr9nk            Ready    worker                 4d1h   v1.27.3+4aaeaec
syd05-worker-0.test-ocp-33ab.example.internal   Ready    worker                 40h    v1.27.3+4aaeaec
syd05-worker-1.test-ocp-33ab.example.internal   Ready    worker                 40h    v1.27.3+4aaeaec
Actual results:
After the ppc64le nodes are added, the tuned pods on them crash because the tuned image contains only the amd64 payload. The other pods on these nodes are created successfully.
# oc get po -A -owide | grep tuned | grep Crash
openshift-cluster-node-tuning-operator   tuned-ldt8m   0/1   CrashLoopBackOff   482 (26s ago)   40h   192.168.200.12   syd05-worker-1.test-ocp-33ab.example.internal   <none>   <none>
openshift-cluster-node-tuning-operator   tuned-mkwmx   0/1   CrashLoopBackOff   481 (4m ago)    40h   192.168.200.13   syd05-worker-0.test-ocp-33ab.example.internal   <none>   <none>
Expected results:
The tuned pods should use a multi-arch (manifest-listed) image and run successfully.
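The expected result can be phrased as a check: every node architecture in the cluster should appear among the platforms of the tuned image's manifest list. The following is a minimal sketch of that check, not the operator's actual logic. It assumes the manifest JSON follows the Docker manifest list / OCI image index layout (a `manifests` array whose entries carry `platform.architecture`); the function name `missing_architectures` is hypothetical.

```python
import json


def missing_architectures(manifest_json: str, node_archs: set[str]) -> set[str]:
    """Return the node architectures that the manifest does not list.

    Hypothetical helper for illustration only.
    """
    doc = json.loads(manifest_json)
    # A single-image manifest has no "manifests" array, so no architecture
    # can be confirmed from it and every node architecture is reported.
    entries = doc.get("manifests", [])
    covered = {entry.get("platform", {}).get("architecture") for entry in entries}
    return node_archs - covered
```

For the cluster in this report the node architectures would be amd64 and ppc64le; an empty result means the image covers both.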
Additional info:
# oc describe pod tuned-ldt8m -n openshift-cluster-node-tuning-operator
Name:                 tuned-ldt8m
Namespace:            openshift-cluster-node-tuning-operator
Priority:             2000001000
Priority Class Name:  system-node-critical
Service Account:      tuned
Node:                 syd05-worker-1.test-ocp-33ab.example.internal/192.168.200.12
Start Time:           Fri, 01 Sep 2023 09:15:58 -0700
Labels:               controller-revision-hash=df55cd6f9
                      openshift-app=tuned
                      pod-template-generation=1
Annotations:          openshift.io/scc: privileged
Status:               Running
IP:                   192.168.200.12
IPs:
  IP:           192.168.200.12
Controlled By:  DaemonSet/tuned
Containers:
  tuned:
    Container ID:  cri-o://6853091d5272af7d9d4a0a9f207282a46c93a4f9dc536a99c344f27dc8c932d3
    Image:         quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb5ac9e611fd6c0465d0f091c78fc648d9c570b303faf9e0032d3018c722a7e9
    Image ID:      quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb5ac9e611fd6c0465d0f091c78fc648d9c570b303faf9e0032d3018c722a7e9
    Port:          <none>
    Host Port:     <none>
    Command:
      /var/lib/tuned/bin/run
      start
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Message:      exec /var/lib/tuned/bin/run: exec format error
      Exit Code:    1
      Started:      Sun, 03 Sep 2023 02:02:03 -0700
      Finished:     Sun, 03 Sep 2023 02:02:03 -0700
    Ready:          False
    Restart Count:  483
    Requests:
      cpu:     10m
      memory:  50Mi
    Environment:
      WATCH_NAMESPACE:  openshift-cluster-node-tuning-operator (v1:metadata.namespace)
      OCP_NODE_NAME:     (v1:spec.nodeName)
      RESYNC_PERIOD:    600
      RELEASE_VERSION:  4.14.0-ec.4
    Mounts:
      /etc/modprobe.d from etc-modprobe-d (rw)
      /etc/sysconfig from etc-sysconfig (rw)
      /etc/sysctl.conf from etc-sysctl-conf (ro)
      /etc/sysctl.d from etc-sysctl-d (ro)
      /etc/systemd from etc-systemd (rw)
      /host from host (rw)
      /lib/modules from lib-modules (ro)
      /run from run (rw)
      /sys from sys (rw)
      /var/lib/tuned/profiles-data from var-lib-tuned-profiles-data (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-plcsk (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  etc-modprobe-d:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/modprobe.d
    HostPathType:  Directory
  etc-sysconfig:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/sysconfig
    HostPathType:  Directory
  etc-sysctl-d:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/sysctl.d
    HostPathType:  Directory
  etc-sysctl-conf:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/sysctl.conf
    HostPathType:  File
  etc-systemd:
    Type:          HostPath (bare host directory volume)
    Path:          /etc/systemd
    HostPathType:  Directory
  run:
    Type:          HostPath (bare host directory volume)
    Path:          /run
    HostPathType:  Directory
  sys:
    Type:          HostPath (bare host directory volume)
    Path:          /sys
    HostPathType:  Directory
  lib-modules:
    Type:          HostPath (bare host directory volume)
    Path:          /lib/modules
    HostPathType:  Directory
  host:
    Type:          HostPath (bare host directory volume)
    Path:          /
    HostPathType:  Directory
  var-lib-tuned-profiles-data:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      tuned-profiles
    Optional:  true
  kube-api-access-plcsk:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:       Burstable
Node-Selectors:  kubernetes.io/os=linux
Tolerations:     op=Exists
Events:
  Type     Reason   Age                      From     Message
  ----     ------   ----                     ----     -------
  Normal   Pulled   44m (x475 over 40h)      kubelet  Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb5ac9e611fd6c0465d0f091c78fc648d9c570b303faf9e0032d3018c722a7e9" already present on machine
  Warning  BackOff  4m20s (x11271 over 40h)  kubelet  Back-off restarting failed container tuned in pod tuned-ldt8m_openshift-cluster-node-tuning-operator(27620d08-406d-4d1a-86e6-4d229a49da80)

# oc logs tuned-ldt8m -n openshift-cluster-node-tuning-operator
exec /var/lib/tuned/bin/run: exec format error

# podman manifest inspect quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb5ac9e611fd6c0465d0f091c78fc648d9c570b303faf9e0032d3018c722a7e9
WARN[0001] The manifest type application/vnd.docker.distribution.manifest.v2+json is not a manifest list but a single image.
{
    "schemaVersion": 2,
    "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
    "config": {
        "mediaType": "application/vnd.docker.container.image.v1+json",
        "size": 24880,
        "digest": "sha256:da83d2f46179b9993c068f8bdc9dfd998b5535b298beeb0061a7d7f3c9ae47c3"
    },
    "layers": [
        {
            "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
            "size": 78172747,
            "digest": "sha256:ca1636478fe5b8e2a56600e24d6759147feb15020824334f4a798c1cb6ed58e2"
        },
        {
            "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
            "size": 27523684,
            "digest": "sha256:7a1cc7533110153469f5d33f52378e003a43c555d584cd3590930c4c8b77d3d2"
        },
        {
            "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
            "size": 9184475,
            "digest": "sha256:c8ba2360cffa7b96adfac8bac675c32de36714b5fc227d3f81e044ed3d68da89"
        },
        {
            "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
            "size": 123811857,
            "digest": "sha256:5740df6957787c6afd1c8ca483306f10383b1a1de704c9c3f47f65a733251cf7"
        }
    ]
}
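The podman warning above comes down to the top-level `mediaType` of the manifest. As a rough sketch of the same distinction, the snippet below classifies manifest JSON as a multi-arch list or a single image. It assumes the top-level `mediaType` field is present (older OCI indexes may omit it, which this sketch would report as not a list); the function name `is_manifest_list` is hypothetical.

```python
import json

# Media types that denote a multi-arch manifest list or OCI image index.
LIST_MEDIA_TYPES = {
    "application/vnd.docker.distribution.manifest.list.v2+json",
    "application/vnd.oci.image.index.v1+json",
}


def is_manifest_list(manifest_json: str) -> bool:
    """Return True if the manifest's mediaType marks it as a manifest list.

    Hypothetical helper for illustration only; a missing mediaType is
    treated as "not a list".
    """
    doc = json.loads(manifest_json)
    return doc.get("mediaType") in LIST_MEDIA_TYPES
```

Applied to the JSON printed by `podman manifest inspect` in this report, whose `mediaType` is `application/vnd.docker.distribution.manifest.v2+json`, the function returns False, which matches the single-image diagnosis.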
Must gather logs: https://drive.google.com/file/d/1qk27uwhfDwW2hipn-6EklzWFxHkUoHQm/view?usp=sharing
- clones
-
OCPBUGS-18480 tuned pods continue to have single arch image post --to-multi-arch migration
- Closed
- is blocked by
-
OCPBUGS-18480 tuned pods continue to have single arch image post --to-multi-arch migration
- Closed
- links to
-
RHBA-2023:5382 OpenShift Container Platform 4.13.z bug fix update