Bug
Resolution: Done-Errata
4.14.0
Quality / Stability / Reliability
Description of problem:
The test creates a cluster on amd64 with a single-arch payload, migrates it to the multi-architecture (manifest-listed) payload, and then adds ppc64le nodes to the cluster. After the migration, the tuned pods in openshift-cluster-node-tuning-operator continue to use the single-arch image. On the added ppc64le nodes, the tuned pods crash because that image is amd64-only.
Version-Release number of selected component (if applicable):
# oc get clusterversion
NAME      VERSION       AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.14.0-ec.4   True        False         4d      Error while reconciling 4.14.0-ec.4: the cluster operator image-registry is degraded
Steps to Reproduce:
1. Install OpenShift 4.14.0-ec.4 single arch (amd64) [must be a stable release]
2. Migrate to the multi-arch payload using "oc adm upgrade channel candidate-4.14 && oc adm upgrade --to-multi-arch"
3. Once the cluster is upgraded, add ppc64le nodes to the cluster.
# oc get nodes
NAME                                            STATUS   ROLES                  AGE    VERSION
rdr-800-syd-vpc-k5b4m-master-0                  Ready    control-plane,master   4d2h   v1.27.3+4aaeaec
rdr-800-syd-vpc-k5b4m-master-1                  Ready    control-plane,master   4d2h   v1.27.3+4aaeaec
rdr-800-syd-vpc-k5b4m-master-2                  Ready    control-plane,master   4d2h   v1.27.3+4aaeaec
rdr-800-syd-vpc-k5b4m-worker-1-6mmh9            Ready    worker                 4d1h   v1.27.3+4aaeaec
rdr-800-syd-vpc-k5b4m-worker-1-kr9nk            Ready    worker                 4d1h   v1.27.3+4aaeaec
syd05-worker-0.test-ocp-33ab.example.internal   Ready    worker                 40h    v1.27.3+4aaeaec
syd05-worker-1.test-ocp-33ab.example.internal   Ready    worker                 40h    v1.27.3+4aaeaec
Actual results:
After the ppc64le nodes are added, the tuned pods on them crash because the tuned image contains only the amd64 build. All other pods on those nodes are created successfully.
# oc get po -A -owide | grep tuned | grep Crash
openshift-cluster-node-tuning-operator   tuned-ldt8m   0/1   CrashLoopBackOff   482 (26s ago)   40h   192.168.200.12   syd05-worker-1.test-ocp-33ab.example.internal   <none>   <none>
openshift-cluster-node-tuning-operator   tuned-mkwmx   0/1   CrashLoopBackOff   481 (4m ago)    40h   192.168.200.13   syd05-worker-0.test-ocp-33ab.example.internal   <none>   <none>
Expected results:
After the migration, the tuned pods should use a multi-arch (manifest-listed) image and run successfully on the ppc64le nodes.
Additional info:
# oc describe pod tuned-ldt8m -n openshift-cluster-node-tuning-operator
Name: tuned-ldt8m
Namespace: openshift-cluster-node-tuning-operator
Priority: 2000001000
Priority Class Name: system-node-critical
Service Account: tuned
Node: syd05-worker-1.test-ocp-33ab.example.internal/192.168.200.12
Start Time: Fri, 01 Sep 2023 09:15:58 -0700
Labels: controller-revision-hash=df55cd6f9
openshift-app=tuned
pod-template-generation=1
Annotations: openshift.io/scc: privileged
Status: Running
IP: 192.168.200.12
IPs:
IP: 192.168.200.12
Controlled By: DaemonSet/tuned
Containers:
tuned:
Container ID: cri-o://6853091d5272af7d9d4a0a9f207282a46c93a4f9dc536a99c344f27dc8c932d3
Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb5ac9e611fd6c0465d0f091c78fc648d9c570b303faf9e0032d3018c722a7e9
Image ID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb5ac9e611fd6c0465d0f091c78fc648d9c570b303faf9e0032d3018c722a7e9
Port: <none>
Host Port: <none>
Command:
/var/lib/tuned/bin/run
start
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Message: exec /var/lib/tuned/bin/run: exec format error
Exit Code: 1
Started: Sun, 03 Sep 2023 02:02:03 -0700
Finished: Sun, 03 Sep 2023 02:02:03 -0700
Ready: False
Restart Count: 483
Requests:
cpu: 10m
memory: 50Mi
Environment:
WATCH_NAMESPACE: openshift-cluster-node-tuning-operator (v1:metadata.namespace)
OCP_NODE_NAME: (v1:spec.nodeName)
RESYNC_PERIOD: 600
RELEASE_VERSION: 4.14.0-ec.4
Mounts:
/etc/modprobe.d from etc-modprobe-d (rw)
/etc/sysconfig from etc-sysconfig (rw)
/etc/sysctl.conf from etc-sysctl-conf (ro)
/etc/sysctl.d from etc-sysctl-d (ro)
/etc/systemd from etc-systemd (rw)
/host from host (rw)
/lib/modules from lib-modules (ro)
/run from run (rw)
/sys from sys (rw)
/var/lib/tuned/profiles-data from var-lib-tuned-profiles-data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-plcsk (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
etc-modprobe-d:
Type: HostPath (bare host directory volume)
Path: /etc/modprobe.d
HostPathType: Directory
etc-sysconfig:
Type: HostPath (bare host directory volume)
Path: /etc/sysconfig
HostPathType: Directory
etc-sysctl-d:
Type: HostPath (bare host directory volume)
Path: /etc/sysctl.d
HostPathType: Directory
etc-sysctl-conf:
Type: HostPath (bare host directory volume)
Path: /etc/sysctl.conf
HostPathType: File
etc-systemd:
Type: HostPath (bare host directory volume)
Path: /etc/systemd
HostPathType: Directory
run:
Type: HostPath (bare host directory volume)
Path: /run
HostPathType: Directory
sys:
Type: HostPath (bare host directory volume)
Path: /sys
HostPathType: Directory
lib-modules:
Type: HostPath (bare host directory volume)
Path: /lib/modules
HostPathType: Directory
host:
Type: HostPath (bare host directory volume)
Path: /
HostPathType: Directory
var-lib-tuned-profiles-data:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: tuned-profiles
Optional: true
kube-api-access-plcsk:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
ConfigMapName: openshift-service-ca.crt
ConfigMapOptional: <nil>
QoS Class: Burstable
Node-Selectors: kubernetes.io/os=linux
Tolerations: op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Pulled 44m (x475 over 40h) kubelet Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb5ac9e611fd6c0465d0f091c78fc648d9c570b303faf9e0032d3018c722a7e9" already present on machine
Warning BackOff 4m20s (x11271 over 40h) kubelet Back-off restarting failed container tuned in pod tuned-ldt8m_openshift-cluster-node-tuning-operator(27620d08-406d-4d1a-86e6-4d229a49da80)
# oc logs tuned-ldt8m -n openshift-cluster-node-tuning-operator
exec /var/lib/tuned/bin/run: exec format error
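An "exec format error" is what the kernel reports when it cannot execute a binary in the given format, which is consistent with an amd64 binary being started on a ppc64le node. As a rough way to confirm which architecture a binary was built for, the sketch below reads the ELF header fields directly. The function name `elf_machine` and the small lookup table are illustrative assumptions, not part of any OpenShift tooling; the offsets and `e_machine` values follow the ELF specification.

```python
import struct

# A few e_machine values from the ELF specification (not exhaustive).
ELF_MACHINES = {21: "ppc64", 22: "s390", 62: "x86_64", 183: "aarch64"}


def elf_machine(header: bytes) -> tuple[str, str]:
    """Return (machine, byte order) from the first 20 bytes of an ELF file.

    Hypothetical helper for illustration; ppc64le shows up as
    ("ppc64", "little").
    """
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5): 1 = little-endian, 2 = big-endian.
    if header[5] == 1:
        fmt, order = "<H", "little"
    elif header[5] == 2:
        fmt, order = ">H", "big"
    else:
        raise ValueError("unknown ELF data encoding")
    # e_machine is a 16-bit field at offset 18, after e_ident and e_type.
    (machine,) = struct.unpack_from(fmt, header, 18)
    return ELF_MACHINES.get(machine, f"unknown ({machine})"), order
```

Usage, assuming the binary has been copied out of the image to a local path: `elf_machine(open(path, "rb").read(20))`. A result of `("x86_64", "little")` on a ppc64le node would match the failure seen here.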
# podman manifest inspect quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:cb5ac9e611fd6c0465d0f091c78fc648d9c570b303faf9e0032d3018c722a7e9
WARN[0001] The manifest type application/vnd.docker.distribution.manifest.v2+json is not a manifest list but a single image.
{
"schemaVersion": 2,
"mediaType": "application/vnd.docker.distribution.manifest.v2+json",
"config": {
"mediaType": "application/vnd.docker.container.image.v1+json",
"size": 24880,
"digest": "sha256:da83d2f46179b9993c068f8bdc9dfd998b5535b298beeb0061a7d7f3c9ae47c3"
},
"layers": [
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 78172747,
"digest": "sha256:ca1636478fe5b8e2a56600e24d6759147feb15020824334f4a798c1cb6ed58e2"
},
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 27523684,
"digest": "sha256:7a1cc7533110153469f5d33f52378e003a43c555d584cd3590930c4c8b77d3d2"
},
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 9184475,
"digest": "sha256:c8ba2360cffa7b96adfac8bac675c32de36714b5fc227d3f81e044ed3d68da89"
},
{
"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
"size": 123811857,
"digest": "sha256:5740df6957787c6afd1c8ca483306f10383b1a1de704c9c3f47f65a733251cf7"
}
]
}
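The inspect output above has `mediaType` set to a plain v2 image manifest, with `config` and `layers` but no `manifests` array, which is why podman warns that it is a single image. A minimal sketch of the same check in Python follows, for use on saved `podman manifest inspect` output. The function name `manifest_architectures` and the sample documents in the usage note are assumptions made for illustration; the two manifest-list media types are the Docker and OCI ones.

```python
import json

# Media types that denote a multi-arch manifest list / image index.
MANIFEST_LIST_TYPES = {
    "application/vnd.docker.distribution.manifest.list.v2+json",
    "application/vnd.oci.image.index.v1+json",
}


def manifest_architectures(manifest_json: str):
    """Return the sorted architectures of a manifest list, or None.

    None means the document describes a single image (no manifest list).
    Hypothetical helper for illustration only.
    """
    doc = json.loads(manifest_json)
    is_list = doc.get("mediaType") in MANIFEST_LIST_TYPES or "manifests" in doc
    if not is_list:
        return None
    # Each entry in "manifests" carries a "platform" object with the
    # architecture it was built for.
    return sorted(
        {
            entry.get("platform", {}).get("architecture", "unknown")
            for entry in doc.get("manifests", [])
        }
    )
```

Fed the JSON shown above, this returns `None`. For a manifest-listed image it would return the list of architectures, and a ppc64le node needs `ppc64le` to be among them.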
Must gather logs: https://drive.google.com/file/d/1qk27uwhfDwW2hipn-6EklzWFxHkUoHQm/view?usp=sharing
- blocks: OCPBUGS-18578 tuned pods continue to have single arch image post --to-multi-arch migration (Closed)
- is cloned by: OCPBUGS-18578 tuned pods continue to have single arch image post --to-multi-arch migration (Closed)
- links to: RHSA-2023:5006 OpenShift Container Platform 4.14.z security update