Resolution: Duplicate
Description of problem:
Test profile: 07_aarch64_UPI on Baremetal-packet with OVN. After upgrading from 4.11.8-aarch64 to 4.12.0-0.nightly-arm64-2022-10-10-023446, monitoring is degraded. The cluster has 3 masters and 2 workers.
10-12 08:02:06.793 oc get nodes:
10-12 08:02:06.793 NAME                                                 STATUS   ROLES    AGE   VERSION
10-12 08:02:06.793 master-00.newugd-24256.qe.devcluster.openshift.com   Ready    master   45m   v1.24.0+dc5a2fd
10-12 08:02:06.793 master-01.newugd-24256.qe.devcluster.openshift.com   Ready    master   47m   v1.24.0+dc5a2fd
10-12 08:02:06.793 master-02.newugd-24256.qe.devcluster.openshift.com   Ready    master   47m   v1.24.0+dc5a2fd
10-12 08:02:06.793 worker-00.newugd-24256.qe.devcluster.openshift.com   Ready    worker   30m   v1.24.0+dc5a2fd
10-12 08:02:06.793 worker-01.newugd-24256.qe.devcluster.openshift.com   Ready    worker   30m   v1.24.0+dc5a2fd
- lastTransitionTime: "2022-10-12T02:41:52Z"
  message: 'reconciling Prometheus Operator Admission Webhook Deployment failed: updating Deployment object failed: waiting for DeploymentRollout of openshift-monitoring/prometheus-operator-admission-webhook: the number of pods targeted by the deployment (3 pods) is different from the number of pods targeted by the deployment that have the desired template spec (2 pods)'
  reason: UpdatingPrometheusOperatorFailed
  status: "True"
  type: Degraded
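The numbers in the Degraded message can be read as a comparison of two Deployment status counters. The following is a minimal sketch of that comparison, assuming the operator compares `status.replicas` against `status.updatedReplicas`; that field mapping is inferred from the message wording, and `DeploymentStatus` and `rollout_error` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeploymentStatus:
    # Assumed mapping: pods targeted by the deployment (status.replicas).
    replicas: int
    # Assumed mapping: pods that have the desired template spec
    # (status.updatedReplicas).
    updated_replicas: int


def rollout_error(status: DeploymentStatus) -> Optional[str]:
    """Return a rollout error string while old and new pods coexist."""
    if status.updated_replicas != status.replicas:
        return (
            "the number of pods targeted by the deployment "
            f"({status.replicas} pods) is different from the number of pods "
            "targeted by the deployment that have the desired template spec "
            f"({status.updated_replicas} pods)"
        )
    return None


# One pod from the old ReplicaSet plus two from the new one, as in this report.
stuck = rollout_error(DeploymentStatus(replicas=3, updated_replicas=2))
# A finished rollout where every pod uses the new template.
done = rollout_error(DeploymentStatus(replicas=2, updated_replicas=2))
print(stuck)
print(done)
```

Under this reading, the condition stays Degraded for as long as the old pod cannot be retired, which here depends on the new pods becoming ready.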
From the must-gather file, there are 3 prometheus-operator-admission-webhook pods:
prometheus-operator-admission-webhook-7df64f454f-bfmtl
prometheus-operator-admission-webhook-64cb6b847-4fg6m
prometheus-operator-admission-webhook-64cb6b847-45kz4
prometheus-operator-admission-webhook-7df64f454f-bfmtl is running on node worker-01.newugd-24256.qe.devcluster.openshift.com; this pod would be replaced later.
prometheus-operator-admission-webhook-64cb6b847-4fg6m is on node worker-00.newugd-24256.qe.devcluster.openshift.com and is in CreateContainerError. The error is rarely seen and is caused by runc:
- image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d37d4ba4ef834ddfc639301253c2ce593d5fe2806adb1abe52f822f3601fb31c
  imageID: ""
  lastState: {}
  name: prometheus-operator-admission-webhook
  ready: false
  restartCount: 0
  started: false
  state:
    waiting:
      message: |
        container create failed: time="2022-10-12T03:20:47Z" level=error msg="runc create failed: unable to start container process: unable to init seccomp: error loading seccomp filter into kernel: error loading seccomp filter: errno 524"
      reason: CreateContainerError
hostIP:
phase: Pending
podIP:
podIPs:
- ip:
qosClass: Burstable
startTime: "2022-10-12T02:31:57Z"
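For triage, the errno in the runc message can be extracted and named. The sketch below is only an illustration: `extract_errno` and `describe` are hypothetical helpers, and the lookup table holds the single value relevant here. In the Linux kernel headers, 524 is `ENOTSUPP`, a kernel-internal code that is not exported to userspace `errno.h`, which is why Python's `errno` module does not list it.

```python
import re
from typing import Optional

# Kernel-internal errno values that userspace errno tables do not name.
# Only the value seen in this report is listed.
KERNEL_INTERNAL_ERRNOS = {524: "ENOTSUPP"}


def extract_errno(message: str) -> Optional[int]:
    """Pull the numeric errno out of a runc/seccomp error message."""
    match = re.search(r"errno (\d+)", message)
    return int(match.group(1)) if match else None


def describe(message: str) -> str:
    """Return a short label such as 'errno 524 (ENOTSUPP)'."""
    code = extract_errno(message)
    if code is None:
        return "no errno found"
    name = KERNEL_INTERNAL_ERRNOS.get(code, "unknown")
    return f"errno {code} ({name})"


msg = (
    "runc create failed: unable to start container process: "
    "unable to init seccomp: error loading seccomp filter into kernel: "
    "error loading seccomp filter: errno 524"
)
label = describe(msg)
print(label)
```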
prometheus-operator-admission-webhook-64cb6b847-45kz4 is Pending due to the podAntiAffinity rule, which is expected:
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-10-12T02:31:52Z"
    message: '0/5 nodes are available: 2 node(s) didn''t match pod anti-affinity rules, 3 node(s) had untolerated taint {node-role.kubernetes.io/master: }. preemption: 0/5 nodes are available: 2 node(s) didn''t match pod anti-affinity rules, 3 Preemption is not helpful for scheduling.'
    reason: Unschedulable
    status: "False"
    type: PodScheduled
  phase: Pending
  qosClass: Burstable
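The "0/5 nodes are available" arithmetic can be reproduced with a small model of the cluster. This is a simplified sketch, not the scheduler's logic: it assumes the anti-affinity rule forbids two webhook pods on the same node, and the dictionary keys and the `unschedulable_reasons` helper are hypothetical.

```python
def unschedulable_reasons(nodes):
    """Map each node name to the reason the pending pod cannot land on it."""
    reasons = {}
    for node in nodes:
        if node["master_taint"]:
            reasons[node["name"]] = "untolerated taint"
        elif node["has_webhook_pod"]:
            # Assumed rule: at most one webhook pod per node.
            reasons[node["name"]] = "anti-affinity"
    return reasons


# Three tainted masters, plus two workers that each already host a webhook
# pod (the running old pod and the new pod stuck in CreateContainerError).
nodes = [
    {"name": "master-00", "master_taint": True, "has_webhook_pod": False},
    {"name": "master-01", "master_taint": True, "has_webhook_pod": False},
    {"name": "master-02", "master_taint": True, "has_webhook_pod": False},
    {"name": "worker-00", "master_taint": False, "has_webhook_pod": True},
    {"name": "worker-01", "master_taint": False, "has_webhook_pod": True},
]

reasons = unschedulable_reasons(nodes)
free = [n["name"] for n in nodes if n["name"] not in reasons]
print(f"{len(free)}/{len(nodes)} nodes are available")
```

With only two workers, the third pod has nowhere to go until one of the existing webhook pods is removed, which matches the scheduler message.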
Version-Release number of selected component (if applicable):
4.11.8-aarch64 upgrade to 4.12.0-0.nightly-arm64-2022-10-10-023446
How reproducible:
Not always
Steps to Reproduce:
1. Upgrade from 4.11.8-aarch64 to 4.12.0-0.nightly-arm64-2022-10-10-023446.
Actual results:
After upgrading from 4.11.8-aarch64 to 4.12.0-0.nightly-arm64-2022-10-10-023446, monitoring is degraded.
Expected results:
The upgrade completes without errors.
Additional info:
must-gather file: https://drive.google.com/file/d/1yS6s74M3t2zOKpssiBTAJ7yFzV4cSic6/view?usp=sharing
- duplicates
OCPBUGS-2637 [ARM64][4.11.0+] Containers are stuck in CreateError with 'error loading seccomp filter: errno 524'
- Closed
- is blocked by
RUN-1668 Impact: 4.11 upgrade to 4.12, prometheus-operator-admission-webhook pod is failed to start up due to "error loading seccomp filter into kernel: error loading seccomp filter: errno 524"
- Closed
- relates to
OCPBUGS-708 UpdatingKubeStateMetricsFailed before Upgrade
- Closed
OCPBUGS-1882 runc create failed: unable to start container process: unable to init seccomp: error loading seccomp filter into kernel: error loading seccomp filter: errno 524
- Closed