-
Bug
-
Resolution: Duplicate
-
Major
-
None
-
4.14
-
Important
-
No
-
False
-
Description of problem:
An attempt to upgrade the cluster from OCP 4.13.23 to OCP 4.14.10 failed. I waited more than 120 minutes before raising this bug; multus-* pods are going into CrashLoopBackOff.
Here's the OCP Config of the said cluster:
Master Nodes: Standard_D32s_v5 x 3
Infra Nodes: Standard_E16s_v3 x 3
Worker Nodes: Standard_D8s_v5 x 252
Version-Release number of selected component (if applicable):
From OCP Version: OCP 4.13.23
To OCP Version: OCP 4.14.10 [channel: candidate-4.14]
Steps to Reproduce:
1. kube-burner ocp cluster-density-v2 --gc=false --iterations=2268 --churn=false
2. oc adm upgrade channel candidate-4.14
3. oc adm upgrade --to=4.14.10
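The upgrade step can be wrapped in a small polling helper so the stall is caught programmatically instead of by eyeballing `oc get clusterversion`. A minimal sketch, assuming `oc` is logged in to the cluster; `start_upgrade` and `wait_for_upgrade` are hypothetical helper names, not part of `oc`:

```shell
#!/usr/bin/env bash
# Sketch: kick off the upgrade, then poll the ClusterVersion Progressing
# condition once a minute instead of watching by hand.
set -euo pipefail

start_upgrade() {
  oc adm upgrade channel candidate-4.14
  oc adm upgrade --to=4.14.10
}

# Poll the Progressing condition; give up after $1 minutes.
wait_for_upgrade() {
  local timeout_min=$1 elapsed=0 progressing
  while (( elapsed < timeout_min )); do
    progressing=$(oc get clusterversion version \
      -o jsonpath='{.status.conditions[?(@.type=="Progressing")].status}')
    if [[ "$progressing" == "False" ]]; then
      echo "upgrade settled after ${elapsed}m"
      return 0
    fi
    sleep 60
    (( elapsed += 1 ))
  done
  echo "upgrade still progressing after ${timeout_min}m" >&2
  return 1
}
```

Usage would be e.g. `start_upgrade && wait_for_upgrade 180`, giving a clean timestamp for when the rollout stopped making progress.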
Actual results:
$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.13.23   True        True          107m    Unable to apply 4.14.10: wait has exceeded 40 minutes for these operators: network
$
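The status above points at the network operator; with ~30 operators in the `oc get co` output (under Additional info), it helps to filter down to the ones blocking the upgrade. A minimal sketch, assuming the default `oc get co` column layout; `filter_stuck_operators` is a made-up helper name:

```shell
# Sketch: print operators that are still on the old version or that report
# DEGRADED=True. Assumes the default `oc get co` column layout:
#   NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
filter_stuck_operators() {
  local old_version=$1
  awk -v old="$old_version" \
    'NR > 1 && ($2 == old || $5 == "True") { print $1, $2, "degraded=" $5 }'
}
```

Against the output below, `oc get co | filter_stuck_operators 4.13.23` would surface dns, machine-config and network as the operators still on 4.13.23.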
Expected results:
OCP Cluster should have upgraded to 4.14.10
Additional info:
$ oc get clusterversion
NAME      VERSION   AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.13.23   True        True          107m    Unable to apply 4.14.10: wait has exceeded 40 minutes for these operators: network
$
========= ========= ========= ========= ========= ========= =========
$ oc get co
NAME                                       VERSION        AVAILABLE   PROGRESSING   DEGRADED   SINCE   MESSAGE
aro                                        v20231214.00   True        False         False      6h27m
authentication                             4.14.10        True        False         False      19m
cloud-controller-manager                   4.14.10        True        False         False      6h46m
cloud-credential                           4.14.10        True        False         False      6h46m
cluster-autoscaler                         4.14.10        True        False         False      6h43m
config-operator                            4.14.10        True        False         False      6h44m
console                                    4.14.10        False       False         False      16m     RouteHealthAvailable: failed to GET route (https://console-openshift-console.apps.ujio6jgc.eastus.aroapp.io): Get "https://console-openshift-console.apps.ujio6jgc.eastus.aroapp.io": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
control-plane-machine-set                  4.14.10        True        False         False      6h13m
csi-snapshot-controller                    4.14.10        True        False         False      6h44m
dns                                        4.13.23        True        False         False      6h43m
etcd                                       4.14.10        True        False         False      6h42m
image-registry                             4.14.10        True        False         False      6h34m
ingress                                    4.14.10        True        False         False      6h37m
insights                                   4.14.10        True        False         False      6h37m
kube-apiserver                             4.14.10        True        False         False      6h40m
kube-controller-manager                    4.14.10        True        False         False      6h41m
kube-scheduler                             4.14.10        True        False         False      6h40m
kube-storage-version-migrator              4.14.10        True        False         False      6h16m
machine-api                                4.14.10        True        False         False      6h38m
machine-approver                           4.14.10        True        False         False      6h43m
machine-config                             4.13.23        True        False         False      5h55m
marketplace                                4.14.10        True        False         False      6h44m
monitoring                                 4.14.10        True        False         False      6h34m
network                                    4.13.23        True        True          True       6h46m   DaemonSet "/openshift-ovn-kubernetes/ovnkube-node" rollout is not making progress - pod ovnkube-node-fqbj2 is in CrashLoopBackOff State...
node-tuning                                4.14.10        True        False         False      98m
openshift-apiserver                        4.14.10        True        False         False      6h34m
openshift-controller-manager               4.14.10        True        False         False      6h40m
openshift-samples                          4.14.10        True        False         False      99m
operator-lifecycle-manager                 4.14.10        True        False         False      6h43m
operator-lifecycle-manager-catalog         4.14.10        True        False         False      6h44m
operator-lifecycle-manager-packageserver   4.14.10        True        False         False      6h37m
service-ca                                 4.14.10        True        False         False      6h44m
storage                                    4.14.10        True        False         False      6h41m
$
========= ========= ========= ========= ========= ========= =========
$ oc describe po ovnkube-node-fqbj2 -n openshift-ovn-kubernetes
Name: ovnkube-node-fqbj2 Namespace: openshift-ovn-kubernetes Priority: 2000001000 Priority Class Name: system-node-critical Service Account: ovn-kubernetes-node Node: krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24/10.0.2.9 Start Time: Sat, 20 Jan 2024 18:10:44 +0530 Labels: app=ovnkube-node component=network controller-revision-hash=75b68c94cc kubernetes.io/os=linux openshift.io/component=network ovn-db-pod=true pod-template-generation=3 type=infra Annotations: network.operator.openshift.io/ovnkube-script-lib-hash: 6be38e6ebb2bcceb1014eb6b02513a9df1d4e90e networkoperator.openshift.io/cluster-network-cidr: 10.128.0.0/14 networkoperator.openshift.io/hybrid-overlay-status: disabled networkoperator.openshift.io/ip-family-mode: single-stack Status: Running IP: 10.0.2.9 IPs: IP: 10.0.2.9 Controlled By: DaemonSet/ovnkube-node Init Containers: kubecfg-setup: Container ID: cri-o://c921ed05a0380e25202f531648682f1f152bc5d72decd7e5052d562da2198a39 Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a Image ID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a Port: <none> Host Port: <none> Command: /bin/bash -c cat << EOF > /etc/ovn/kubeconfig apiVersion: v1 clusters: - cluster: certificate-authority: 
/var/run/secrets/kubernetes.io/serviceaccount/ca.crt server: https://api-int.ujio6jgc.eastus.aroapp.io:6443 name: default-cluster contexts: - context: cluster: default-cluster namespace: default user: default-auth name: default-context current-context: default-context kind: Config preferences: {} users: - name: default-auth user: client-certificate: /etc/ovn/ovnkube-node-certs/ovnkube-client-current.pem client-key: /etc/ovn/ovnkube-node-certs/ovnkube-client-current.pem EOF State: Terminated Reason: Completed Exit Code: 0 Started: Sat, 20 Jan 2024 18:10:44 +0530 Finished: Sat, 20 Jan 2024 18:10:44 +0530 Ready: True Restart Count: 0 Environment: <none> Mounts: /etc/ovn/ from etc-openvswitch (rw) /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-btt99 (ro) Containers: ovn-controller: Container ID: cri-o://e12b1677b8004ff340fbad02c0894c9e73b0a1646a1d92a7f2aa864f650c8fe8 Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a Image ID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a Port: <none> Host Port: <none> Command: /bin/bash -c set -e . 
/ovnkube-lib/ovnkube-lib.sh || exit 1 start-ovn-controller ${OVN_LOG_LEVEL} State: Running Started: Sat, 20 Jan 2024 18:10:45 +0530 Ready: True Restart Count: 0 Requests: cpu: 10m memory: 300Mi Environment: OVN_LOG_LEVEL: info K8S_NODE: (v1:spec.nodeName) Mounts: /dev/log from log-socket (rw) /env from env-overrides (rw) /etc/openvswitch from etc-openvswitch (rw) /etc/ovn/ from etc-openvswitch (rw) /ovnkube-lib from ovnkube-script-lib (rw) /run/openvswitch from run-openvswitch (rw) /run/ovn/ from run-ovn (rw) /var/lib/openvswitch from var-lib-openvswitch (rw) /var/log/ovn/ from node-log (rw) /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-btt99 (ro) ovn-acl-logging: Container ID: cri-o://dcd4c74701601fe371ffb14c14979b84d40474016d4a18fbeef71f1fca9b2028 Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a Image ID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a Port: <none> Host Port: <none> Command: /bin/bash -c set -euo pipefail . /ovnkube-lib/ovnkube-lib.sh || exit 1 start-audit-log-rotation State: Running Started: Sat, 20 Jan 2024 18:10:45 +0530 Ready: True Restart Count: 0 Requests: cpu: 10m memory: 20Mi Environment: <none> Mounts: /ovnkube-lib from ovnkube-script-lib (rw) /run/ovn/ from run-ovn (rw) /var/log/ovn/ from node-log (rw) /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-btt99 (ro) kube-rbac-proxy-node: Container ID: cri-o://39d73c22ee520d96165c3a8041b782a8b74caec9eb720240f4700d9ba4c69f7d Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a7df0e3d16ea1f9ec17132a177be67b46d709368adf409e8f1f9b28fdc9aac40 Image ID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a7df0e3d16ea1f9ec17132a177be67b46d709368adf409e8f1f9b28fdc9aac40 Port: 9103/TCP Host Port: 9103/TCP Command: /bin/bash -c #!/bin/bash set -euo pipefail . 
/ovnkube-lib/ovnkube-lib.sh || exit 1 start-rbac-proxy-node ovn-node-metrics 9103 29103 /etc/pki/tls/metrics-cert/tls.key /etc/pki/tls/metrics-cert/tls.crt State: Running Started: Sat, 20 Jan 2024 18:10:45 +0530 Ready: True Restart Count: 0 Requests: cpu: 10m memory: 20Mi Environment: <none> Mounts: /etc/pki/tls/metrics-cert from ovn-node-metrics-cert (ro) /ovnkube-lib from ovnkube-script-lib (rw) /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-btt99 (ro) kube-rbac-proxy-ovn-metrics: Container ID: cri-o://619d0a55a1830d54f37a595a0b704bf0df8feb0cc0f8a01ae45ad8028d0215b4 Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a7df0e3d16ea1f9ec17132a177be67b46d709368adf409e8f1f9b28fdc9aac40 Image ID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a7df0e3d16ea1f9ec17132a177be67b46d709368adf409e8f1f9b28fdc9aac40 Port: 9105/TCP Host Port: 9105/TCP Command: /bin/bash -c #!/bin/bash set -euo pipefail . /ovnkube-lib/ovnkube-lib.sh || exit 1 start-rbac-proxy-node ovn-metrics 9105 29105 /etc/pki/tls/metrics-cert/tls.key /etc/pki/tls/metrics-cert/tls.crt State: Running Started: Sat, 20 Jan 2024 18:10:45 +0530 Ready: True Restart Count: 0 Requests: cpu: 10m memory: 20Mi Environment: <none> Mounts: /etc/pki/tls/metrics-cert from ovn-node-metrics-cert (ro) /ovnkube-lib from ovnkube-script-lib (rw) /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-btt99 (ro) northd: Container ID: cri-o://2f802d29cce2792f0106e9ddbee6cc2e2fe4386d476f9d7b37d8dafd7604f8df Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a Image ID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a Port: <none> Host Port: <none> Command: /bin/bash -c set -xem if [[ -f /env/_master ]]; then set -o allexport source /env/_master set +o allexport fi . 
/ovnkube-lib/ovnkube-lib.sh || exit 1 trap quit-ovn-northd TERM INT start-ovn-northd "${OVN_LOG_LEVEL}" State: Running Started: Sat, 20 Jan 2024 18:10:45 +0530 Ready: True Restart Count: 0 Requests: cpu: 10m memory: 70Mi Environment: OVN_LOG_LEVEL: info Mounts: /env from env-overrides (rw) /etc/ovn from etc-openvswitch (rw) /ovnkube-lib from ovnkube-script-lib (rw) /run/ovn/ from run-ovn (rw) /var/log/ovn from node-log (rw) /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-btt99 (ro) nbdb: Container ID: cri-o://c6882acc3afe8a56219f427bd77b6a4441554f87a897cb5fd77e09453e695856 Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a Image ID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a Port: <none> Host Port: <none> Command: /bin/bash -c set -xem if [[ -f /env/_master ]]; then set -o allexport source /env/_master set +o allexport fi . /ovnkube-lib/ovnkube-lib.sh || exit 1 trap quit-nbdb TERM INT start-nbdb ${OVN_LOG_LEVEL} State: Running Started: Sat, 20 Jan 2024 18:10:45 +0530 Ready: True Restart Count: 0 Requests: cpu: 10m memory: 300Mi Readiness: exec [/bin/bash -c set -xeo pipefail . 
/ovnkube-lib/ovnkube-lib.sh || exit 1 ovndb-readiness-probe "nb" ] delay=10s timeout=5s period=10s #success=1 #failure=3 Environment: OVN_LOG_LEVEL: info K8S_NODE: (v1:spec.nodeName) Mounts: /env from env-overrides (rw) /etc/ovn/ from etc-openvswitch (rw) /ovnkube-lib from ovnkube-script-lib (rw) /run/ovn/ from run-ovn (rw) /var/log/ovn from node-log (rw) /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-btt99 (ro) sbdb: Container ID: cri-o://6444f8cc7591e48811da7681b6d792ea95a66bbddd034efeb2032e7c40697e63 Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a Image ID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a Port: <none> Host Port: <none> Command: /bin/bash -c set -xem if [[ -f /env/_master ]]; then set -o allexport source /env/_master set +o allexport fi . /ovnkube-lib/ovnkube-lib.sh || exit 1 trap quit-sbdb TERM INT start-sbdb ${OVN_LOG_LEVEL} State: Running Started: Sat, 20 Jan 2024 18:10:48 +0530 Ready: True Restart Count: 0 Requests: cpu: 10m memory: 300Mi Readiness: exec [/bin/bash -c set -xeo pipefail . 
/ovnkube-lib/ovnkube-lib.sh || exit 1 ovndb-readiness-probe "sb" ] delay=10s timeout=5s period=10s #success=1 #failure=3 Environment: OVN_LOG_LEVEL: info Mounts: /env from env-overrides (rw) /etc/ovn/ from etc-openvswitch (rw) /ovnkube-lib from ovnkube-script-lib (rw) /run/ovn/ from run-ovn (rw) /var/log/ovn from node-log (rw) /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-btt99 (ro) ovnkube-controller: Container ID: cri-o://2939dc4e5a3a9792c7b7f371fb1892a143ec1a1b9f38555966f1229e64e0c11a Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a Image ID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a Port: 29105/TCP Host Port: 29105/TCP Command: /bin/bash -c set -xe . /ovnkube-lib/ovnkube-lib.sh || exit 1 start-ovnkube-node ${OVN_KUBE_LOG_LEVEL} 29103 29105 State: Running Started: Sat, 20 Jan 2024 18:57:40 +0530 Last State: Terminated Reason: Error Message: :555] Update event received for resource *factory.localPodSelector, old object is equal to new: false I0120 13:22:48.814174 393590 obj_retry.go:607] Update event received for *factory.localPodSelector cluster-density-v2-2086/client-2-5dfb65fdf9-294hm I0120 13:22:48.853487 393590 obj_retry.go:555] Update event received for resource *v1.Pod, old object is equal to new: false I0120 13:22:48.853510 393590 default_network_controller.go:650] Recording update event on pod cluster-density-v2-105/client-2-864d8f57d7-ktcsm I0120 13:22:48.853525 393590 obj_retry.go:607] Update event received for *v1.Pod cluster-density-v2-105/client-2-864d8f57d7-ktcsm I0120 13:22:48.853536 393590 ovn.go:132] Ensuring zone remote for Pod cluster-density-v2-105/client-2-864d8f57d7-ktcsm in node krishvoor-v5-252-jdfwd-worker-eastus3-prg4q I0120 13:22:48.853543 393590 default_network_controller.go:679] Recording success event on pod cluster-density-v2-105/client-2-864d8f57d7-ktcsm 
I0120 13:22:48.853551 393590 obj_retry.go:555] Update event received for resource *factory.egressIPPod, old object is equal to new: false I0120 13:22:48.853558 393590 obj_retry.go:607] Update event received for *factory.egressIPPod cluster-density-v2-105/c Exit Code: 1 Started: Sat, 20 Jan 2024 18:47:39 +0530 Finished: Sat, 20 Jan 2024 18:52:48 +0530 Ready: False Restart Count: 7 Requests: cpu: 10m memory: 600Mi Readiness: exec [test -f /etc/cni/net.d/10-ovn-kubernetes.conf] delay=5s timeout=1s period=30s #success=1 #failure=3 Environment: KUBERNETES_SERVICE_PORT: 6443 KUBERNETES_SERVICE_HOST: api-int.ujio6jgc.eastus.aroapp.io OVN_CONTROLLER_INACTIVITY_PROBE: 180000 OVN_KUBE_LOG_LEVEL: 4 K8S_NODE: (v1:spec.nodeName) POD_NAME: ovnkube-node-fqbj2 (v1:metadata.name) Mounts: /cni-bin-dir from host-cni-bin (rw) /env from env-overrides (rw) /etc/cni/net.d from host-cni-netd (rw) /etc/openvswitch from etc-openvswitch (rw) /etc/ovn/ from etc-openvswitch (rw) /etc/systemd/system from systemd-units (ro) /host from host-slash (ro) /ovnkube-lib from ovnkube-script-lib (rw) /run/netns from host-run-netns (ro) /run/openvswitch from run-openvswitch (rw) /run/ovn-kubernetes/ from host-run-ovn-kubernetes (rw) /run/ovn/ from run-ovn (rw) /run/ovnkube-config/ from ovnkube-config (rw) /var/lib/cni/networks/ovn-k8s-cni-overlay from host-var-lib-cni-networks-ovn-kubernetes (rw) /var/lib/kubelet from host-kubelet (ro) /var/lib/openvswitch from var-lib-openvswitch (rw) /var/log/ovnkube/ from etc-openvswitch (rw) /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-btt99 (ro) drop-icmp: Container ID: cri-o://a3c0696059d9723c70331b88d62616d55ef56322a6f67e52b713c0bc409fb7a7 Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a Image ID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a Port: <none> Host Port: <none> Command: /bin/bash -c 
set -xe # Wait for cert file retries=0 tries=20 key_cert="/etc/ovn/ovnkube-node-certs/ovnkube-client-current.pem" while [ ! -f "${key_cert}" ]; do (( retries += 1 )) if [[ "${retries}" -gt ${tries} ]]; then echo "$(date -Iseconds) - ERROR - ${key_cert} not found" return 1 fi sleep 1 done export KUBECONFIG=/etc/ovn/kubeconfig touch /var/run/ovn/add_iptables.sh chmod 0755 /var/run/ovn/add_iptables.sh cat <<'EOF' > /var/run/ovn/add_iptables.sh #!/bin/sh if [ -z "$3" ] then echo "Called with host address missing, ignore" exit 0 fi echo "Adding ICMP drop rule for '$3' " if iptables -C CHECK_ICMP_SOURCE -p icmp -s $3 -j ICMP_ACTION then echo "iptables already set for $3" else iptables -A CHECK_ICMP_SOURCE -p icmp -s $3 -j ICMP_ACTION fi EOF echo "I$(date "+%m%d %H:%M:%S.%N") - drop-icmp - start drop-icmp ${K8S_NODE}" iptables -X CHECK_ICMP_SOURCE || true iptables -N CHECK_ICMP_SOURCE || true iptables -F CHECK_ICMP_SOURCE iptables -D INPUT -p icmp --icmp-type fragmentation-needed -j CHECK_ICMP_SOURCE || true iptables -I INPUT -p icmp --icmp-type fragmentation-needed -j CHECK_ICMP_SOURCE iptables -N ICMP_ACTION || true iptables -F ICMP_ACTION iptables -A ICMP_ACTION -j LOG iptables -A ICMP_ACTION -j DROP # ip addr show ip route show iptables -nvL iptables -nvL -t nat oc observe pods -n openshift-ovn-kubernetes --listen-addr='' -l app=ovnkube-node -a '{ .status.hostIP }' -- /var/run/ovn/add_iptables.sh #systemd-run -qPG -- oc observe pods -n openshift-ovn-kubernetes --listen-addr='' -l app=ovnkube-node -a '{ .status.hostIP }' -- /var/run/ovn/add_iptables.sh State: Running Started: Sat, 20 Jan 2024 18:10:50 +0530 Ready: True Restart Count: 0 Requests: cpu: 5m memory: 20Mi Environment: K8S_NODE: (v1:spec.nodeName) Mounts: /etc/ovn/ from etc-openvswitch (rw) /host from host-slash (ro) /run/ovn/ from run-ovn (rw) /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-btt99 (ro) Conditions: Type Status Initialized True Ready False ContainersReady False PodScheduled 
True Volumes: host-kubelet: Type: HostPath (bare host directory volume) Path: /var/lib/kubelet HostPathType: systemd-units: Type: HostPath (bare host directory volume) Path: /etc/systemd/system HostPathType: host-slash: Type: HostPath (bare host directory volume) Path: / HostPathType: host-run-netns: Type: HostPath (bare host directory volume) Path: /run/netns HostPathType: var-lib-openvswitch: Type: HostPath (bare host directory volume) Path: /var/lib/openvswitch/data HostPathType: etc-openvswitch: Type: HostPath (bare host directory volume) Path: /var/lib/ovn-ic/etc HostPathType: run-openvswitch: Type: HostPath (bare host directory volume) Path: /var/run/openvswitch HostPathType: run-ovn: Type: HostPath (bare host directory volume) Path: /var/run/ovn-ic HostPathType: node-log: Type: HostPath (bare host directory volume) Path: /var/log/ovn HostPathType: log-socket: Type: HostPath (bare host directory volume) Path: /dev/log HostPathType: host-run-ovn-kubernetes: Type: HostPath (bare host directory volume) Path: /run/ovn-kubernetes HostPathType: host-cni-bin: Type: HostPath (bare host directory volume) Path: /var/lib/cni/bin HostPathType: host-cni-netd: Type: HostPath (bare host directory volume) Path: /var/run/multus/cni/net.d HostPathType: host-var-lib-cni-networks-ovn-kubernetes: Type: HostPath (bare host directory volume) Path: /var/lib/cni/networks/ovn-k8s-cni-overlay HostPathType: ovnkube-config: Type: ConfigMap (a volume populated by a ConfigMap) Name: ovnkube-config Optional: false env-overrides: Type: ConfigMap (a volume populated by a ConfigMap) Name: env-overrides Optional: true ovn-node-metrics-cert: Type: Secret (a volume populated by a Secret) SecretName: ovn-node-metrics-cert Optional: true ovnkube-script-lib: Type: ConfigMap (a volume populated by a ConfigMap) Name: ovnkube-script-lib Optional: false kube-api-access-btt99: Type: Projected (a volume that contains injected data from multiple sources) TokenExpirationSeconds: 3607 ConfigMapName: 
kube-root-ca.crt ConfigMapOptional: <nil> DownwardAPI: true ConfigMapName: openshift-service-ca.crt ConfigMapOptional: <nil> QoS Class: Burstable Node-Selectors: beta.kubernetes.io/os=linux Tolerations: op=Exists Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 48m default-scheduler Successfully assigned openshift-ovn-kubernetes/ovnkube-node-fqbj2 to krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24 Normal Pulled 48m kubelet Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a" already present on machine Normal Created 48m kubelet Created container kubecfg-setup Normal Started 48m kubelet Started container kubecfg-setup Normal Pulled 48m kubelet Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a" already present on machine Normal Created 48m kubelet Created container ovn-controller Normal Started 48m kubelet Started container ovn-controller Normal Pulled 48m kubelet Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a" already present on machine Normal Created 48m kubelet Created container ovn-acl-logging Normal Started 48m kubelet Started container ovn-acl-logging Normal Pulled 48m kubelet Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a7df0e3d16ea1f9ec17132a177be67b46d709368adf409e8f1f9b28fdc9aac40" already present on machine Normal Created 48m kubelet Created container kube-rbac-proxy-node Normal Started 48m kubelet Started container kube-rbac-proxy-node Normal Pulled 48m kubelet Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:a7df0e3d16ea1f9ec17132a177be67b46d709368adf409e8f1f9b28fdc9aac40" already present on machine Normal Created 48m kubelet Created container kube-rbac-proxy-ovn-metrics Normal Started 48m kubelet Started 
container kube-rbac-proxy-ovn-metrics Normal Pulled 48m kubelet Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a" already present on machine Normal Created 48m kubelet Created container northd Normal Started 48m kubelet Started container northd Normal Pulled 48m kubelet Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a" already present on machine Normal Created 48m kubelet Created container nbdb Normal Started 48m kubelet Started container nbdb Normal Pulled 48m kubelet Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a" already present on machine Normal Created 48m kubelet Created container sbdb Normal Started 48m kubelet Started container sbdb Normal Pulled 48m kubelet Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:368149fc734294fe7c851246f91738ef4d652fc83c32e2477d4eb20f1f41643a" already present on machine Warning Unhealthy 8m34s (x94 over 48m) kubelet Readiness probe failed: Warning BackOff 3m24s (x41 over 38m) kubelet Back-off restarting failed container ovnkube-controller in pod ovnkube-node-fqbj2_openshift-ovn-kubernetes(d2183ec1-2c8c-4077-933c-4715d7029feb) $ ========= ========= ========= ========= ========= ========= ========= $ oc describe no krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24 Name: krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24 Roles: infra Labels: beta.kubernetes.io/arch=amd64 beta.kubernetes.io/instance-type=Standard_E16s_v3 beta.kubernetes.io/os=linux failure-domain.beta.kubernetes.io/region=eastus failure-domain.beta.kubernetes.io/zone=eastus-2 kubernetes.io/arch=amd64 kubernetes.io/hostname=krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24 kubernetes.io/os=linux node-role.kubernetes.io/infra= 
node.kubernetes.io/instance-type=Standard_E16s_v3 node.openshift.io/os_id=rhcos topology.disk.csi.azure.com/zone=eastus-2 topology.kubernetes.io/region=eastus topology.kubernetes.io/zone=eastus-2 Annotations: cloud.network.openshift.io/egress-ipconfig: [{"interface":"krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24-nic","ifaddr":{"ipv4":"10.0.2.0/23"},"capacity":{"ip":255}}] csi.volume.kubernetes.io/nodeid: {"disk.csi.azure.com":"krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24","file.csi.azure.com":"krishvoor-v5-252-jdfwd-infra-aro... k8s.ovn.org/host-addresses: ["10.0.2.9"] k8s.ovn.org/host-cidrs: ["10.0.2.9/23"] k8s.ovn.org/l3-gateway-config: {"default":{"mode":"shared","interface-id":"br-ex_krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24","mac-address":"00:0d:3a:99:... k8s.ovn.org/network-ids: {"default":"0"} k8s.ovn.org/node-chassis-id: 925d815f-e2f4-4b73-a9bc-cf454e20a949 k8s.ovn.org/node-gateway-router-lrp-ifaddr: {"ipv4":"100.64.0.73/16"} k8s.ovn.org/node-id: 73 k8s.ovn.org/node-mgmt-port-mac-address: 42:c5:42:23:ff:57 k8s.ovn.org/node-primary-ifaddr: {"ipv4":"10.0.2.9/23"} k8s.ovn.org/node-subnets: {"default":["10.128.4.0/23"]} k8s.ovn.org/node-transit-switch-port-ifaddr: {"ipv4":"100.88.0.73/16"} k8s.ovn.org/zone-name: krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24 machine.openshift.io/machine: openshift-machine-api/krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24 machineconfiguration.openshift.io/controlPlaneTopology: HighlyAvailable machineconfiguration.openshift.io/currentConfig: rendered-worker-f97e1c40ff3b5bf853d26745c99b8209 machineconfiguration.openshift.io/desiredConfig: rendered-worker-f97e1c40ff3b5bf853d26745c99b8209 machineconfiguration.openshift.io/desiredDrain: uncordon-rendered-worker-f97e1c40ff3b5bf853d26745c99b8209 machineconfiguration.openshift.io/lastAppliedDrain: uncordon-rendered-worker-f97e1c40ff3b5bf853d26745c99b8209 
machineconfiguration.openshift.io/lastSyncedControllerConfigResourceVersion: 30595 machineconfiguration.openshift.io/reason: machineconfiguration.openshift.io/state: Done volumes.kubernetes.io/controller-managed-attach-detach: true CreationTimestamp: Sat, 20 Jan 2024 12:42:27 +0530 Taints: node-role.kubernetes.io/infra:NoSchedule Unschedulable: false Lease: HolderIdentity: krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24 AcquireTime: <unset> RenewTime: Sat, 20 Jan 2024 19:05:22 +0530 Conditions: Type Status LastHeartbeatTime LastTransitionTime Reason Message ---- ------ ----------------- ------------------ ------ ------- MemoryPressure False Sat, 20 Jan 2024 19:03:53 +0530 Sat, 20 Jan 2024 12:42:27 +0530 KubeletHasSufficientMemory kubelet has sufficient memory available DiskPressure False Sat, 20 Jan 2024 19:03:53 +0530 Sat, 20 Jan 2024 12:42:27 +0530 KubeletHasNoDiskPressure kubelet has no disk pressure PIDPressure False Sat, 20 Jan 2024 19:03:53 +0530 Sat, 20 Jan 2024 12:42:27 +0530 KubeletHasSufficientPID kubelet has sufficient PID available Ready True Sat, 20 Jan 2024 19:03:53 +0530 Sat, 20 Jan 2024 12:43:14 +0530 KubeletReady kubelet is posting ready status Addresses: Hostname: krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24 InternalIP: 10.0.2.9 Capacity: attachable-volumes-azure-disk: 32 cpu: 16 ephemeral-storage: 133626860Ki hugepages-1Gi: 0 hugepages-2Mi: 0 memory: 131896464Ki pods: 250 Allocatable: attachable-volumes-azure-disk: 32 cpu: 15890m ephemeral-storage: 122076772149 hugepages-1Gi: 0 hugepages-2Mi: 0 memory: 122356880Ki pods: 250 System Info: Machine ID: afba4259ded84f4a9f04d3326502f9b8 System UUID: 88dc10fd-f0bb-9f46-ab1b-d5ddd6a21230 Boot ID: d764dbf4-7576-4e7c-8ba0-2bfb30bceeb7 Kernel Version: 5.14.0-284.41.1.el9_2.x86_64 OS Image: Red Hat Enterprise Linux CoreOS 413.92.202311151359-0 (Plow) Operating System: linux Architecture: amd64 Container Runtime Version: cri-o://1.26.4-5.1.rhaos4.13.git969e013.el9 Kubelet Version: 
v1.26.9+636f2be Kube-Proxy Version: v1.26.9+636f2be ProviderID: azure:///subscriptions/0f15d975-3110-4b83-934d-802cb96d6cc4/resourceGroups/aro-ujio6jgc/providers/Microsoft.Compute/virtualMachines/krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24 Non-terminated Pods: (19 in total) Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age --------- ---- ------------ ---------- --------------- ------------- --- openshift-azure-logging mdsd-n7772 10m (0%) 200m (1%) 100Mi (0%) 1000Mi (0%) 6h22m openshift-cloud-controller-manager azure-cloud-node-manager-6r8cf 50m (0%) 0 (0%) 50Mi (0%) 0 (0%) 131m openshift-cluster-csi-drivers azure-disk-csi-driver-node-5ztpg 30m (0%) 0 (0%) 150Mi (0%) 0 (0%) 120m openshift-cluster-csi-drivers azure-file-csi-driver-node-27hc6 30m (0%) 0 (0%) 150Mi (0%) 0 (0%) 120m openshift-cluster-node-tuning-operator tuned-nsqx2 10m (0%) 0 (0%) 50Mi (0%) 0 (0%) 124m openshift-dns node-resolver-jtwsl 5m (0%) 0 (0%) 21Mi (0%) 0 (0%) 6h22m openshift-image-registry image-registry-544cff88b8-d7xrg 100m (0%) 0 (0%) 256Mi (0%) 0 (0%) 126m openshift-image-registry node-ca-nq9kg 10m (0%) 0 (0%) 10Mi (0%) 0 (0%) 125m openshift-ingress-canary ingress-canary-9sphw 10m (0%) 0 (0%) 20Mi (0%) 0 (0%) 125m openshift-ingress router-default-789c7cdb4-vdm2q 100m (0%) 0 (0%) 256Mi (0%) 0 (0%) 126m openshift-machine-config-operator machine-config-daemon-l4hch 40m (0%) 0 (0%) 100Mi (0%) 0 (0%) 6h22m openshift-monitoring alertmanager-main-0 9m (0%) 0 (0%) 120Mi (0%) 0 (0%) 125m openshift-monitoring node-exporter-497kt 9m (0%) 0 (0%) 47Mi (0%) 0 (0%) 126m openshift-monitoring prometheus-k8s-0 75m (0%) 0 (0%) 1104Mi (0%) 0 (0%) 123m openshift-multus multus-additional-cni-plugins-blbj6 10m (0%) 0 (0%) 10Mi (0%) 0 (0%) 110m openshift-multus multus-jdk8x 10m (0%) 0 (0%) 65Mi (0%) 0 (0%) 112m openshift-multus network-metrics-daemon-8qh4n 20m (0%) 0 (0%) 120Mi (0%) 0 (0%) 117m openshift-network-diagnostics network-check-target-5l9k2 10m (0%) 0 (0%) 15Mi (0%) 
0 (0%) 109m openshift-ovn-kubernetes ovnkube-node-fqbj2 85m (0%) 0 (0%) 1650Mi (1%) 0 (0%) 54m Allocated resources: (Total limits may be over 100 percent, i.e., overcommitted.) Resource Requests Limits -------- -------- ------ cpu 623m (3%) 200m (1%) memory 4294Mi (3%) 1000Mi (0%) ephemeral-storage 0 (0%) 0 (0%) hugepages-1Gi 0 (0%) 0 (0%) hugepages-2Mi 0 (0%) 0 (0%) attachable-volumes-azure-disk 0 0 Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal RegisteredNode 138m node-controller Node krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24 event: Registered Node krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24 in Controller Normal RegisteredNode 137m node-controller Node krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24 event: Registered Node krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24 in Controller Normal RegisteredNode 135m node-controller Node krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24 event: Registered Node krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-2-89l24 in Controller
$
========= ========= ========= ========= ========= ========= =========
$ oc get po -n openshift-multus -o wide | grep -i CrashLoop
multus-4tpp8   0/1   CrashLoopBackOff   15 (4m35s ago)   119m   10.0.2.197   krishvoor-v5-252-jdfwd-worker-eastus1-6bxnh                   <none>   <none>
multus-5vrvh   0/1   CrashLoopBackOff   14 (2m13s ago)   121m   10.0.2.7     krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-3-2tdjz   <none>   <none>
multus-hzsx2   0/1   CrashLoopBackOff   18 (2m23s ago)   119m   10.0.2.205   krishvoor-v5-252-jdfwd-worker-eastus1-cmv74                   <none>   <none>
multus-l9cmc   0/1   CrashLoopBackOff   13 (2m51s ago)   119m   10.0.2.195   krishvoor-v5-252-jdfwd-worker-eastus1-rdj5p                   <none>   <none>
multus-p6447   0/1   CrashLoopBackOff   14 (3m48s ago)   125m   10.0.2.31    krishvoor-v5-252-jdfwd-worker-eastus1-zc9nx                   <none>   <none>
========= ========= ========= ========= ========= ========= =========
$ oc describe po/multus-5vrvh -n openshift-multus Name: multus-5vrvh Namespace: 
openshift-multus Priority: 2000001000 Priority Class Name: system-node-critical Service Account: multus Node: krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-3-2tdjz/10.0.2.7 Start Time: Sat, 20 Jan 2024 17:11:44 +0530 Labels: app=multus component=network controller-revision-hash=56dc7685dc openshift.io/component=network pod-template-generation=2 type=infra Annotations: <none> Status: Running IP: 10.0.2.7 IPs: IP: 10.0.2.7 Controlled By: DaemonSet/multus Containers: kube-multus: Container ID: cri-o://8a780ff3fc132c40858c95e65502aff5ff51802db9b33829047c553d9ecfe9aa Image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d908f9fc9fadd371717dcbac7c7cde61a8fe5ea676f83fec5163848d25287fb0 Image ID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d908f9fc9fadd371717dcbac7c7cde61a8fe5ea676f83fec5163848d25287fb0 Port: <none> Host Port: <none> Command: /bin/bash -ec -- Args: /entrypoint/cnibincopy.sh; exec /usr/src/multus-cni/bin/multus-daemon State: Waiting Reason: CrashLoopBackOff Last State: Terminated Reason: Error Exit Code: 1 Started: Sat, 20 Jan 2024 19:10:27 +0530 Finished: Sat, 20 Jan 2024 19:11:12 +0530 Ready: False Restart Count: 14 Requests: cpu: 10m memory: 65Mi Environment: RHEL8_SOURCE_DIRECTORY: /usr/src/multus-cni/rhel8/bin/ RHEL9_SOURCE_DIRECTORY: /usr/src/multus-cni/rhel9/bin/ DEFAULT_SOURCE_DIRECTORY: /usr/src/multus-cni/bin/ KUBERNETES_SERVICE_PORT: 6443 KUBERNETES_SERVICE_HOST: api-int.ujio6jgc.eastus.aroapp.io MULTUS_NODE_NAME: (v1:spec.nodeName) K8S_NODE: (v1:spec.nodeName) Mounts: /entrypoint from cni-binary-copy (rw) /etc/cni/multus/certs from host-run-multus-certs (rw) /etc/cni/multus/net.d from multus-conf-dir (rw) /etc/cni/net.d/multus.d from multus-daemon-config (ro) /etc/kubernetes from etc-kubernetes (rw) /host/etc/cni/net.d from system-cni-dir (rw) /host/etc/os-release from os-release (rw) /host/opt/cni/bin from cnibin (rw) /host/run/multus from multus-socket-dir-parent (rw) /host/run/multus/cni/net.d from multus-cni-dir (rw) 
/hostroot from hostroot (rw) /run/k8s.cni.cncf.io from host-run-k8s-cni-cncf-io (rw) /run/netns from host-run-netns (rw) /var/lib/cni/bin from host-var-lib-cni-bin (rw) /var/lib/cni/multus from host-var-lib-cni-multus (rw) /var/lib/kubelet from host-var-lib-kubelet (rw) /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-2wkvc (ro) Conditions: Type Status Initialized True Ready False ContainersReady False PodScheduled True Volumes: system-cni-dir: Type: HostPath (bare host directory volume) Path: /etc/kubernetes/cni/net.d HostPathType: Directory multus-cni-dir: Type: HostPath (bare host directory volume) Path: /var/run/multus/cni/net.d HostPathType: Directory cnibin: Type: HostPath (bare host directory volume) Path: /var/lib/cni/bin HostPathType: Directory os-release: Type: HostPath (bare host directory volume) Path: /etc/os-release HostPathType: File cni-binary-copy: Type: ConfigMap (a volume populated by a ConfigMap) Name: cni-copy-resources Optional: false multus-socket-dir-parent: Type: HostPath (bare host directory volume) Path: /run/multus HostPathType: DirectoryOrCreate host-run-k8s-cni-cncf-io: Type: HostPath (bare host directory volume) Path: /run/k8s.cni.cncf.io HostPathType: host-run-netns: Type: HostPath (bare host directory volume) Path: /run/netns/ HostPathType: host-var-lib-cni-bin: Type: HostPath (bare host directory volume) Path: /var/lib/cni/bin HostPathType: host-var-lib-cni-multus: Type: HostPath (bare host directory volume) Path: /var/lib/cni/multus HostPathType: host-var-lib-kubelet: Type: HostPath (bare host directory volume) Path: /var/lib/kubelet HostPathType: hostroot: Type: HostPath (bare host directory volume) Path: / HostPathType: multus-conf-dir: Type: HostPath (bare host directory volume) Path: /etc/cni/multus/net.d HostPathType: multus-daemon-config: Type: ConfigMap (a volume populated by a ConfigMap) Name: multus-daemon-config Optional: false host-run-multus-certs: Type: HostPath (bare host directory volume) Path: 
/etc/cni/multus/certs HostPathType: etc-kubernetes: Type: HostPath (bare host directory volume) Path: /etc/kubernetes HostPathType: kube-api-access-2wkvc: Type: Projected (a volume that contains injected data from multiple sources) TokenExpirationSeconds: 3607 ConfigMapName: kube-root-ca.crt ConfigMapOptional: <nil> DownwardAPI: true ConfigMapName: openshift-service-ca.crt ConfigMapOptional: <nil> QoS Class: Burstable Node-Selectors: kubernetes.io/os=linux Tolerations: op=Exists Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 122m default-scheduler Successfully assigned openshift-multus/multus-5vrvh to krishvoor-v5-252-jdfwd-infra-aro-machinesets-eastus-3-2tdjz Normal Pulling 122m kubelet Pulling image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d908f9fc9fadd371717dcbac7c7cde61a8fe5ea676f83fec5163848d25287fb0" Normal Pulled 122m kubelet Successfully pulled image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d908f9fc9fadd371717dcbac7c7cde61a8fe5ea676f83fec5163848d25287fb0" in 30.920745864s (30.920756564s including waiting) Normal Created 51m (x6 over 122m) kubelet Created container kube-multus Normal Started 51m (x6 over 122m) kubelet Started container kube-multus Normal Pulled 51m (x5 over 114m) kubelet Container image "quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:d908f9fc9fadd371717dcbac7c7cde61a8fe5ea676f83fec5163848d25287fb0" already present on machine Warning BackOff 4m50s (x190 over 54m) kubelet Back-off restarting failed container kube-multus in pod multus-5vrvh_openshift-multus(bf3babb5-92cb-4fd4-97c5-9069e89774f6) $ ========= ========= ========= ========= ========= ========= ========= $ oc logs multus-5vrvh -n openshift-multus 2024-01-20T13:40:27+00:00 [cnibincopy] Successfully copied files in /usr/src/multus-cni/rhel9/bin/ to /host/opt/cni/bin/upgrade_2e6f5308-8942-4828-b5af-4743c07c54a9 2024-01-20T13:40:27+00:00 [cnibincopy] Successfully moved files in 
/host/opt/cni/bin/upgrade_2e6f5308-8942-4828-b5af-4743c07c54a9 to /host/opt/cni/bin/ 2024-01-20T13:40:27Z [verbose] multus-daemon started 2024-01-20T13:40:27Z [verbose] Readiness Indicator file check 2024-01-20T13:41:12Z [error] have you checked that your default network is ready? still waiting for readinessindicatorfile @ /host/run/multus/cni/net.d/10-ovn-kubernetes.conf. pollimmediate error: timed out waiting for the condition $
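The log above explains the CrashLoopBackOff: multus-daemon polls for the default network's readiness indicator file (here `/host/run/multus/cni/net.d/10-ovn-kubernetes.conf`, written once OVN-Kubernetes is up) and exits 1 when the poll times out. Because `ovnkube-node-fqbj2` is itself crashlooping, the file never appears, so multus restarts in lockstep. A minimal shell sketch of that wait-for-file behavior (illustrative only; the function name and loop are hypothetical, not the actual multus-daemon Go code, and the ~45s window merely mirrors the Started/Finished timestamps above):

```shell
# Sketch: poll for the default-network CNI config until it exists,
# or give up after a timeout, mimicking the readinessindicatorfile
# check that multus-daemon logs before exiting with code 1.
wait_for_readiness() {
  file=$1
  timeout=${2:-45}   # seconds; illustrative, mirrors the log above
  waited=0
  until [ -f "$file" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "still waiting for readinessindicatorfile @ $file: timed out" >&2
      return 1       # multus-daemon exits non-zero -> CrashLoopBackOff
    fi
    sleep 1
    waited=$((waited + 1))
  done
  echo "default network ready: $file"
}

# In this bug the equivalent of the call below never succeeds, because
# ovnkube-node is crashlooping and never writes the file:
# wait_for_readiness /host/run/multus/cni/net.d/10-ovn-kubernetes.conf 45
```

The practical implication is that the multus pods are a symptom, not the cause; the rollout is blocked by whatever is crashing `ovnkube-node`, which is where debugging should start.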
- relates to: OCPBUGS-27061 "[ARO] OCP Upgrade at load (4.12.25 --> 4.13.24) Failed" (Closed)