-
Bug
-
Resolution: Done-Errata
-
Major
-
4.16.0
-
No
-
SDN Sprint 252
-
1
-
Proposed
-
False
-
Description of problem:
The ovn-ipsec-host pods are crashlooping on a 24 node cluster.
Version-Release number of selected component (if applicable):
4.16.0, master
How reproducible:
https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_release/50690/rehearse-50690-pull-ci-openshift-qe-ocp-qe-perfscale-ci-main-azure-4.15-nightly-x86-control-plane-ipsec-24nodes/1780216294851743744
Steps to Reproduce:
Run the rehearse test for the PR https://github.com/openshift/release/pull/50690
Actual results:
The CI lane fails at the control-plane-ipsec-24nodes-ipi-install-install step. The ipsec pod logs the following errors:
2024-04-16T14:18:01.158407293Z + counter=0
2024-04-16T14:18:01.158407293Z + '[' -f /etc/cni/net.d/10-ovn-kubernetes.conf ']'
2024-04-16T14:18:01.158512920Z ovnkube-node has configured node.
2024-04-16T14:18:01.158519623Z + echo 'ovnkube-node has configured node.'
2024-04-16T14:18:01.158519623Z + pgrep pluto
2024-04-16T14:18:01.166444142Z pluto is not running, enable the service and/or check system logs
2024-04-16T14:18:01.166465551Z + echo 'pluto is not running, enable the service and/or check system logs'
2024-04-16T14:18:01.166465551Z + exit 2
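The trace above interleaves bash xtrace lines (prefixed with "+ ") with the script's own output, so it can help to separate the two and pull out the final exit status. The following is a minimal Python sketch of that, not part of the pod's script; the names split_trace, exit_code, and RAW_LOG are hypothetical and chosen only for this illustration.

```python
import re

# Each record starts at a timestamp with fractional seconds, as in the pod log.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z ")

# The log from this report, flattened onto one line as it was pasted.
RAW_LOG = (
    "2024-04-16T14:18:01.158407293Z + counter=0 "
    "2024-04-16T14:18:01.158407293Z + '[' -f /etc/cni/net.d/10-ovn-kubernetes.conf ']' "
    "2024-04-16T14:18:01.158512920Z ovnkube-node has configured node. "
    "2024-04-16T14:18:01.158519623Z + echo 'ovnkube-node has configured node.' "
    "2024-04-16T14:18:01.158519623Z + pgrep pluto "
    "2024-04-16T14:18:01.166444142Z pluto is not running, enable the service and/or check system logs "
    "2024-04-16T14:18:01.166465551Z + echo 'pluto is not running, enable the service and/or check system logs' "
    "2024-04-16T14:18:01.166465551Z + exit 2"
)


def split_trace(raw):
    """Split a flattened log into (timestamp, kind, text) records.

    kind is "trace" for xtrace lines (text starts with "+ ") and
    "output" for anything the script printed itself.
    """
    records = []
    matches = list(TIMESTAMP.finditer(raw))
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(raw)
        text = raw[match.end():end].strip()
        stamp = match.group().strip()
        if text.startswith("+ "):
            records.append((stamp, "trace", text[2:]))
        else:
            records.append((stamp, "output", text))
    return records


def exit_code(records):
    """Return the status of the last traced `exit N` command, or None."""
    for _, kind, text in reversed(records):
        found = re.fullmatch(r"exit (\d+)", text)
        if kind == "trace" and found:
            return int(found.group(1))
    return None


if __name__ == "__main__":
    parsed = split_trace(RAW_LOG)
    for stamp, kind, text in parsed:
        print(f"{stamp} [{kind}] {text}")
    print("exit status:", exit_code(parsed))
```

Read this way, the trace shows that the CNI config check succeeded and that the container exited with status 2 right after `pgrep pluto` found no process.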
Expected results:
The install step passes and the CI lane succeeds.
Additional info:
The mcp status for the worker pool contains the following:
status:
  certExpirys:
  - bundle: KubeAPIServerServingCAData
    expiry: "2034-04-14T12:58:49Z"
    subject: CN=admin-kubeconfig-signer,OU=openshift
  - bundle: KubeAPIServerServingCAData
    expiry: "2024-04-17T12:58:51Z"
    subject: CN=kube-csr-signer_@1713274017
  - bundle: KubeAPIServerServingCAData
    expiry: "2024-04-17T12:58:51Z"
    subject: CN=kubelet-signer,OU=openshift
  - bundle: KubeAPIServerServingCAData
    expiry: "2025-04-16T12:58:51Z"
    subject: CN=kube-apiserver-to-kubelet-signer,OU=openshift
  - bundle: KubeAPIServerServingCAData
    expiry: "2025-04-16T12:58:51Z"
    subject: CN=kube-control-plane-signer,OU=openshift
  - bundle: KubeAPIServerServingCAData
    expiry: "2034-04-14T12:58:50Z"
    subject: CN=kubelet-bootstrap-kubeconfig-signer,OU=openshift
  - bundle: KubeAPIServerServingCAData
    expiry: "2025-04-16T13:26:54Z"
    subject: CN=openshift-kube-apiserver-operator_node-system-admin-signer@1713274014
  conditions:
  - lastTransitionTime: "2024-04-16T13:28:53Z"
    message: ""
    reason: ""
    status: "False"
    type: RenderDegraded
  - lastTransitionTime: "2024-04-16T13:34:52Z"
    message: ""
    reason: ""
    status: "False"
    type: Updated
  - lastTransitionTime: "2024-04-16T13:35:08Z"
    message: ""
    reason: ""
    status: "False"
    type: NodeDegraded
  - lastTransitionTime: "2024-04-16T13:35:08Z"
    message: ""
    reason: ""
    status: "False"
    type: Degraded
  - lastTransitionTime: "2024-04-16T13:34:52Z"
    message: All nodes are updating to MachineConfig rendered-worker-226a284eb61d46506202285ee1cf4688
    reason: ""
    status: "True"
    type: Updating
  configuration:
    name: rendered-worker-95c2861c75a83c0523dcba922c3b9982
    source:
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 98-worker-generated-kubelet
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 97-worker-generated-kubelet
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 99-worker-generated-registries
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 01-worker-container-runtime
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 01-worker-kubelet
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 80-ipsec-worker-extensions
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 99-worker-ssh
    - apiVersion: machineconfiguration.openshift.io/v1
      kind: MachineConfig
      name: 00-worker
  degradedMachineCount: 0
  machineCount: 24
  observedGeneration: 140
  readyMachineCount: 8
  unavailableMachineCount: 1
  updatedMachineCount: 8
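The counts and conditions in this status can be reduced to a short rollout summary. The Python sketch below is illustrative only: the STATUS dict copies a subset of the fields above, rollout_summary is a hypothetical helper, and extracting the target config name from the Updating condition's message is an assumption based on the single message shown in this report.

```python
import re

# Subset of the worker pool status from this report; other fields omitted.
STATUS = {
    "conditions": [
        {"type": "Updated", "status": "False", "message": ""},
        {"type": "Degraded", "status": "False", "message": ""},
        {
            "type": "Updating",
            "status": "True",
            "message": "All nodes are updating to MachineConfig "
                       "rendered-worker-226a284eb61d46506202285ee1cf4688",
        },
    ],
    "configuration": {"name": "rendered-worker-95c2861c75a83c0523dcba922c3b9982"},
    "machineCount": 24,
    "readyMachineCount": 8,
    "unavailableMachineCount": 1,
    "updatedMachineCount": 8,
}


def rollout_summary(status):
    """Summarize how far a pool rollout has progressed."""
    updating = next(
        (c for c in status["conditions"] if c["type"] == "Updating"), None
    )
    in_progress = updating is not None and updating["status"] == "True"

    target = None
    if in_progress:
        # Assumption: the message names the target rendered config.
        match = re.search(r"rendered-[\w-]+", updating["message"])
        target = match.group(0) if match else None

    return {
        "current": status["configuration"]["name"],
        "target": target,
        "in_progress": in_progress,
        "pending": status["machineCount"] - status["updatedMachineCount"],
    }


if __name__ == "__main__":
    print(rollout_summary(STATUS))
```

For the status in this report, the summary shows a rollout still in progress, with 16 of 24 machines not yet on the target rendered config and the pool's recorded configuration name differing from the one named in the Updating message.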
- blocks
-
SDN-4313 add e2e ipsec upgrade ci lane to prow
- Closed
-
SDN-4384 revise all prow ipsec lanes
- Closed
- links to
-
RHEA-2024:0041 OpenShift Container Platform 4.16.z bug fix update