OCPBUGS-32347

ovn-ipsec-host pods are crashlooping

    • No
    • SDN Sprint 252
    • 1
    • Proposed
    • False

      Description of problem:

      The ovn-ipsec-host pods are crashlooping on a 24-node cluster.

      Version-Release number of selected component (if applicable):

       4.16.0, master   

      How reproducible:

      Observed in the following rehearsal job:
      https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/openshift_release/50690/rehearse-50690-pull-ci-openshift-qe-ocp-qe-perfscale-ci-main-azure-4.15-nightly-x86-control-plane-ipsec-24nodes/1780216294851743744

      Steps to Reproduce:

      Run the rehearsal test for the PR https://github.com/openshift/release/pull/50690

      Actual results:

      The CI lane fails at the control-plane-ipsec-24nodes-ipi-install-install step.
      
      The ipsec pod logs show the following errors:
      
      2024-04-16T14:18:01.158407293Z + counter=0
      2024-04-16T14:18:01.158407293Z + '[' -f /etc/cni/net.d/10-ovn-kubernetes.conf ']'
      2024-04-16T14:18:01.158512920Z ovnkube-node has configured node.
      2024-04-16T14:18:01.158519623Z + echo 'ovnkube-node has configured node.'
      2024-04-16T14:18:01.158519623Z + pgrep pluto
      2024-04-16T14:18:01.166444142Z pluto is not running, enable the service and/or check system logs
      2024-04-16T14:18:01.166465551Z + echo 'pluto is not running, enable the service and/or check system logs'
      2024-04-16T14:18:01.166465551Z + exit 2
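
The trace above suggests a two-step check: confirm that ovnkube-node has written the CNI config, then confirm that pluto is running, and exit with code 2 if it is not. The following is a minimal Python sketch of that flow, not the actual pod script. The function names `pgrep` and `check_pluto` are hypothetical, and the branch for a missing CNI config file is an assumption, since the trace does not show what the script does in that case.

```python
import os
import subprocess
from typing import Callable, Tuple

# Path taken from the trace in this report.
CNI_CONF = "/etc/cni/net.d/10-ovn-kubernetes.conf"


def pgrep(name: str) -> bool:
    """Return True if `pgrep <name>` finds at least one matching process."""
    result = subprocess.run(
        ["pgrep", name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def check_pluto(
    conf_path: str = CNI_CONF,
    is_running: Callable[[str], bool] = pgrep,
) -> Tuple[int, str]:
    """Mirror the traced checks and return (exit_code, message)."""
    if not os.path.isfile(conf_path):
        # Assumption: this branch is not visible in the trace.
        return 1, "ovnkube-node has not configured node yet."
    if not is_running("pluto"):
        # Matches the message and exit code seen in the trace.
        return 2, "pluto is not running, enable the service and/or check system logs"
    return 0, "pluto is running."
```

Passing `is_running` as a parameter keeps the check testable without a real pluto process. In the failing run, the CNI config exists and the process lookup finds nothing, which corresponds to the `(2, ...)` result.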
      

      Expected results:

      The step should pass and the CI lane should eventually succeed.

      Additional info:

      The MachineConfigPool (mcp) status for the worker pool contains the following:
      status:
        certExpirys:
        - bundle: KubeAPIServerServingCAData
          expiry: "2034-04-14T12:58:49Z"
          subject: CN=admin-kubeconfig-signer,OU=openshift
        - bundle: KubeAPIServerServingCAData
          expiry: "2024-04-17T12:58:51Z"
          subject: CN=kube-csr-signer_@1713274017
        - bundle: KubeAPIServerServingCAData
          expiry: "2024-04-17T12:58:51Z"
          subject: CN=kubelet-signer,OU=openshift
        - bundle: KubeAPIServerServingCAData
          expiry: "2025-04-16T12:58:51Z"
          subject: CN=kube-apiserver-to-kubelet-signer,OU=openshift
        - bundle: KubeAPIServerServingCAData
          expiry: "2025-04-16T12:58:51Z"
          subject: CN=kube-control-plane-signer,OU=openshift
        - bundle: KubeAPIServerServingCAData
          expiry: "2034-04-14T12:58:50Z"
          subject: CN=kubelet-bootstrap-kubeconfig-signer,OU=openshift
        - bundle: KubeAPIServerServingCAData
          expiry: "2025-04-16T13:26:54Z"
          subject: CN=openshift-kube-apiserver-operator_node-system-admin-signer@1713274014
        conditions:
        - lastTransitionTime: "2024-04-16T13:28:53Z"
          message: ""
          reason: ""
          status: "False"
          type: RenderDegraded
        - lastTransitionTime: "2024-04-16T13:34:52Z"
          message: ""
          reason: ""
          status: "False"
          type: Updated
        - lastTransitionTime: "2024-04-16T13:35:08Z"
          message: ""
          reason: ""
          status: "False"
          type: NodeDegraded
        - lastTransitionTime: "2024-04-16T13:35:08Z"
          message: ""
          reason: ""
          status: "False"
          type: Degraded
        - lastTransitionTime: "2024-04-16T13:34:52Z"
          message: All nodes are updating to MachineConfig rendered-worker-226a284eb61d46506202285ee1cf4688
          reason: ""
          status: "True"
          type: Updating
        configuration:
          name: rendered-worker-95c2861c75a83c0523dcba922c3b9982
          source:
          - apiVersion: machineconfiguration.openshift.io/v1
            kind: MachineConfig
            name: 98-worker-generated-kubelet
          - apiVersion: machineconfiguration.openshift.io/v1
            kind: MachineConfig
            name: 97-worker-generated-kubelet
          - apiVersion: machineconfiguration.openshift.io/v1
            kind: MachineConfig
            name: 99-worker-generated-registries
          - apiVersion: machineconfiguration.openshift.io/v1
            kind: MachineConfig
            name: 01-worker-container-runtime
          - apiVersion: machineconfiguration.openshift.io/v1
            kind: MachineConfig
            name: 01-worker-kubelet
          - apiVersion: machineconfiguration.openshift.io/v1
            kind: MachineConfig
            name: 80-ipsec-worker-extensions
          - apiVersion: machineconfiguration.openshift.io/v1
            kind: MachineConfig
            name: 99-worker-ssh
          - apiVersion: machineconfiguration.openshift.io/v1
            kind: MachineConfig
            name: 00-worker
        degradedMachineCount: 0
        machineCount: 24
        observedGeneration: 140
        readyMachineCount: 8
        unavailableMachineCount: 1
        updatedMachineCount: 8
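
Read together, the counters and conditions above describe a worker pool that is still mid-rollout: `Updating` is `True`, `Updated` is `False`, and only 8 of 24 machines report the new configuration. A minimal Python sketch of that reading follows, assuming the status has been loaded into a dict (for example from `oc get mcp worker -o json`). The helper name `summarize_pool` is hypothetical, and the dict below copies only the fields it needs from the status in this report.

```python
def summarize_pool(status: dict) -> dict:
    """Summarize rollout progress from a MachineConfigPool status dict."""
    # Map each condition type to a boolean for easy lookup.
    conditions = {
        c["type"]: c["status"] == "True" for c in status.get("conditions", [])
    }
    return {
        # Machines that have not yet reached the target rendered config.
        "pending": status["machineCount"] - status["updatedMachineCount"],
        "updating": conditions.get("Updating", False),
        "updated": conditions.get("Updated", False),
        "degraded": conditions.get("Degraded", False),
    }


# Values copied from the worker pool status in this report.
status = {
    "machineCount": 24,
    "readyMachineCount": 8,
    "updatedMachineCount": 8,
    "unavailableMachineCount": 1,
    "degradedMachineCount": 0,
    "conditions": [
        {"type": "Updated", "status": "False"},
        {"type": "Updating", "status": "True"},
        {"type": "Degraded", "status": "False"},
    ],
}

print(summarize_pool(status))
```

With these values, 16 machines are still pending the update while the pool reports no degradation.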

            pepalani@redhat.com Periyasamy Palanisamy
            pepalani@redhat.com Periyasamy Palanisamy
            Huiran Wang Huiran Wang