OpenShift Bugs / OCPBUGS-16322

one router pod was not ready after the cert-rotation test

    • No
    • 1
    • Sprint 239
    • 1
    • False

      Description of problem:

      The cases "OCP-34247 OCP-38672 OCP-16872 OCP-15995 OCP-12088" failed during the 4.8 cert-rotation e2e test (test run: 20230718-0048). After rerunning OCP-34247 and checking the log, pod router-default-6f45f7cd9d-5d4fg was trying to land on a node and did not become ready within the allowed time (RuntimeError).
      
      

      Version-Release number of selected component (if applicable):

      4.8.0-0.nightly-2023-07-14-130620

      How reproducible:

      100% on the cluster

      Steps to Reproduce:

      1. Create an AWS cluster with the "private-templates/functionality-testing/aos-4_8/ipi-on-aws/versioned-installer-fips-ovn-ci" profile.
      
      2. Update it from 4.8.57-x86_64 to 4.8.0-0.nightly-2023-07-14-130620.
      
      3. Regenerate the certificates for the whole cluster.
      
      4. Run the e2e test; some cases fail.
      
      5. Rerun OCP-34247; it fails again. Check the log at
      https://mastern-jenkins-csb-openshift-qe.apps.ocp-c1.prod.psi.redhat.com/job/ocp-common/job/Runner/830786/console
      6. The script creates a project and checks all router pods with "oc get pods --output=yaml -l ingresscontroller.operator.openshift.io/deployment-ingresscontroller\=default --kubeconfig=/home/jenkins/ws/workspace/ocp-common/Runner/workdir/ocp4_admin.kubeconfig -n openshift-ingress"
      
      At first, router pod router-default-6f45f7cd9d-5d4fg was trying to land on ip-10-0-129-29.us-east-2.compute.internal, but a TLS handshake error occurred (http: TLS handshake error from 10.131.0.16:42238: write tcp 10.128.2.5:1936->10.131.0.16:42238: write: broken pipe). It then tried to land on node ip-10-0-218-49.us-east-2.compute.internal, which already had a router pod. I will attach the log later.
      
      name: router-default-6f45f7cd9d-5d4fg
      nodeName: ip-10-0-129-29.us-east-2.compute.internal
      
      name: router-default-6f45f7cd9d-5klth
      nodeName: ip-10-0-218-49.us-east-2.compute.internal
      
      name: router-default-6f45f7cd9d-m645f
      nodeName: ip-10-0-175-168.us-east-2.compute.internal
      
      name: router-default-6f45f7cd9d-5d4fg
      nodeName: ip-10-0-218-49.us-east-2.compute.internal
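
The name/nodeName check above can also be done programmatically. The following is a minimal sketch, not the test harness's own code: it assumes the pod list was fetched as JSON (for example with `-o json` instead of `--output=yaml`) and parsed into a dictionary. The function names and the sample data (router-a, node-1, and so on) are hypothetical and only illustrate the shape of a PodList.

```python
from collections import defaultdict


def summarize_router_pods(pod_list):
    """Return (name, nodeName, ready) tuples for a parsed PodList dictionary."""
    rows = []
    for pod in pod_list.get("items", []):
        name = pod["metadata"]["name"]
        # nodeName is absent while the pod has not been scheduled yet.
        node = pod.get("spec", {}).get("nodeName")
        conditions = pod.get("status", {}).get("conditions", [])
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        )
        rows.append((name, node, ready))
    return rows


def nodes_with_multiple_pods(rows):
    """Return {nodeName: [pod names]} for nodes that host more than one pod."""
    by_node = defaultdict(list)
    for name, node, _ready in rows:
        if node is not None:
            by_node[node].append(name)
    return {node: names for node, names in by_node.items() if len(names) > 1}


# Hypothetical sample data, not taken from the failing cluster.
SAMPLE = {
    "items": [
        {"metadata": {"name": "router-a"}, "spec": {"nodeName": "node-1"},
         "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
        {"metadata": {"name": "router-b"}, "spec": {"nodeName": "node-2"},
         "status": {"conditions": [{"type": "Ready", "status": "True"}]}},
        {"metadata": {"name": "router-c"}, "spec": {"nodeName": "node-2"},
         "status": {"conditions": [{"type": "Ready", "status": "False"}]}},
    ]
}

if __name__ == "__main__":
    rows = summarize_router_pods(SAMPLE)
    for name, node, ready in rows:
        print(f"{name}\t{node}\tready={ready}")
    print("nodes with more than one router pod:", nodes_with_multiple_pods(rows))
```

On a live cluster, the dictionary could be produced by passing the command's JSON output through `json.loads`; a pod that is not ready on a node shared with another router pod would show up in both results.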

      Actual results:

      Pod router-default-6f45f7cd9d-5d4fg is not ready.

      Expected results:

      Pod router-default-6f45f7cd9d-5d4fg lands on an appropriate node and becomes ready.
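
A "becomes ready within the allowed time" expectation like this one is typically checked with a polling loop. The sketch below is an assumption about how such a check could look, not the e2e framework's actual implementation: `wait_until_ready`, its parameters, and the default timeout and interval are hypothetical, and the readiness probe is passed in as a callable so that the loop does not depend on any cluster API.

```python
import time


def wait_until_ready(is_ready, timeout_s=300.0, interval_s=5.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll is_ready() until it returns True or timeout_s seconds elapse.

    Returns True if the check succeeded, False if the deadline passed first.
    The check always runs at least once, even with a zero timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Hypothetical probe: reports ready on the third poll.
    answers = iter([False, False, True])
    ok = wait_until_ready(lambda: next(answers), timeout_s=10, interval_s=0)
    print("ready" if ok else "timed out")
```

A caller that wants the harness behaviour described in this report could raise an error when the function returns False.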

      Additional info:

       

              mmasters1@redhat.com Miciah Masters
              shudili@redhat.com Shudi Li