  1. OpenShift SDN
  2. SDN-3603

Timed out waiting for the condition on clusteroperators/network in vSphere CI


Details

    • Bug
    • Resolution: Cannot Reproduce
    • Major
    • SDN Sprint 230

    Description

      The following error occurs frequently in vSphere CI jobs. A recent failed job, for reference: 1591151696636022784

      error: timed out waiting for the condition on clusteroperators/network
      {"component":"entrypoint","error":"wrapped process failed: exit status 1","file":"k8s.io/test-infra/prow/entrypoint/run.go:79","func":"k8s.io/test-infra/prow/entrypoint.Options.Run","level":"error","msg":"Error executing test process","severity":"error","time":"2022-11-11T21:57:34Z"}
      error: failed to execute wrapped command: exit status 1 
      INFO[2022-11-11T21:57:36Z] Step vsphere-e2e-operator-ipi-install-vsphere-registry failed after 39m0s. 
      INFO[2022-11-11T21:57:36Z] Step phase pre failed after 1h29m10s.    
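When a wait on clusteroperators/network times out, a useful first step is to see which status conditions on the operator are not in their expected state. The sketch below is illustrative only: it assumes the ClusterOperator JSON (for example, as printed by `oc get clusteroperator network -o json`) carries the conventional `Available`, `Progressing`, and `Degraded` condition types, and the sample object and its message are hypothetical, not taken from the failed job.

```python
# Sketch: list ClusterOperator conditions that would keep a wait from succeeding.
# Assumption: the operator is healthy when Available=True, Progressing=False,
# and Degraded=False. The sample data below is hypothetical.

EXPECTED = {"Available": "True", "Progressing": "False", "Degraded": "False"}


def blocking_conditions(cluster_operator: dict) -> list:
    """Return a description of each condition that is not in its expected state."""
    problems = []
    conditions = cluster_operator.get("status", {}).get("conditions", [])
    for cond in conditions:
        want = EXPECTED.get(cond.get("type"))
        if want is not None and cond.get("status") != want:
            message = cond.get("message", "")
            problems.append(f'{cond["type"]}={cond["status"]}: {message}')
    return problems


# Hypothetical ClusterOperator object, trimmed to the fields used above.
sample_operator = {
    "status": {
        "conditions": [
            {"type": "Available", "status": "True"},
            {"type": "Progressing", "status": "True",
             "message": "example message (hypothetical)"},
            {"type": "Degraded", "status": "False"},
        ]
    }
}

for problem in blocking_conditions(sample_operator):
    print(problem)
```

An empty result would suggest the operator had settled by the time it was inspected, which is worth noting when the failure is intermittent.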
      

       

      After initial triage in a Slack thread with dcbw@redhat.com, this looks like a network-check daemonset issue, where one or more network-check-target pods are unavailable.

                  "status": {
                      "currentNumberScheduled": 6,
                      "desiredNumberScheduled": 6,
                      "numberAvailable": 5,
                      "numberMisscheduled": 0,
                      "numberReady": 5,
                      "numberUnavailable": 1,
                      "observedGeneration": 1,
                      "updatedNumberScheduled": 6
                  }
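The status above can be checked mechanically when scanning many CI artifacts. The following is a minimal sketch, assuming the daemonset has been dumped as JSON with a top-level `status` object like the one shown; the function name `unavailable_summary` is hypothetical.

```python
import json


def unavailable_summary(daemonset: dict) -> str:
    """Summarize readiness from a DaemonSet's status counters."""
    status = daemonset.get("status", {})
    desired = status.get("desiredNumberScheduled", 0)
    ready = status.get("numberReady", 0)
    # numberUnavailable may be absent when every pod is available.
    unavailable = status.get("numberUnavailable", 0)
    if unavailable == 0 and ready == desired:
        return f"healthy: {ready}/{desired} ready"
    return f"degraded: {ready}/{desired} ready, {unavailable} unavailable"


# Status values copied from the excerpt in this report.
sample = json.loads("""
{
  "status": {
    "currentNumberScheduled": 6,
    "desiredNumberScheduled": 6,
    "numberAvailable": 5,
    "numberMisscheduled": 0,
    "numberReady": 5,
    "numberUnavailable": 1,
    "observedGeneration": 1,
    "updatedNumberScheduled": 6
  }
}
""")

print(unavailable_summary(sample))  # degraded: 5/6 ready, 1 unavailable
```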
      

      The journal on the node that should be running that pod shows the following entry:

      Nov 01 16:45:26.044455 ci-op-h4g42fh5-23f97-4zn6r-master-0 kubenswrapper[2991]: I1101 16:45:26.044256    2991 prober.go:121] "Probe failed" probeType="Readiness" pod="openshift-network-diagnostics/network-check-target-rdrpd" podUID=c0b15643-55e8-450e-bb70-ad76391cd012 containerName="network-check-target-container" probeResult=failure output="Get \"http://10.128.0.3:8080/\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
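To find every affected pod across node journals, the `key=value` fields of such "Probe failed" entries can be extracted with a small parser. This is a rough sketch that assumes the fields follow the quoting seen in the line above (values either bare or double-quoted with backslash-escaped quotes); `parse_probe_failure` is a hypothetical helper, not part of any OpenShift tooling.

```python
import re

# Matches key=value pairs where the value is either a double-quoted string
# (allowing backslash-escaped characters) or a run of non-whitespace.
FIELD_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_probe_failure(line: str) -> dict:
    """Extract key=value fields from a kubelet 'Probe failed' journal line."""
    fields = {}
    for key, raw in FIELD_RE.findall(line):
        if raw.startswith('"'):
            # Strip the surrounding quotes and unescape embedded quotes.
            raw = raw[1:-1].replace('\\"', '"')
        fields[key] = raw
    return fields


# The journal entry quoted in this report.
line = r'''Nov 01 16:45:26.044455 ci-op-h4g42fh5-23f97-4zn6r-master-0 kubenswrapper[2991]: I1101 16:45:26.044256    2991 prober.go:121] "Probe failed" probeType="Readiness" pod="openshift-network-diagnostics/network-check-target-rdrpd" podUID=c0b15643-55e8-450e-bb70-ad76391cd012 containerName="network-check-target-container" probeResult=failure output="Get \"http://10.128.0.3:8080/\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"'''

fields = parse_probe_failure(line)
print(fields["probeType"], fields["pod"], fields["probeResult"])
print(fields["output"])
```

Grouping the parsed entries by `pod` and `probeType` makes it easier to tell a single flaky pod from a node-wide connectivity problem.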
      

       

       

       

          People

            ffernand@redhat.com Flavio Fernandes (Inactive)
            jvaldes@redhat.com Jose Valdes