Type: Bug
Resolution: Cannot Reproduce
Priority: Major
Fix Version/s: None
Affects Version/s: None
Component/s: None
Labels:
- ci-issues

Blocked:
False
Blocked Reason:
None
Ready:
False
[QE] How to address?:
---
Intelligence Requested:
Market:

Sprint:
SDN Sprint 230
Cost of Delay:
0
WSJF:
0

SFDC Cases Links:
SFDC Cases Counter:
SFDC Cases Open:

The following error occurs frequently in vSphere CI jobs. A recent failed job for reference: 1591151696636022784

error: timed out waiting for the condition on clusteroperators/network
{"component":"entrypoint","error":"wrapped process failed: exit status 1","file":"k8s.io/test-infra/prow/entrypoint/run.go:79","func":"k8s.io/test-infra/prow/entrypoint.Options.Run","level":"error","msg":"Error executing test process","severity":"error","time":"2022-11-11T21:57:34Z"}
error: failed to execute wrapped command: exit status 1 
INFO[2022-11-11T21:57:36Z] Step vsphere-e2e-operator-ipi-install-vsphere-registry failed after 39m0s. 
INFO[2022-11-11T21:57:36Z] Step phase pre failed after 1h29m10s.

After an initial triage in a Slack thread with dcbw@redhat.com, this looks like a network-check daemonset issue in which one or more network-check-target pods are unavailable.

            "status": {
                "currentNumberScheduled": 6,
                "desiredNumberScheduled": 6,
                "numberAvailable": 5,
                "numberMisscheduled": 0,
                "numberReady": 5,
                "numberUnavailable": 1,
                "observedGeneration": 1,
                "updatedNumberScheduled": 6
            }
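The status above can be checked mechanically when triaging similar failures. A minimal sketch, assuming the DaemonSet status has been saved as JSON (for example from the job's gathered artifacts); the helper name `unavailable_count` is hypothetical and not part of any OpenShift tooling:

```python
import json


def unavailable_count(status: dict) -> int:
    """Return how many desired DaemonSet pods are not available.

    Hypothetical helper: compares desiredNumberScheduled with
    numberAvailable from a DaemonSet status block.
    """
    desired = status.get("desiredNumberScheduled", 0)
    available = status.get("numberAvailable", 0)
    return max(desired - available, 0)


# Status values copied from the report above.
status = json.loads("""
{
    "currentNumberScheduled": 6,
    "desiredNumberScheduled": 6,
    "numberAvailable": 5,
    "numberMisscheduled": 0,
    "numberReady": 5,
    "numberUnavailable": 1,
    "observedGeneration": 1,
    "updatedNumberScheduled": 6
}
""")

missing = unavailable_count(status)
print(f"{missing} pod(s) unavailable")
```

With the values from this report, the helper reports one unavailable pod, matching `numberUnavailable`.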

The node journal on the machine that is supposed to have that pod shows the following log:

Nov 01 16:45:26.044455 ci-op-h4g42fh5-23f97-4zn6r-master-0 kubenswrapper[2991]: I1101 16:45:26.044256    2991 prober.go:121] "Probe failed" probeType="Readiness" pod="openshift-network-diagnostics/network-check-target-rdrpd" podUID=c0b15643-55e8-450e-bb70-ad76391cd012 containerName="network-check-target-container" probeResult=failure output="Get \"http://10.128.0.3:8080/\": context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
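To find other occurrences across node journals, the key=value fields of such a line can be pulled out with a small parser. This is a sketch only, assuming the log format shown above; `FIELD_RE` and `parse_probe_failure` are hypothetical names introduced for illustration:

```python
import re

# Matches key=value pairs where the value is either a double-quoted
# string (backslash escapes allowed) or a bare token without spaces.
FIELD_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_probe_failure(line: str) -> dict:
    """Extract key=value fields from a "Probe failed" journal line.

    Returns an empty dict when the line is not a probe failure.
    """
    if '"Probe failed"' not in line:
        return {}
    fields = {}
    for key, raw in FIELD_RE.findall(line):
        # Strip the surrounding quotes; inner escapes are left as-is.
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            raw = raw[1:-1]
        fields[key] = raw
    return fields


# Journal line copied from the report above.
line = (
    r'Nov 01 16:45:26.044455 ci-op-h4g42fh5-23f97-4zn6r-master-0 '
    r'kubenswrapper[2991]: I1101 16:45:26.044256    2991 prober.go:121] '
    r'"Probe failed" probeType="Readiness" '
    r'pod="openshift-network-diagnostics/network-check-target-rdrpd" '
    r'podUID=c0b15643-55e8-450e-bb70-ad76391cd012 '
    r'containerName="network-check-target-container" '
    r'probeResult=failure '
    r'output="Get \"http://10.128.0.3:8080/\": context deadline exceeded '
    r'(Client.Timeout exceeded while awaiting headers)"'
)

fields = parse_probe_failure(line)
print(fields["probeType"], fields["pod"], fields["probeResult"])
```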

links to

openshift/windows-machine-config-operator#1373: WIP: Please ignore

Assignee: Flavio Fernandes (Inactive)

Reporter: Jose Valdes

Votes: 0

Watchers: 2

Created: 2022/11/23 4:21 PM

Updated: 2023/01/06 9:53 PM

Resolved: 2023/01/06 9:53 PM
