
    • Type: Sub-task
    • Resolution: Done
    • Priority: Normal

      What happened?

      Improve the error handler for when the cluster is gone.

      The CLI correctly reports errors when communicating with kube-api, raising the information to the end user [1], but at the end there is a wrong message stating the "pods are ready" [2].


      [1] Error on the CLI when the cluster was gone


      Tue, 14 Jun 2022 20:13:29 -03> Global Status: running
      JOB_NAME                       | STATUS     | RESULTS    | PROGRESS                  | MESSAGE                                           
      openshift-conformance-validated | running    |            | 0/3250 (0 failures)       | status=waiting-for=openshift-kube-conformance=(0/-1/0)=[24/100]
      openshift-kube-conformance     | running    |            | 344/345 (0 failures)      | status=running                                    
      ERRO[2022-06-14T20:14:14-03:00] failed to get namespace openshift-provider-certification: Get "https://api.mrb14.devcluster.openshift.com:6443/api/v1/namespaces/openshift-provider-certification": http2: client connection lost
      ERRO[2022-06-14T20:14:19-03:00] failed to get namespace openshift-provider-certification: Get "https://api.mrb14.devcluster.openshift.com:6443/api/v1/namespaces/openshift-provider-certification": dial tcp: lookup api.mrb14.devcluster.openshift.com on 10.11.5.19:53: no such host   (... x8)
      ERRO[2022-06-14T20:15:29-03:00] failed to get namespace openshift-provider-certification: Get "https://api.mrb14.devcluster.openshift.com:6443/api/v1/namespaces/openshift-provider-certification": dial tcp: lookup api.mrb14.devcluster.openshift.com on 10.11.5.19:53: no such host 
      

       [2] Final message (unexpected), as the cluster was gone


      INFO[2022-06-14T20:15:38-03:00] Sonobuoy pods are ready!      

      Note: the logs above were collected from a dev environment where the cluster has a TTL of 10h; the cluster was deleted by the pruner while the certification tool was running.


      What did you expect to happen?

      1) We need to improve the final message when communication with the server is permanently lost. The current message is not correct.

      2) We could print a final error message after the connection times out (retries exhausted).
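      The expected behavior could be sketched as a retry loop that surfaces a terminal error once retries are exhausted, instead of falling through to the "pods are ready" success message. This is only a minimal sketch: `checkNamespace`, `waitForNamespace`, and the retry counts are hypothetical names and values, not the tool's actual API.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// checkNamespace is a hypothetical stand-in for the real kube-api call,
// which fails permanently once the cluster is gone.
func checkNamespace() error {
	return errors.New("dial tcp: lookup api.example.com: no such host")
}

// waitForNamespace retries the check and, instead of printing a success
// message unconditionally, returns a terminal error after retries are
// exhausted so the caller can report the lost connection.
func waitForNamespace(retries int, delay time.Duration) error {
	var lastErr error
	for i := 0; i < retries; i++ {
		if lastErr = checkNamespace(); lastErr == nil {
			return nil
		}
		time.Sleep(delay)
	}
	return fmt.Errorf("retries exhausted, connection to the cluster was permanently lost: %w", lastErr)
}

func main() {
	if err := waitForNamespace(3, 10*time.Millisecond); err != nil {
		// Terminal error path: no misleading success message is printed.
		fmt.Println("ERRO:", err)
		return
	}
	fmt.Println("Sonobuoy pods are ready!")
}
```

      The key design point is that the success message is only reachable when the last check actually succeeded; the permanent-failure path wraps the last observed error so the user sees why communication was lost.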

        Robert Bost (rhn-support-rbost)
        Marco Braga (rhn-support-mrbraga)
        Votes: 0
        Watchers: 2