- Bug
- Resolution: Unresolved
- Major
The CLI must check whether the plugin pods are healthy after a grace period. It is not clear whether Sonobuoy will eventually time out, but when the plugin image is unreachable for any reason, the CLI must show a clearer message.
In the example below the image was not available (this should not happen in production), but the CLI keeps reporting the jobs as running while the image pull is retried:
$ ./openshift-provider-cert-linux-amd64-process0 run -w
INFO[2023-02-02T18:22:05-03:00] Ensuring proper node label for dedicated mode exists
INFO[2023-02-02T18:22:06-03:00] Ensuring the tool will run in the privileged environment...
INFO[2023-02-02T18:22:06-03:00] Created opct-scc-privileged ClusterRole
INFO[2023-02-02T18:22:06-03:00] Created opct-scc-privileged ClusterRoleBinding
INFO[2023-02-02T18:22:06-03:00] Running OpenShift Provider Certification Tool...
INFO[2023-02-02T18:22:07-03:00] object already exists name=openshift-provider-certification namespace= resource=namespaces
INFO[2023-02-02T18:22:07-03:00] create request issued name=sonobuoy-config-cm namespace=openshift-provider-certification resource=configmaps
INFO[2023-02-02T18:22:07-03:00] create request issued name=sonobuoy-plugins-cm namespace=openshift-provider-certification resource=configmaps
INFO[2023-02-02T18:22:07-03:00] create request issued name=sonobuoy namespace=openshift-provider-certification resource=pods
INFO[2023-02-02T18:22:08-03:00] create request issued name=sonobuoy-aggregator namespace=openshift-provider-certification resource=services
INFO[2023-02-02T18:22:08-03:00] Jobs scheduled! Waiting for resources be created...

Thu, 02 Feb 2023 18:22:21 -03> Global Status: running
JOB_NAME                           | STATUS  | RESULTS | PROGRESS | MESSAGE
05-openshift-cluster-upgrade       | running |         |          |
10-openshift-kube-conformance      | running |         |          |
20-openshift-conformance-validated | running |         |          |
99-openshift-artifacts-collector   | running |         |          |

Thu, 02 Feb 2023 18:22:32 -03> Global Status: running
JOB_NAME                           | STATUS  | RESULTS | PROGRESS | MESSAGE
05-openshift-cluster-upgrade       | running |         |          |
10-openshift-kube-conformance      | running |         |          |
20-openshift-conformance-validated | running |         |          |
99-openshift-artifacts-collector   | running |         |          |
(...)
Pods:
$ oc get pods -n openshift-provider-certification -o wide
NAME                                                               READY   STATUS             RESTARTS   AGE   IP           NODE                         NOMINATED NODE   READINESS GATES
sonobuoy                                                           1/1     Running            0          42s   10.131.2.5   ip-10-0-57-61.ec2.internal   <none>           <none>
sonobuoy-05-openshift-cluster-upgrade-job-60d3274e82e04d28         1/3     ImagePullBackOff   0          39s   10.131.2.8   ip-10-0-57-61.ec2.internal   <none>           <none>
sonobuoy-10-openshift-kube-conformance-job-b13d7ca45caf485b        1/3     ImagePullBackOff   0          39s   10.131.2.6   ip-10-0-57-61.ec2.internal   <none>           <none>
sonobuoy-20-openshift-conformance-validated-job-85c75be8d1c94818   1/3     ImagePullBackOff   0          39s   10.131.2.9   ip-10-0-57-61.ec2.internal   <none>           <none>
sonobuoy-99-openshift-artifacts-collector-job-fad442ab3e124582     1/3     ImagePullBackOff   0          39s   10.131.2.7   ip-10-0-57-61.ec2.internal   <none>           <none>
Pod:
$ oc describe pod -n openshift-provider-certification sonobuoy-05-openshift-cluster-upgrade-job-60d3274e82e04d28
Events:
  Type     Reason          Age                From               Message
  ----     ------          ----               ----               -------
  Normal   Scheduled       60s                default-scheduler  Successfully assigned openshift-provider-certification/sonobuoy-05-openshift-cluster-upgrade-job-60d3274e82e04d28 to ip-10-0-57-61.ec2.internal
  Normal   AddedInterface  60s                multus             Add eth0 [10.131.2.8/23] from ovn-kubernetes
  Normal   Created         59s                kubelet            Created container sonobuoy-worker
  Normal   Started         59s                kubelet            Started container sonobuoy-worker
  Normal   Pulled          59s                kubelet            Container image "quay.io/ocp-cert/sonobuoy:v0.56.10" already present on machine
  Warning  Failed          43s (x2 over 59s)  kubelet            Error: ErrImagePull
  Normal   Pulling         43s (x2 over 59s)  kubelet            Pulling image "quay.io/ocp-cert/openshift-tests-provider-cert:dev20230127205105"
  Warning  Failed          43s (x2 over 59s)  kubelet            Failed to pull image "quay.io/ocp-cert/openshift-tests-provider-cert:dev20230127205105": rpc error: code = Unknown desc = reading manifest dev20230127205105 in quay.io/ocp-cert/openshift-tests-provider-cert: manifest unknown: manifest unknown
  Normal   BackOff         32s (x5 over 59s)  kubelet            Back-off pulling image "quay.io/ocp-cert/openshift-tests-provider-cert:dev20230127205105"
  Warning  Failed          32s (x5 over 59s)  kubelet            Error: ImagePullBackOff
  Normal   BackOff         32s (x3 over 59s)  kubelet            Back-off pulling image "quay.io/ocp-cert/openshift-tests-provider-cert:dev20230127205105"
  Warning  Failed          32s (x3 over 59s)  kubelet            Error: ImagePullBackOff
One idea: display the pod status in the MESSAGE column while the plugin payload carries no message (empty field).
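A minimal sketch of that fallback, in Go, might look like the following. It is not the tool's actual code: the types `PodStatus` and `ContainerStatus`, the functions `ImagePullFailures` and `DisplayMessage`, and the container name `plugin` are all hypothetical stand-ins, and a real implementation would presumably read the container states from the Kubernetes API instead of hand-built structs. Only the waiting reasons `ErrImagePull` and `ImagePullBackOff` come from the events shown above.

```go
package main

import (
	"fmt"
	"strings"
)

// ContainerStatus is a hypothetical, trimmed-down stand-in for the
// per-container state reported for a pod.
type ContainerStatus struct {
	Name          string
	WaitingReason string // e.g. "ImagePullBackOff"; empty when not waiting
}

// PodStatus is a hypothetical summary of one plugin pod.
type PodStatus struct {
	Name       string
	Containers []ContainerStatus
}

// imagePullReasons holds the waiting reasons seen in the events above
// when an image cannot be pulled.
var imagePullReasons = map[string]bool{
	"ErrImagePull":     true,
	"ImagePullBackOff": true,
}

// ImagePullFailures returns "container (reason)" for every container
// in the pod that is waiting on a failed image pull.
func ImagePullFailures(pod PodStatus) []string {
	var failed []string
	for _, c := range pod.Containers {
		if imagePullReasons[c.WaitingReason] {
			failed = append(failed, fmt.Sprintf("%s (%s)", c.Name, c.WaitingReason))
		}
	}
	return failed
}

// DisplayMessage prefers the plugin's own message and falls back to
// the pod status only when the plugin payload message is empty.
func DisplayMessage(pluginMessage string, pod PodStatus) string {
	if strings.TrimSpace(pluginMessage) != "" {
		return pluginMessage
	}
	if failed := ImagePullFailures(pod); len(failed) > 0 {
		return "image pull failing: " + strings.Join(failed, ", ")
	}
	return "waiting for pod " + pod.Name
}

func main() {
	pod := PodStatus{
		Name: "sonobuoy-05-openshift-cluster-upgrade-job-60d3274e82e04d28",
		Containers: []ContainerStatus{
			{Name: "sonobuoy-worker"},
			// "plugin" is an assumed container name, for illustration only.
			{Name: "plugin", WaitingReason: "ImagePullBackOff"},
		},
	}
	fmt.Println(DisplayMessage("", pod))
}
```

With something along these lines, the status table could show the pull failure in the MESSAGE column instead of an empty field, and the same check could feed a timeout that aborts the run if the failure persists.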