- Bug
- Resolution: Done-Errata
- Normal
- 2.4.0, 2.5.0
- None
Description of problem:
If ovirt-img fails to download the image for any reason, the error is neither propagated to the populator controller nor logged in events. Because the pod keeps restarting while it retries the download, it is difficult to get the populator pod logs from customers. A must-gather captures the populator pod in an error state only by luck. So in most cases we have to ask the customer to run `oc logs -f` against the populator pods separately, repeating it until the pod is running.
The populator controller logs only the below message:
I1006 05:27:31.278221 1 event.go:291] "Event occurred" object="new-nijin-cnv/e6943ec4-d349-44c4-b6b1-925b644691e5" kind="PersistentVolumeClaim" apiVersion="v1" type="Warning" reason="PopulatorFailed" message="Populator failed: "
The message doesn't say why the populator failed. The controller takes the reason from pod.Status.Message, which is not populated when ovirt-img fails to download the image.
I think the populator pod could redirect the ovirt-img errors to /dev/termination-log, so that the populator controller can pick them up from `pod.Status.ContainerStatuses[0].State.Terminated.Message`?
Version-Release number of selected component (if applicable):
Migration Toolkit for Virtualization Operator 2.5.0
How reproducible:
100%
Steps to Reproduce:
- Stop the ovirt-imageio service in RHV.
- Attempt a migration of VMs from this RHV to OpenShift Virtualization.
- The ovirt-img download fails and the populator pod restarts continuously.
- The populator controller logs only "Populator failed" without saying why it failed.
Actual results:
Errors logged in populator pods are difficult to capture from customers.
Expected results:
The error should be captured either in events or in the populator controller logs.
Additional info:
- clones
- MTV-725 Errors logged in populator pods are difficult to capture from customers
- Closed
- links to
- RHBA-2023:123512 MTV 2.5.3 Images