-
Bug
-
Resolution: Done
-
Normal
-
None
-
None
-
None
job link
test output snippet:
{ 1 nodes took over 5m0s to stage OSUpdate: node/ip-10-0-236-143.us-east-2.compute.internal OSUpdateStarted at 2022-04-29T17:39:54Z, did not make it to OSUpdateStaged}
quickly using search.ci, this failure does seem to be more frequent in OVN jobs compared to SDN jobs.
One quick clue might be visually seen if you expand the Intervals - everything_20220429-162545 graph. If you scroll
down to the bottom of that graph you will see some blue bars which are illustrating the nodes being rebooted as part
of the upgrade. The node in question node/ip-10-0-236-143.us-east-2.compute.internal is missing the medium blue
bar that indicates that kubelet has started. It should be easy to see if you compare to the other 5 nodes in that same
section. I did not look any deeper than that, although the first place I'd check next is the journal file for that node that
is in the artifacts.
link to this job's testgrid for reference.