- Bug
- Resolution: Unresolved
- Major
- None
- 4.17.0
- None
- None
- Rejected
- False
Component Readiness has found a potential regression in the following test:
operator conditions network
Probability of significant regression: 99.42%
Sample (being evaluated) Release: 4.17
Start Time: 2024-09-04T00:00:00Z
End Time: 2024-09-11T23:59:59Z
Success Rate: 60.00%
Successes: 6
Failures: 4
Flakes: 0
Base (historical) Release: 4.16
Start Time: 2024-05-28T00:00:00Z
End Time: 2024-06-27T23:59:59Z
Success Rate: 100.00%
Successes: 22
Failures: 0
Flakes: 0
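The reported probability can be reproduced from the counts above. The following is a minimal sketch, assuming the figure is one minus the one-sided Fisher's exact test p-value on the pass/fail counts of the sample and base releases. The function name `fisher_regression_probability` is hypothetical and not part of any Component Readiness tooling.

```python
from math import comb


def fisher_regression_probability(sample_pass, sample_fail, base_pass, base_fail):
    """Return 1 - p, where p is the one-sided Fisher's exact test p-value.

    p is the hypergeometric probability of seeing at least `sample_fail`
    failures in the sample, given the fixed total number of failures.
    Assumption: this mirrors how the ticket's percentage was derived.
    """
    sample_n = sample_pass + sample_fail
    base_n = base_pass + base_fail
    total_fail = sample_fail + base_fail
    total_n = sample_n + base_n

    denominator = comb(total_n, total_fail)
    most_failures_in_sample = min(total_fail, sample_n)

    # Sum the tail: outcomes at least as extreme as the observed one.
    p_value = sum(
        comb(sample_n, k) * comb(base_n, total_fail - k)
        for k in range(sample_fail, most_failures_in_sample + 1)
    ) / denominator
    return 1.0 - p_value


# Counts taken from the report: 4.17 sample (6 pass, 4 fail), 4.16 base (22 pass, 0 fail).
prob = fisher_regression_probability(6, 4, 22, 0)
print(f"{prob:.2%}")  # prints 99.42%
```

With only 10 sample runs, all four failures landing in the sample by chance has a probability of 210/35960, which is why the regression is flagged with such high confidence despite the small sample.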
The bug is being opened against the component in which we see the regression. However, 4 out of 10 jobs failed installation with missing worker nodes.
Slack thread
I had a look at the agent-gather and found only the worker logs (as expected, since the workers did not join the bootstrap). From the journal, the workers were able to fetch the ignition, i.e.
...
set 09 21:19:27 worker-0 assisted-installer[2823]: time="2024-09-09T19:19:27Z" level=info msg="Getting ignition from https://192.168.111.5:22623/config/worker"
...
and were ready to reboot:
...
set 09 21:20:31 worker-0 assisted-installer[2823]: time="2024-09-09T19:20:31Z" level=info msg="Uploading logs and reporting status before rebooting the node 2cc1856b-c4f5-4e3e-9117-128cf97e1d15 for cluster abe52fa7-9e7e-465e-a5c5-457ecb49bb70"
...
But the reboot never happened, and the workers remained stuck (thus not completing the joining procedure). The reason they got stuck is not yet clear.
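For triaging similar journals, the two assisted-installer entries quoted above can be parsed to see how far a worker got and how long each step took. This is only a sketch: it assumes the `time="..." level=... msg="..."` layout seen in those two lines, and the names `LOG_PATTERN` and `parse_line` are hypothetical helpers, not part of agent-gather or assisted-installer.

```python
import re
from datetime import datetime

# Matches the key=value layout of the two journal entries quoted in this ticket.
LOG_PATTERN = re.compile(
    r'time="(?P<time>[^"]+)" level=(?P<level>\w+) msg="(?P<msg>[^"]*)"'
)


def parse_line(line):
    """Return (timestamp, level, message), or None if the line does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match["time"], "%Y-%m-%dT%H:%M:%SZ")
    return timestamp, match["level"], match["msg"]


journal = [
    'set 09 21:19:27 worker-0 assisted-installer[2823]: '
    'time="2024-09-09T19:19:27Z" level=info '
    'msg="Getting ignition from https://192.168.111.5:22623/config/worker"',
    'set 09 21:20:31 worker-0 assisted-installer[2823]: '
    'time="2024-09-09T19:20:31Z" level=info '
    'msg="Uploading logs and reporting status before rebooting the node '
    '2cc1856b-c4f5-4e3e-9117-128cf97e1d15 for cluster '
    'abe52fa7-9e7e-465e-a5c5-457ecb49bb70"',
]

events = [parse_line(line) for line in journal]
ignition_fetched, pre_reboot = events

# Time between fetching the ignition and announcing the reboot.
gap_seconds = (pre_reboot[0] - ignition_fetched[0]).total_seconds()
print(gap_seconds)  # 64.0
```

In this journal the pre-reboot message is the last assisted-installer entry, so any later timestamps from the same boot would indicate that the node never actually restarted.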
- links to: RHEA-2024:6122 OpenShift Container Platform 4.18.z bug fix update