-
Bug
-
Resolution: Done
-
Critical
-
None
-
4.12.0
-
Important
-
None
-
Proposed
-
False
-
See this comment for some updated information
—
Description of problem:
During IPI installation on IBM Cloud (x86_64), some worker machines have been observed to have no network connectivity during their initial boot. IBM Cloud VPC was engaged to help identify the issue, but by all appearances the virtualization layer is working correctly.
Unfortunately, because these worker machines have no network traffic, they cannot be accessed to diagnose the problem: Ignition is stuck waiting on the network, so neither SSH nor console login is available to collect logs or run tests on them.
The only content available is the console output, which shows Ignition stuck because of the network issue.
Version-Release number of selected component (if applicable):
4.12.0
How reproducible:
About 60%
Steps to Reproduce:
1. Create an IPI cluster on IBM Cloud
2. Wait for the worker machines to be provisioned; IPI then fails while waiting on the machine-api operator
3. Check the console of the worker machines that fail to report in to the cluster (in this case, 2 of 3 failed)
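To help spot which workers never reported in (step 3), here is a minimal Python sketch, not part of the original report. It assumes the `oc` CLI is logged in to the affected cluster and that Machine objects live in the `openshift-machine-api` namespace and expose a `status.phase` field. The helper names `stuck_machines` and `fetch_machines` are hypothetical.

```python
import json
import subprocess


def stuck_machines(machine_list: dict) -> list[str]:
    """Return 'name (phase)' for every machine whose phase is not 'Running'.

    Assumption: a machine that has joined the cluster reports
    status.phase == "Running"; a missing phase is shown as "Unknown".
    """
    stuck = []
    for item in machine_list.get("items", []):
        phase = item.get("status", {}).get("phase", "Unknown")
        if phase != "Running":
            stuck.append(f"{item['metadata']['name']} ({phase})")
    return stuck


def fetch_machines() -> dict:
    """Fetch the Machine list as JSON via the oc CLI (assumed to be on PATH)."""
    result = subprocess.run(
        ["oc", "get", "machines.machine.openshift.io",
         "-n", "openshift-machine-api", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Calling `stuck_machines(fetch_machines())` against a live cluster would list the machines to check on the IBM Cloud console. It only narrows down which instances to inspect and says nothing about why their network is down.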
Actual results:
IPI creation failed while waiting for the machine-api operator to complete deployment of all worker nodes
Expected results:
Successful IPI creation on IBM Cloud
Additional info:
As stated, IBM Cloud VPC investigated, but could go no further because the worker machines are inaccessible. Any additional details that would help identify the issue would be appreciated.
The issue also appears to have become more frequent recently, which raises concern for IBM Cloud's IPI GA support in the 4.12 release.
The only known way to restore network connectivity is to reboot the machine, which loses the Ignition bring-up (I assume it must then be triggered manually) and, in the case of IPI, isn't a great mitigation.
- blocks
-
OCPBUGS-3289 [IBMCloud] Worker machines unreachable during initial bring up
- Closed
- is cloned by
-
OCPBUGS-3289 [IBMCloud] Worker machines unreachable during initial bring up
- Closed
- relates to
-
OCPBUGS-626 [IBMCloud] Worker nodes stuck in Provisioning during cluster creation - fail to join cluster
- Closed
-
OCPBUGS-2892 Recovering from periodic provisioning failure on IBM Cloud VPC
- Closed