Spike
Resolution: Unresolved
Major
4.16
Description of problem:
vSphere Dualstack (IPv4 Primary) cluster failed to install.
Looking at the gather bootstrap logs, the installer uses one of the IPv6 addresses to contact the node instead of either using the IPv4 address or, optimally, looping over all the provided host addresses in turn.
platform:
  vsphere:
    apiVIPs:
    - 10.94.146.130
    - fd65:a1a8:60ad:1234::4
    ingressVIPs:
    - 10.94.146.131
    - fd65:a1a8:60ad:1234::5
networking:
  networkType: OVNKubernetes
  machineNetwork:
  - cidr: 10.94.146.128/25
  - cidr: fd65:a1a8:60ad:1234::/64
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  - cidr: fd65:10:128::/56
    hostPrefix: 64
  serviceNetwork:
  - 172.30.0.0/16
  - fd65:172:16::/112
level=info msg=openshift-install gather bootstrap --help
level=error msg=Bootstrap failed to complete: timed out waiting for the condition
level=error msg=Failed to wait for bootstrapping to complete. This error usually happens when there is a problem with control plane hosts that prevents the control plane operators from creating the control plane.
level=info msg=Pulling Cluster API artifacts
level=info msg=Skipping VM console logs gather: no gather methods registered for "vsphere"
level=info msg=Pulling debug logs from the bootstrap machine
level=info msg=Failed to gather bootstrap logs: failed to create SSH client: dial tcp [fd65:a1a8:60ad:1234:3d20:f47c:6df4:21e0]:22: connect: network is unreachable
# yq '.status.addresses' ./launch/ipi-install-install/artifacts/clusterapi_output/Machine-openshift-cluster-api-guests-ci-ln-dpbx1jk-c1627-mshkd-bootstrap.yaml
- type: ExternalIP
  address: 10.94.146.143
- type: ExternalIP
  address: fd65:a1a8:60ad:1234::22
- type: ExternalIP
  address: fd65:a1a8:60ad:1234:3d20:f47c:6df4:21e0
- type: InternalDNS
  address: ci-ln-dpbx1jk-c1627-mshkd-bootstrap
Version-Release number of selected component (if applicable):
4.16.0-0.nightly-2024-07-02-211018
How reproducible:
Intermittent. We have no control over DHCPv4 or DHCPv6 timings or IP address ordering.
Steps to Reproduce:
1. Clusterbot: launch 4.16.0-0.nightly vsphere,dualstack
Actual results:
The cluster fails to install, and the bootstrap log gather cannot SSH to the bootstrap host's IPv6 address:
level=info msg=Failed to gather bootstrap logs: failed to create SSH client: dial tcp [fd65:a1a8:60ad:1234:3d20:f47c:6df4:21e0]:22: connect: network is unreachable
Expected results:
We should probably try each of the host's ExternalIPs in turn until one succeeds.
Otherwise, we could follow the dual-stack primary IP family rules and try the primary family (IPv4 or IPv6) first.
Additional info:
A lot of code incorrectly assumes single stack and that there is only a single IP for a Host.
We should assume there exists an IPv4 and IPv6 address for every host, until IPv4 is retired.
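
To make the multi-address assumption concrete, here is a minimal Go sketch that orders a host's reported addresses so the primary IP family comes first. The function name orderByPrimaryFamily is hypothetical, and the sketch assumes the caller already knows which family is primary (how the installer would derive that is not covered here). Entries that are not IP literals, such as the InternalDNS name, are kept but moved to the end.

```go
package main

import "net/netip"

// orderByPrimaryFamily returns the addresses with the primary IP family
// first, the other family second, and non-IP entries (for example DNS
// names) last. The relative order within each group is preserved.
// (Hypothetical helper, for illustration only.)
func orderByPrimaryFamily(addresses []string, primaryIPv4 bool) []string {
	var primary, secondary, names []string
	for _, a := range addresses {
		ip, err := netip.ParseAddr(a)
		if err != nil {
			// Not an IP literal; keep it as a last resort.
			names = append(names, a)
			continue
		}
		// Unmap treats IPv4-mapped IPv6 addresses as IPv4.
		if ip.Unmap().Is4() == primaryIPv4 {
			primary = append(primary, a)
		} else {
			secondary = append(secondary, a)
		}
	}
	ordered := make([]string, 0, len(addresses))
	ordered = append(ordered, primary...)
	ordered = append(ordered, secondary...)
	return append(ordered, names...)
}
```

The design choice here is to reorder rather than filter: no address is discarded, so a caller that walks the list still reaches the secondary family if the primary one is unreachable.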
- is related to: OCPBUGS-37427 When vsphere bootstrap fails ipv6 address is used over ipv4 in gather (Closed)